1 Introduction

The institutional tradition of law differs between European Union countries, affecting courts’ adjudications in similar cases. This relationship is consistent with the path dependence hypothesis, defined by Mahoney as ‘initial steps in a particular direction inducing further movement in the same direction such that over time, it becomes difficult or impossible to reverse direction’ (Mahoney, 2000: p.512). Besides the path dependence stemming from the legal tradition prevailing in the territory of a given court’s jurisdiction, we have found some evidence that court decisions are diversified by the experience of the judges and the courts in resolving a given type of case (specialization) and by a preference (bias) for one of the parties. The latter two reasons are well recognized in the literature on judicial decision-making, so we decided to pay special attention to the first reason: path dependence.

The basic problem in investigating legal traditions is the homogeneity of the legal system within a country. Therefore, the impact of different legal systems can be captured mainly in international courts with judges representing different countries. However, this type of analysis has rarely been undertaken to date. For example, Zhang et al. (2018) found that if a rapporteur judge before the General Court comes from a country with a strong French administrative law tradition, then the decisions are more frequently made in favour of the European Commission.

We decided to investigate path dependence on a sample of administrative adjudication cases from a single country that nevertheless reveals an internal differentiation of legal tradition originating in the past. This condition is met by Poland, a member of the EU. It represents the so-called continental law (which is not based on common law). Moreover, in the past, Poland was partitioned by three countries with different legal regimes. More precisely, in the eighteenth and nineteenth centuries the contemporary territory of Poland was divided between Austria (in the South of the country), Germany (in the West and North of the country) and Russia (in the East and Centre of the country), and each of these countries had a different legal system. These partitions left a significant mark not only on the legal rules but also on the functioning of the courts, judges, and parties to litigation. Legal rules are now formally uniform at the country level but the path dependence is still present at the provincial level. This creates a unique opportunity to examine the impact of institutional differences on the decisions issued by particular courts representing three different legal traditions.

We test three hypotheses. The complex history of public administration and administrative courts in Poland, divided among three empires (Russia, Germany and Austria), set the administrative courts on distinct decision-making paths. These paths can be observed a hundred years later in the differentiation of verdicts in similar cases by courts operating in different territories (Hypothesis 1). Moreover, the verdicts in a given territory are persistent and resistant to the harmonization pressure exerted by the superior courts or other courts at the same level (Hypothesis 2). The path dependence of jurisprudence in the court’s area is so strong that it prevents the harmonization of verdicts in similar cases even if the cases are controversial and would require fast harmonization of judgments (Hypothesis 3).

We test our hypotheses by applying logistic regression and ordered logistic regression to 337 cases of erroneous documentation of mineral oil excise duty with a full set of data and, separately, to 464 cases, including those without a known value of the claim. Administrative cases are well suited to investigating the determinants of court decisions because they are initiated by one party (the complainant) against decisions of government authorities, which makes them relatively homogeneous.

We focused on cases where the court had to balance compliance with strict legal rules against serious consequences for one of the parties, where those consequences are perceived as disproportionate to the committed act. Such a tension can lead to different adjudications when the court is not bound by the previous verdicts of other courts in similar cases (as in systems of continental law). If a penalty is severe and the possibility of punishment results from small mistakes on the part of the accused person, one can expect the courts to be reluctant to rule against that person even if the rules are clear and prescribe a severe penalty.

Our basic results are as follows:

  1. The nineteenth century partitions of Poland between the three countries with different legal traditions affected the judicial decision-making of the courts. The court’s legal tradition (represented by a German dummy in the regressions) significantly shifts judgments towards the strict implementation of legal rules and away from judgments based on ‘justice’, in comparison with the courts operating on the former territory of the Russian and Austrian partitions. The results for the Russian and Austrian partitions reveal opposite signs in the regressions, indicating a greater chance of the complainant winning.

  2. The bias for one of the parties is fully explained in our regression model by the variables used, so there is no systematic bias toward the tax authority or the complainant. However, some courts are more likely to adjudicate in a manner that increases the revenue for the government. The preference for higher revenue arouses special concern because it can be indicative of prejudice in a court.

  3. The value of the claim can positively affect the verdicts in favour of the taxpayer. This finding is also closely related to the observation that the court’s costs positively affect the verdicts. It confirms the hypothesis regarding the reluctance to rule against the taxpayer in cases which can result in great harm.

  4. The courts are more willing to follow the argumentation of the government party (for example the opinion of a tax-collecting agency) when they are less experienced in resolving the cases. This is further reinforced by a high workload. This is in line with the finding that judges decide more quickly when validating an agency decision (Miles, 2012). Therefore, the specialization and the experience of a court are important in explaining the differences in verdicts.

This paper contributes to both the judicial decision-making and the path dependency literature. Our findings are especially important when cases or regulations are controversial and individual judgments can harm one of the parties involved in a dispute, resulting in the differentiation of verdicts. We confirm the role played by institutional factors in the judicial decision-making process. The path dependency seems to be strong (it affects the verdicts and is resistant to attempts at harmonization) and long-lasting (it remains visible even one hundred years after the unification of law). This work concentrates on the empirical verification of the path dependence hypothesis on court cases, a subject which is rarely discussed in the literature.

The direct implication of this study is a call for greater specialization of the courts and their enlargement. A higher number of judges, as well as greater experience in resolving similar cases, helps to mitigate the path dependence effect. When the number of resolved cases is small and the judges have no support from their colleagues in a small court, the verdicts are more diversified between the courts, resulting in unequal treatment of the complainants. To overcome the effect of unequal case adjudication, external intervention is required, for example in the form of a binding interpretation of the law or a preliminary ruling of the Court of Justice of the European Union (CJEU).

2 Literature review

In view of the considerations that follow, four groups of literature are relevant: path dependency theory, the impact of legal tradition on adjudication, decision-making theory related to institutional factors, and the theory of bias towards one of the litigants involved in a lawsuit.

2.1 Path dependence

In political science, path dependency is investigated as a strand of historical institutionalism (Kay, 2005). Path dependence theory explains the impact of early events in the history of a country, a social phenomenon, or an organization on later events (Mahoney, 2000). Several attempts have been made to investigate path dependency hypotheses empirically. For example, Yesilkagit and Jørgen (2010) showed that the design of regulatory agencies created between 1945 and 2000 in Sweden, the Netherlands, and Denmark is a function of path dependency and national administrative traditions and does not depend on political factors. Sinclair and Whitford (2013), testing path dependency, showed that citizen ideology and political culture measured in 1965 affected the choice of organizational structure of agencies for health and environment in the United States in 1999. Armingeon and Careja (2008) found that 27 political systems in post-communist countries, including Poland, changed only to a limited extent during the period 1990–2002. The changes were more related to path-dependent determinants than to factors of domestic politics or socio-economic resources and developments. Grosfeld and Zhuravskaya (2015) showed that differences in the intensity of religious practices and in beliefs in democratic ideals (i.e., democratic capital) in contemporary Poland are the result of the division of the country among three empires (Austria, Prussia, and Russia) before 1918. They also showed that these cultural legacies of the empires affect political outcomes in contemporary Poland. Bukowski (2019) demonstrated the long-lasting effect of the nineteenth century education systems in Poland, which were imposed by the three empires, on modern education outcomes.

The path dependency in law is perceived as ‘a kind of irrationality closely connected with inconsistency’ (Kornhauser & Sager, 1986: 107). It manifests in the idea that earlier decisions must be accepted as binding on courts (Fallon, 2002). Most of the papers concerning path dependence in law are concerned with: the empirical verification of precedents’ impact on former adjudication (Schmidt, 2011; Sweet, 2002), the evolution of legal change (Fon et al., 2005; Marciano & Khalil, 2012), circumstances of passing the law (De Ruysscher, 2018) or judiciary system reform (Piana, 2009). Our study involves path dependence in administrative courts’ decision-making taken from the system of continental law where different lines of authority occur (and the role of previous judgments is limited). We postulate that courts’ decisions are not optimal because of the legal tradition bias. This bias manifests in the court’s decision-making process and remains especially evident when cases are complicated.

2.2 Legal tradition

According to Aden (2015: p. 19) ‘traditions include practices that have been established a long time ago and maintained at least during a certain period of time, often transferred from generation to generation’. One of the traditions is administrative law tradition or the legal tradition of administrative procedure, investigated by such authors as Nolte (1994), Painter and Peters (2010) and Pünder (2013). Going further, it is very important to determine whether there is something like ‘German administrative legal tradition.’ We believe that this is the case. One of the features of this tradition is a strict treatment of legal provisions, sometimes regardless of the negative consequences of their use.

Following Painter and Peters (2010), continental administrative law in Europe can be divided into three groups: the French/Napoleonic, the Germanic and the Scandinavian tradition. Nolte (1994: p. 196) analysed the principles used in German and European administrative law (principle of proportionality, principle of equality and principle of legitimate expectations) to find the impact of German administrative law on European administrative law. He pointed out that the principle of proportionality in the German tradition of law includes three parts: ‘proportionality,’ ‘capacity of furthering an aim’ and ‘necessity’. The latter gives the administration a narrower margin of appreciation in German law than in European law. In particular, the German courts consider administrative discretion only when it is expressed explicitly. If it is not explicit, there is only one correct solution to the case. This observation is especially important when investigating the adjudication differences in administrative courts because the strict application of rules is more probable when a verdict is made under the German tradition of law than under other legal traditions.

We can find evidence of such a preference if we analyse the tradition of German administrative law from the period before World War II. This earlier period refers to the Prussian legal tradition based on the functioning of the Prussian administration, which had a decisive influence on the administrative law of Germany after 1871. The administrative judiciary was introduced in 1872 and extended later to all Prussian provinces, based on the law on general administration of the country of July 30, 1883 and the new act on the competence of administrative and judicial-administrative authorities of August 1, 1883. The organization of provincial administrative courts remained unchanged in the former German (Prussian) partition of Poland after the First World War (Tarnowska, 2006) but the organization of higher court followed the regulation of Austrian Administrative Tribunal (Verwaltungsgerichtshof). This is important because the latter focused on the protection of individual rights while the German courts were mainly focused on legal control of administrative action. The German organization was preserved in lower courts on the territory of the former German partition and at least partially in new territories attached to Poland after the Second World War.

Pünder (2013: p. 945) investigated the German administrative procedure from a comparative perspective. He noted ‘administrative procedure law was viewed as an economic burden to agencies and affected private entities’ enhancing the effort to relax the formal procedural rules. This perception is especially interesting because it hints that the former procedure was relatively rigid. Pünder also questioned the lower importance of a participatory approach to administrative actions in contemporary Germany but he refers to the situation after the Second World War. The previous situation was quite different because procedural requirements for administration were not perceived as necessary and were replaced by unwritten legal principles.

The earlier legal tradition was based on the right of the ‘government to take all measures that it deemed necessary and urgent in view of the distress of people and state’ (Kischel, 1994: p. 229). This statement provides the acceptance for the delegation of rules on administration and henceforth, justified the adjudication in line with the administrative point of view as an adequate measure necessary for the assumed purposes. This is justified by the perceived role of the public administration to take necessary actions even if those actions are not directly regulated. Rose-Ackerman (1995) identified that German administrative law focuses on individuals' complaints against the state for violating their rights but weakly balances the desires and expertise of conflicting interest groups, contrary to the American law tradition. Democratic legitimacy and administrative procedure are of less concern, especially because there is no clear delimitation between regulation and enforcement. Similarly, the legislative and executive branches are not fully separated. Therefore, the courts operating under the German law tradition are less inclined to take into account the preferences of other stakeholders, resulting in arbitrary decisions which are most likely in favour of the administrative authorities. If the presented suppositions are true then we should find the differences in the adjudication of different administrative courts with respect to the territory they operate on, with the courts within the German tradition of law preferring the application of strict rules.

Austrian authorities largely respected the identity and local differences of various parts of the Habsburg Empire (Ingrao, 2000). This respect was mutual. Sieghart wrote in 1932 that the Austrian bureaucracy was well respected by the population because of its reliability. The Austrian tradition was very important for the organization of the new administration in Poland after independence was regained in 1918. A large part of the previous administrators, such as the Habsburg-trained local judiciary, remained in place because of their professional competencies (Kraft, 2002), and their experience was used in organizing the administration of the whole country. This tradition was also long-lasting because Polish administrative structures were dissolved on the territory of the Austrian partition after 1772 and an administration and judiciary were introduced on the model of the rest of the Habsburg Empire. Local officials were also trained according to patterns from the Habsburg administration (Mark, 1994). Under the provisions of the Constitution of March 17, 1921 and the Act of August 3, 1922, the Supreme Administrative Tribunal, the first modern administrative court in Polish history, was established on the pattern of the Austrian Administrative Tribunal (Malec, 1999). The Supreme Administrative Tribunal was active until the beginning of the Second World War in 1939. Its tradition is continued now by the Supreme Administrative Court, functioning since 1980. The administrative procedure in Poland was codified after the Second World War but this process was gradual. The administrative procedure was unified in 1960, while the procedure before the administrative courts was regulated only later, in 2002. The slow pace of regulatory changes and the lack of a central administrative court for nearly 40 years contributed to the persistence of regional differences in administration.

It should be noted that part of the contemporary territory of Poland once had an administrative judiciary resembling the French system of administrative courts. In the Duchy of Warsaw 1809–1815, administrative disputes were settled in the first instance by departmental and prefectural councils, and in the second instance by the Council of State (Cichoń, 2013; Malec, 1999; Witkowski, 1984). After Napoleon's defeat, in the Kingdom of Poland, which was formed mostly from the lands of the Duchy of Warsaw, these traditions were continued with some modifications in the years 1817–1830, 1836–1841 and 1861–1867. However, after 1867, the unification process aimed at eliminating the existing jurisdictional differences between the Kingdom and Russia and superseding them with the structures and procedures in force in other territories of Russia (Wiązek, 2014). Therefore, we omit this legal tradition in our research because it was replaced by the system characteristic of Russia. Administrative courts did not function in Russia. Instead, there was a system of complaints from citizens directed to different administrative units (Panova, 2016). The reforms of the Russian revolutionary government of 1917 did not affect the Russian partition because it had been under the control of Germany and Austria since the end of 1915.

2.3 Decision-making theory

The attitudes affecting the court’s decisions are well recognized in the contemporary literature on judicial decision making (Segal & Spaeth, 1993; Weinshall et al., 2018; Weinshall-Margel, 2011). Most of the studies concentrate on ideological attitudes but there are some papers emphasizing the role of strategic considerations (Epstein & Knight, 1998; Maltzman et al., 2000) or institutional factors (Gillman & Clayton, 1999; Weinshall-Margel, 2011). The latter postulates acting in accordance with the institutional norms of the court rather than referring to the ideological position of judges. Two things are especially important in this context: the specialization of judges (Cheng, 2008; Curry & Miller, 2015) and the impact of specialists on the consistency of the court’s verdicts (Curry & Miller, 2016). We follow the line of research investigating the impact of institutional factors on the court’s decisions.

The problem of public interest preference in courts was first pointed out in the American literature by Mashaw (1976) in relation to the judgment in Mathews v. Eldridge. The judgment requires the assessment of two interests in adjudication, public and private, which should be balanced. In the literature on the subject, attention is focused especially on the proper calculation of the costs and gains of both parties involved in the disputes.

2.4 Court’s bias to litigants

The empirical analysis of bias involves the court’s preference towards the stronger litigant in administrative cases (He & Su, 2013) or direct preference for the government or the complainant (Amaral-Garcia & Garoupa, 2016; Chen et al., 2015; Finnegan et al., 2012; Ginsburg & Wright, 2013). In this study, we are going to follow Johnson (2013: p. 253) who argues that: ‘The studies that may seem to hint at bias typically do not adequately control for differences between the types of cases and the types of taxpayers appearing in the Tax Court versus district court.’ Finnegan et al. (2012: pp. 16–42) investigated the low winning rate of taxpayers in the Small Case Division of the United States Tax Court. The Court preferred the government party in almost 90 per cent of the cases because ‘the taxpayers did not provide adequate substantiation or blatantly underreported income.’ Despite this, the sources of preferences in the specialized courts (like the United States Tax Court) remain unidentified but they can involve such factors as ideology, institutional design and structure, identification with statutory scheme, the type of the litigant and characteristics of the judges with special emphasis on previous experience in a tax collection agency or tax expertise (Howard, 2009).

Amaral-Garcia and Garoupa (2016) confirmed the inefficiency of the administrative courts caused by a lack of true independence. The authors investigated the preference of administrative courts to favour the government in cases of medical malpractice in Spain where the government was one of the involved parties. Other studies referred to the different treatment of private and government litigants by the courts. For example, Bentata and Hiriart (2015, 2017) highlighted the possible bias of the lower administrative courts in favour of the government. The authors investigated the determinants of pro-defendant and reversal decisions in environmental cases appealed in the Supreme Civil Court or the Supreme Administrative Tribunal in France. They found that the defendant party was subject to harsher treatment in the Supreme Administrative Tribunal than in the Supreme Civil Court and there was a lower probability of the defendant winning the case in the administrative courts. Sometimes the preference for a government can be induced by the self-identification of the judges as a part of a government (Chen et al., 2015).

So far, analyses of the determinants of administrative litigation cases have not been frequent in the literature. For example, Zhou et al. (2017) studied the determinants of land acquisition and resettlement in China. It should be noted that most of the papers concern the complainant’s party and the characteristics of the case but not the characteristics of the court. In contrast, our study focuses on the courts and the previous results of litigation with the same complainant. We also focus on the impact of the legal tradition and on the use of references to other judgments to assess the reluctance to unify the verdicts.

3 Administrative justice in Poland

Administrative justice in Poland is administered by the administrative courts through the control of public administration activities and the settlement of disputes over competence and jurisdiction. Administrative courts are special courts (Fig. 1). The judicial power is divided between 16 provincial courts and one Supreme Administrative Court, which is an appeal court. The judges of administrative courts are appointed by the President of the Republic of Poland upon the request of the National Council of the Judiciary. To be appointed a judge of a provincial administrative court, a candidate has to meet a number of conditions defined in the Act of 25th July 2002, the Law on the System of Administrative Courts. Two of them are particularly important. The first is an outstanding level of knowledge in the field of public administration, administrative law and other legal disciplines related to the functioning of public administration authorities. The second is at least 8 years of service or employment as a judge, prosecutor, president, vice-president or counsel of the General Counsel to the Republic of Poland, lawyer, legal counsel or notary public, or at least 10 years of civil service in public institutions in positions connected with the application or making of administrative law.

Fig. 1 Judicial systems in Poland

According to the Act of 30th August 2002, the Law on Proceedings before Administrative Courts, the administrative court adjudicates in a panel of three judges in an open session. However, when sitting in camera, the court adjudicates through a single judge (Article 16), in minor cases concerning procedural matters. The court determines a case within its competence limits without being bound by the charges and requests of the complaint or by the legal basis invoked (Article 134(1)). The court issues a judgment after the judges’ deliberation, which is held in private. The deliberation and the voting on the decision are closed to the public, and waiving secrecy is not allowed. The only exception is made for a dissenting opinion (Article 137).

4 Characteristics of the cases

Mineral gas oil can be used for heating purposes or as a fuel for vehicles. However, different rates of excise duty are applied depending on its use. The duty rate for heating oil in Poland is much lower than the duty imposed on motor oil (about eight times lower), inducing moral hazard and encouraging tax fraud. In order to prevent fraud, buyers in Poland must sign a declaration and provide identification data to the oil providers, and the providers are responsible for the correctness of the sales documentation. Until 2016, errors in identification or signatures often resulted in the rejection of the tax declaration, and the oil provider was charged excise duty at the rate applicable to motor oil. Since 2016, following the preliminary ruling of the CJEU (C-418/14), the rules have changed in favour of the taxpayers.

These practices triggered much litigation between tax authorities and taxpayers, which became the subject of rulings in administrative courts. The verdicts of the courts were ambiguous because, on the one hand, there was a suspicion that the documentation errors were intentional and deliberately deceptive to the detriment of the government’s budget but, on the other hand, a negative judgment might lead to excessive penalization based on petty errors, in most cases committed by fuel station employees. Paying duty at the rate applicable to motor oil had serious consequences for the taxpayers, such as financial loss or even bankruptcy.

The disputes between taxpayers and tax authorities are settled by administrative courts because the imposition of excise duty is an administrative decision. This means that the taxpayer is always the complainant. If the taxpayer accepts the decision of the tax authority, it does not bring an action before the court and the decision stands. The provincial administrative court is competent as the court of first instance. If a party is dissatisfied with the court’s decision, it may appeal to the Supreme Administrative Court (SAC). If the appeal is decided in favour of the complainant, the case is referred back to the administrative court, which issues a new verdict. This procedure is time-consuming: the first judgment requires several months on average, and an appeal to the SAC can extend this to several years.

4.1 Data and variables

The study applies logistic regression to positive and negative judgments concerning defective heating oil sales documentation from all 16 provincial administrative courts in Poland. The cases in the sample were preselected so that they are homogeneous regarding the type of problem resolved. Our original database includes 1040 single verdicts regarding improper documentation and a known value of the claim, chosen from all 3838 cases involving improper taxation of oil sold. However, because some of the cases refer to the same complainant with the same type of verdict and the same date of judgment, we decided to aggregate them into 464 cases in total and 337 cases with known costs of the court’s proceedings.
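The aggregation rule described above (same complainant, same type of verdict, same date of judgment) can be illustrated with a minimal Python sketch. The record layout, the field names and the sample rows are hypothetical and serve only to show the grouping logic; they are not taken from the study's database.

```python
from collections import defaultdict

# Hypothetical records: (complainant, verdict_type, judgment_date).
# These rows are illustrative only and are not drawn from the actual data.
verdicts = [
    ("Firm A", "loss", "2012-03-01"),
    ("Firm A", "loss", "2012-03-01"),
    ("Firm A", "win", "2013-05-10"),
    ("Firm B", "loss", "2012-03-01"),
]


def aggregate(rows):
    """Merge verdicts that share complainant, verdict type and judgment date."""
    groups = defaultdict(int)
    for complainant, verdict_type, judgment_date in rows:
        groups[(complainant, verdict_type, judgment_date)] += 1
    # One aggregated case per distinct key, keeping the number of merged verdicts.
    return [
        {"complainant": c, "verdict": v, "date": d, "n_verdicts": n}
        for (c, v, d), n in groups.items()
    ]


cases = aggregate(verdicts)
```

In this toy input, the two identical verdicts for Firm A collapse into one case, so four verdicts become three aggregated cases.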

The data were obtained from the Supreme Administrative Court’s database of case law. The database includes all of the verdicts of the administrative courts since 2004, the year of their establishment in Poland. We decided to use only the cases which were disputed for the first time. The appealed cases are predetermined by the SAC’s ruling, making the subsequent verdicts dependent. The financial data and the data about the employed staff or the number of cases were taken directly from the administrative courts.

The time span of the data in our research has been restricted to the period 2009–2016. The cases prior to 2009 are rare and refer to the former regulation of the Act on excise duty. This former regulation, according to the courts' judgments, exceeded the statutory authorization. Therefore, it contributed substantially to the complainants’ wins before administrative courts for formal reasons, without relation to the merits of the case. Starting from the second half of 2016, the administrative courts’ rulings changed in line with the preliminary ruling of the CJEU (C-418/14). Since then, only the real purpose of the oil usage has been considered, and documentation errors are treated as irrelevant if the heating purpose of the oil is evident. This new approach led taxpayers to abandon lawsuits and reduced the number of such cases settled in administrative courts.

Logistic regression was applied because in administrative cases a partial win or loss is not possible and the court’s judgment strictly concerns the acceptance of the previous decision of an administrative authority. Therefore, the decision can only be valid or invalid, so the dependent variable takes the value zero or one. However, sometimes the win is determined by procedural reasons (e.g. errors of tax authorities in the procedure of tax collection or incorrect verification of the defective tax documentation evidence) rather than by the merits of the case. Therefore, we decided to check our results with an ordered logistic regression in which a formal win is distinguished from a material win and from a loss. This is the equivalent of a partially valid decision in civil cases. This approach is becoming more popular in litigation analyses of different kinds (Best et al., 2011; He & Su, 2013) and is sometimes applied to administrative courts (Taratoot & Howard, 2001).
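For reference, the two estimators named above can be written in their standard textbook form. The notation is an assumption introduced here for illustration ($y_i$ for the outcome of case $i$, $x_i$ for the vector of explanatory variables, $\beta$ for the coefficients, $\kappa_1 < \kappa_2$ for the cut-points) and does not reproduce the exact specification estimated in the study.

```latex
% Binary logit: y_i = 1 if the taxpayer wins case i, 0 otherwise
\Pr(y_i = 1 \mid x_i) = \frac{\exp(x_i^{\top}\beta)}{1 + \exp(x_i^{\top}\beta)}

% Ordered logit with three ordered outcomes j \in \{0, 1, 2\}
% and cut-points \kappa_1 < \kappa_2
\Pr(y_i \le j \mid x_i) = \frac{\exp(\kappa_{j+1} - x_i^{\top}\beta)}{1 + \exp(\kappa_{j+1} - x_i^{\top}\beta)},
\qquad j = 0, 1
```

In both forms a positive coefficient raises the probability of an outcome more favourable to the taxpayer.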

The dependent variable in the logistic regression is a dummy taking the value of one when the taxpayer won the dispute and zero when the fiscal authority won. However, a win of the taxpayer can mean different things. It can be: 1) the repeal of some acts under consideration, 2) the assessment of an excise duty lower than the one originally imposed on the taxpayer but higher than the one postulated by the complainant, or 3) the repetition of some part of the procedure before the tax authorities. Such a ‘partially positive’ judgment postpones the final resolution to a time when a negative decision of the tax collection agency is less probable (because of the preliminary ruling of the CJEU and growing objections to the legitimacy of such an interpretation of the law). This justifies the treatment of such decisions as a win of the taxpayer. Alternatively, in the ordered logistic regression we code such a decision as a partial win with a value of one, a full win as two, and a loss as zero.
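The two codings of the dependent variable described above can be illustrated with a minimal sketch. The enum and function names are hypothetical labels introduced only for this illustration; they are not taken from the dataset.

```python
from enum import Enum


class Outcome(Enum):
    """Verdict types as described in the text (labels are illustrative)."""
    LOSS = "loss"                # the fiscal authority wins
    PARTIAL_WIN = "partial_win"  # taxpayer wins for procedural (formal) reasons
    FULL_WIN = "full_win"        # taxpayer wins on the merits


def binary_code(outcome: Outcome) -> int:
    """Binary logit coding: 1 for any taxpayer win, 0 for a loss."""
    return 0 if outcome is Outcome.LOSS else 1


def ordered_code(outcome: Outcome) -> int:
    """Ordered logit coding: loss = 0, partial win = 1, full win = 2."""
    return {Outcome.LOSS: 0, Outcome.PARTIAL_WIN: 1, Outcome.FULL_WIN: 2}[outcome]
```

Under this sketch a procedural win counts as a win in the binary model and as the middle category in the ordered model.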

The use of a dependent variable determining the win or loss of the taxpayer raises the question about the potential bias of the verdicts as a consequence of the selection of the sued cases. In particular, it directly applies to the postulate of Priest and Klein (1984: 5), according to which there is a tendency for the plaintiff to win 50 per cent of the cases. This is because if one party is more successful in a given type of case, the other party becomes less willing to start the dispute because of the fear of losing. However, it is not necessarily true because it depends on the relative strength of selection effect and characteristics of the judiciary (Helland et al., 2018; Klerman & Lee, 2014). Moreover, the size of the potential victory can encourage a plaintiff to sue when the probability of winning is low but the cost of the trial is not too high.

The type of sentence can be affected by at least three groups of factors. The first group describes the characteristics of the case, the second covers the operational aspects of the courts’ processing, and the last refers to the judicial tradition in the territorial jurisdiction of a given court.

First, we used the value of the claim (value_of_claim), defined as the monetary value of the dispute divided by 1 million and aggregated over cases adjudicated on the same date, for the same complainant and with the same type of outcome. This variable tests whether judgments depend on the value of the claim (which is a proxy for the harm that a negative verdict can impose on the complainant). It should be emphasized that in practice many of the cases in one court refer to the same complainant (for example, if errors in documentation were found in several consecutive months and tax declarations) and some cases of erroneous documentation are combined by the court into a single high-value case. Sometimes the decision to split the cases is made by the complainant to extend the time of the settlement (for example, to gain time to collect funds for the tax payment or to save on the costs of court proceedings) or to challenge only the first administrative decision in order to learn the court’s attitude to the complainant. In general, this prompts complainants to bring the cases one by one.
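A minimal sketch of how such an aggregated claim value could be built is given below. It assumes that the values of cases sharing the same verdict date, complainant and outcome type are summed before scaling; the field names (verdict_date, complainant, outcome, claim_value) are hypothetical and the text does not confirm the exact aggregation rule.

```python
from collections import defaultdict


def aggregate_claims(cases):
    """Sum claim values within (date, complainant, outcome) groups.

    Returns a dict mapping each group key to value_of_claim,
    i.e. the summed monetary value divided by 1 million.
    Summation within a group is an assumption of this sketch.
    """
    totals = defaultdict(float)
    for case in cases:
        key = (case["verdict_date"], case["complainant"], case["outcome"])
        totals[key] += case["claim_value"]
    return {key: total / 1_000_000 for key, total in totals.items()}


# Illustrative records only; the values are not taken from the sample.
example_cases = [
    {"verdict_date": "2012-03-01", "complainant": "A", "outcome": "loss", "claim_value": 300_000.0},
    {"verdict_date": "2012-03-01", "complainant": "A", "outcome": "loss", "claim_value": 200_000.0},
    {"verdict_date": "2012-03-01", "complainant": "B", "outcome": "win", "claim_value": 50_000.0},
]
aggregated = aggregate_claims(example_cases)
```

Cases decided on different dates or with different outcome types stay in separate groups, which matches the limitation of the aggregation noted in the discussion of the third model.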

Second, we capture the costs of the court’s proceeding by the variable direct_court_costs. It is a proxy for the true cost of litigation because it does not cover all costs of the dispute but only the part repaid to the winner after a positive verdict. In fact, the true financial cost of the suit to the litigants is higher. Unfortunately, the information about these costs is incomplete. It can affect the results of the regressions and it inclines us to estimate the third regression without features of the cases referring to the value of the claim and costs.

Third, we decided to distinguish cases with explicit cheating from other cases. The explicit cheating cases involve the situations when: the evidence included data of non-existing buyers (e.g. a dead person), the falsified amount of purchased oil, and falsified signatures not confirmed by the buyer. For this reason, each case was thoroughly verified for the explicit cheating characteristic. The cases with evident fraud are identified by the value one for the variable explicit_cheating and zero for the cases where there is no explicit cheating. It is very important to prevent the problem of misspecification of cases, leading to incorrect conclusions about preferences in the adjudicated cases. The negative features of the case and the complainant can adversely affect the court’s verdicts (Volkov, 2016) and there are the ‘salient aspects of the case’ affecting the court’s decision (Bordalo et al., 2015).

The next group of variables describes the performance and operational characteristics of the administrative courts. The first variable, judges_load, measures the workload of the judges in a court. This variable is the annual number of new cases in the court divided by the number of judges. The variable has been added because the administrative courts in Poland differ in the number of processed cases (from 4.47 to 45.31 new cases per judge) and in the number of judges (from 13 to 149). It should be noted that the total number of cases adjudicated by one judge is three times higher because three judges are involved in each verdict. The workload can affect the sentences in at least four ways. It can make the judges tired and less eager to investigate cases thoroughly, causing the sentences to be more diversified, or it can increase the experience of judges, facilitating the decision-making process and the unification of the verdicts. The literature also indicates that the workload is negatively related to the dissent of judges, which can make the verdicts more homogeneous in a given court (Epstein et al., 2011; Narayan & Smyth, 2007). Finally, the process can be facilitated by an increase of formal rationality in a court (rationality referring to the means and procedures) at the expense of substantive rationality (rationality referring to the value of ends or outcomes).

The second variable, effectiveness, is constructed as a ratio of the finished cases to the total number of new cases in a given year. It measures the degree of purpose fulfilment in a court which is marked by the termination of the cases with the announcement of the final judgment. The more efficient the court is the more unified its rulings should be. There is some recent evidence which suggests that a delay in the settling of the cases can lower the quality of verdicts (Clark et al., 2018).
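The two court-level ratios defined above can be written down as a short sketch. The function names and the example figures are illustrative assumptions, not values from the sample.

```python
def judges_load(new_cases: int, judges: int) -> float:
    """Annual number of new cases in a court divided by the number of judges."""
    if judges <= 0:
        raise ValueError("a court must have at least one judge")
    return new_cases / judges


def cases_heard_per_judge(new_cases: int, judges: int, panel_size: int = 3) -> float:
    """Cases in which one judge sits, given that each verdict involves three judges."""
    return panel_size * judges_load(new_cases, judges)


def effectiveness(finished_cases: int, new_cases: int) -> float:
    """Ratio of finished cases to new cases in a given year."""
    if new_cases <= 0:
        raise ValueError("the number of new cases must be positive")
    return finished_cases / new_cases


# Hypothetical court-year: 900 new cases, 945 finished cases, 20 judges.
load = judges_load(900, 20)
heard = cases_heard_per_judge(900, 20)
eff = effectiveness(945, 900)
```

A value of effectiveness above one means that the court closed more cases than it received in that year, reducing its backlog.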

Finally, we add the variable revenue_to_expenditure representing the court’s bias toward collecting revenue for the government budget. It should be noted that court expenditures are proportional to the staff size, which is roughly in line with the number of judges employed. If the structure of adjudicated cases is roughly the same in all courts, then the high value of the variable corroborates the preference for the sentences on behalf of the state authorities regardless of the case matter.

We add several control variables to capture the scope for disputes between the tax collection agency and the taxpayer. They include the sales of heating oil in the territory of the court, oil_sales (higher sales provide more opportunities for improper documentation); debt_collection_time, measuring the number of days needed by the tax collection agency to collect tax arrears in a given territory (the tax collection agencies differ in their effectiveness of tax enforcement); and two measures of imputed tax (new tax imposed on the taxpayer as a result of an audit): imputed_tax_per_taxpayer and imputed_tax_per_audit. The last three measures, describing the behaviour of the tax collection agency, are available only for some years, so their values are taken from the end of our data period, and they cover all taxes (not only those related to the taxation of oil). They are proxies for the effectiveness of tax collection agencies in different provinces.

To check the impact of the legal tradition of a court’s territory on its verdicts (capturing the path dependence), we use a dummy variable German_partition, which represents the tradition of German law on Polish territory and covers the majority of courts and cases in the sample. The use of the German partition dummy alone helps to distinguish the most prevalent legal tradition in Poland (including 9 courts) from the two others (including 7 courts) in the logistic regressions. The sample divides as follows: 295 cases from courts in the former German partition, 146 cases from courts in the former Russian partition and 23 cases from courts in the former Austrian partition. The small number of cases from courts in the former Austrian partition is probably caused, as Becker et al. (2016) have shown, by the greater confidence in the local administration. Perhaps taxpayers in this territory do not frequently appeal against the decisions of the tax authority because they consider the administrative decisions to be right. The descriptive statistics of continuous variables are presented in Table 1, and the descriptive statistics of cases split between the German and non-German partitions are included in Tables 2 and 3.

Table 1 Descriptive statistics of continuous variables for the whole sample
Table 2 Descriptive statistics of continuous variables for German partition
Table 3 Descriptive statistics of continuous variables for non-German partition (Russian and Austrian)

In general, one can observe that average oil sales and taxes imposed by tax authorities are lower in the German partition but the value of the claim is similar to the average. Judges in the German partition resolve more cases on average, and the courts’ effectiveness is slightly higher than in the Russian and Austrian partitions.

4.2 Econometric results

Three logistic regressions have been estimated (Table 4). In the first model, we include the three groups of variables with only binomial sentences (win or loss) as the dependent variable. The second model applies the ordered logistic regression to better distinguish between a win, a partial win (when a court’s decision is in favour of the complainant but due to procedural reasons), and a loss. This second regression can be understood as a robustness check of the ordinary logistic model, taking into account the more detailed description of the court’s reasoning affecting the sentence. Winning for procedural reasons triggers the reconsideration of the case by the tax authorities, which may lead to a rethinking of the legitimacy of the decision taken in favour of the complainant. The third model once again uses logistic regression but after dropping the value of the claim and the court costs. The value of these variables is uncertain because they come from the aggregation of similar cases. Not all relevant cases are necessarily included in an aggregated case (due to the limited information), and sometimes aggregation is impossible when the sentences are issued on different dates or when they include different types of sentences for procedural reasons. Therefore, the only variable related to the case characteristics that remains in the third regression is explicit_cheating, confirming the attempt of fraud.
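For reference, the binary and ordered specifications can be written in a conventional form. The notation below is an assumption of this sketch and is not taken from the estimation output: $x_i$ collects the explanatory variables of case $i$, $\Lambda$ is the logistic distribution function, and $\kappa_0 < \kappa_1$ are the cut-off points of the ordered model, with the outcome coded 0 for a loss, 1 for a partial win and 2 for a full win.

```latex
% Binary logit (models 1 and 3): y_i = 1 if the taxpayer wins
\Pr(y_i = 1 \mid x_i) = \Lambda(\alpha + x_i'\beta),
\qquad \Lambda(z) = \frac{1}{1 + e^{-z}}

% Ordered logit (model 2): y_i \in \{0, 1, 2\}
\Pr(y_i \le j \mid x_i) = \Lambda(\kappa_j - x_i'\beta),
\qquad j \in \{0, 1\}
```

Under this sign convention a positive coefficient shifts probability towards outcomes more favourable to the taxpayer in both specifications, and the cut-off points play the role of the constant in the ordered model.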

Table 4 The results of the models’ estimations

The first model is referred to as “basic” because it is characterized by the lowest AIC and the best classification of the cases (about 81 per cent of the cases are properly predicted). Incidentally, the lower probability in the Hosmer–Lemeshow test indicates that the form of the basic model is not fitted to the data as well as in the two other models. However, this does not seem relevant given the high classification ability of the model. The specification is correct according to a Linktest. A Linktest compares the explanation of the dependent variable by the linear predicted value and by the linear predicted value squared. To pass the test, the first should be significant while the second should not. This condition is met in all estimated regressions, but the fit of the first model is the best. The signs of the estimated parameters are consistent in all of the models.
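The Linktest described above amounts to an auxiliary regression, which can be sketched as follows. The symbols are assumptions introduced for this illustration: $\hat{\eta}_i = \hat{\alpha} + x_i'\hat{\beta}$ is the linear prediction from the fitted model and $\Lambda$ is the logistic distribution function.

```latex
\Pr(y_i = 1) = \Lambda\left(\gamma_0 + \gamma_1 \hat{\eta}_i + \gamma_2 \hat{\eta}_i^{2}\right)

% The specification passes the test when
% H_0: \gamma_1 = 0 is rejected (the linear prediction is significant) and
% H_0: \gamma_2 = 0 is not rejected (the squared prediction adds nothing).
```

A significant $\gamma_2$ would suggest omitted variables or an incorrect link function, since the squared prediction should have no explanatory power in a correctly specified model.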

The coefficient of the value of the claim (value_of_claim) is negative and significant at the 5 per cent significance level in the first model and at the 1 per cent significance level in the second model. The courts are more likely to favour the government if the claim is of great value. This is not surprising because minor tax understatements are easier to justify as accidental errors and omissions, which is not the case for major understatements. This contradicts the hypothesis that the courts strive to avoid doing great harm to the taxpayers. Also, as expected, an evident fraud increases the probability of the taxpayer’s loss.

All but one of the performance and operational characteristics positively affect the verdicts. The one exception is the relation of revenues to expenditures. The impact of revenue_to_expenditure is negative in all of the models, though in the last model it is significant only at the 10 per cent level. Therefore, we cannot rule out the differentiation of the courts in their pro-fiscal attitude. A court with a higher revenue to expenditure ratio (i.e. focused pro-fiscally) is more likely to adjudicate in favour of the tax authorities. This indicates that the administrative courts’ judgments can be guided by factors other than legality or fairness, regardless of the type of the case (taxpayers, non-taxpayers, a small or a large value of the claim). A more likely scenario is that some courts manifest a preference for the government party in all disputes.

A large number of cases per judge (judges_load) and high effectiveness increase the probability of the judgments being in favour of the complainant. Perhaps this hints that the experience and the intensive interaction between judges in the larger courts allow them to identify the essence of the dispute with greater ease. It should be noted that case dismissal is easier for the court because it is in line with the tax authorities’ argumentation and does not require additional explanation (other than the one delivered in the preceding statement of the tax authorities defending their tax decision). Therefore, a positive sentence for the complainant requires the judges to have more experience and confidence in the arguments justifying their statement, and requires more effort when justifying the decision. In larger court districts, more rulings are made, so judges are preoccupied with smaller cases, as the fraud detection mechanisms bring more cases into the court. This gives the judges a greater opportunity to decide whether the case is an example of a deliberate action or a petty error.

The higher sales (oil_sales) in the province increase the probability of a taxpayer win. This can be explained as the effect of a better understanding of the specificity of heating oil sales cases by the courts. The bureaucratic behaviour of the tax collection agency affects verdicts. A long history of tax debt collection acts in favour of the government. Similarly, a higher additional tax imposed on the taxpayer reduces the likelihood of the taxpayer winning. However, if the tax collection agencies find higher tax liabilities per taxpayer (which can suggest that there is a concentration on cases of greater value) the probability of the taxpayer winning increases. We cannot find clear evidence of differences in judgments based on the differences in wealth between particular provinces. Only in an ordered logit model does this variable turn out to be positive, indicating that in more affluent provinces the chances of the taxpayer winning are higher.

The constant is significant only in the first model, and only at the 10 per cent significance level. The constant in the third model and the cut-off points in the ordered logistic regression are insignificant. This is especially important because the constant represents the impact of the unknown factors on the court’s sentence. The insignificant constant indicates that the model includes the necessary factors affecting the courts’ rulings and that there is no preference for one of the parties that is not explained by the variables used in the model. According to the interpretation of the constant in the logistic regression, the constant indicates the tendency to treat both of the parties in the disputes equally (Whait et al., 2012).

The German tradition of law on the territory of the court is conducive to judgments negative for the taxpayer (and therefore in favour of the government), while the courts operating on the territories with the Russian and Austrian law traditions tend to judge in favour of the taxpayer (see results in Table 5). The probability of a win by the taxpayer in courts operating on the territories of the former German partition is about 20–24 percent (the lowest value is in the first model and the highest in the third model). These results support the idea that the courts in the former Russian and Austrian partitions are more favourable towards the complainants than the courts in the former German partition. In general, the latter prefers a strict interpretation of the regulations but they are not particularly ‘pro-fiscal.’ In contrast, the courts in the former Russian and Austrian partitions seem to discern the meaning of the statutory provisions instead of supporting their judgments with the strict wording of the law. It should be noted that the territories of the former German partition represent the more affluent part of Poland with better-developed infrastructure and institutions.

Table 5 The results of the different model specification

We conjecture that there can be two channels of the transmission of law tradition over time: inter-generational within-family transmission and traditions of law education in local universities. The first channel can explain the behaviour of people living in a given territory as well as of judges in the courts. It stems from institutional factors referring to the behaviour of citizens in their contact with the administration and the administrative courts operating in a given territory. The second channel affects lawyers, and especially judges, who study at local universities and practise in the local administration and courts. These experiences reinforce each other and can be responsible for the preservation of similar legal values and behaviour for a very long time.

To check the robustness of our results, we estimated similar regressions with the variable Russian_partition (second column of Table 5) and, separately, the variable Austrian_partition (third column of Table 5). Both variables are positive and significant, although Russian_partition only at the 10 per cent significance level. This is consistent with the view that courts from these parts of Poland adjudicate differently from those located in the former German partition. It should be noted that the law tradition in the Russian partition may be less established and more diversified because of the lack of administrative courts in this territory in the past.

As a second check, we added the interaction of German_partition with the date of the verdict to the model with German_partition (fourth column of Table 5). This new variable controls for the impact of the changing line of reasoning in the verdicts. This estimation yields a much higher value of the German_partition parameter without a loss of significance. The new interaction variable is also significant and positive, which is in line with expectations because the verdicts were changing in favour of complainants over time.
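The construction of such an interaction term can be sketched as below. How the date of the verdict is measured is not stated in the text, so this sketch assumes, purely for illustration, that time is counted in whole years since 2009, the start of the sample period; the function names are hypothetical.

```python
from datetime import date

SAMPLE_START_YEAR = 2009  # first year of the sample period (2009-2016)


def years_since_start(verdict_date: date) -> int:
    """Time index of the verdict: whole years since the start of the sample.

    Measuring time in years since 2009 is an assumption of this sketch.
    """
    return verdict_date.year - SAMPLE_START_YEAR


def german_partition_x_time(german_partition: int, verdict_date: date) -> int:
    """Interaction term: partition dummy (0 or 1) multiplied by the time index."""
    if german_partition not in (0, 1):
        raise ValueError("german_partition must be a 0/1 dummy")
    return german_partition * years_since_start(verdict_date)
```

The interaction is zero for all courts outside the former German partition, so its coefficient captures how the effect of the German legal tradition changes over the sample period.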

Finally, we estimated the model with four additional variables referring to the development of provinces in Poland (fifth column of Table 5). This is important because the partitions can exhibit differences in economic and social development, blurring the impact of law tradition. Two variables refer directly to the affluence of the provinces: tax_revenue_per_capita, calculated as the average of per capita revenues from Personal Income Tax and Corporate Income Tax in the counties located in a province, and Gross Domestic Product per capita (GDP_per_capita) in the province. We decided to use county-level data in the first variable because wealth is not equally distributed within a province (most wealth is concentrated in urban areas), whereas the consumption of heating oil is typical for the poorer regions outside of the metropolitan areas. In contrast, GDP per capita is calculated as the average for the province. These two variables taken together should help to distinguish the effect of affluence from the effect caused by path dependence. In general, the provinces of the former German partition are more prosperous than provinces located in the two other partitions. For example, according to Eurostat, the NUTS2 regions of the former German partition were about 18 per cent more affluent than the regions of the former Russian partition. Development can also be measured by indirect factors of social development. Therefore, we added the percentage of the population in a given province with graduate or postgraduate education (educ_higher) and average life expectancy (life_exp), computed as the average of the life expectancy of men and women, to capture the differences in social development. This model performs very well, and the German dummy is significant at the 5 per cent significance level. We expected that adding variables that better describe the characteristics of the provinces would decrease the significance of the variable representing the partition, but this effect turned out to be small.
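Two of the development controls are simple averages and can be sketched as follows. The sketch assumes unweighted means and reads the sub-provincial units as counties; the text states only that averages were taken, so both points are assumptions, and the input figures are illustrative.

```python
from statistics import mean


def tax_revenue_per_capita(county_values):
    """Provincial average of county-level PIT and CIT revenue per capita.

    An unweighted mean over counties is an assumption of this sketch.
    """
    return mean(county_values)


def life_exp(men: float, women: float) -> float:
    """Average life expectancy of men and women in a province."""
    return (men + women) / 2


# Illustrative inputs only, not values from the sample.
province_revenue = tax_revenue_per_capita([1000.0, 2000.0, 3000.0])
province_life_exp = life_exp(74.0, 82.0)
```

An unweighted county mean gives rural counties the same weight as large cities, which fits the stated aim of not letting urban wealth dominate the measure.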

In all models we applied ordinary logistic regression (OLR) because, in the basic group of models, it provided the lowest AIC. The signs of the coefficients do not change, and their significance is similar to that obtained in the basic models. The constant is significant but its sign is negative in the last model, so in general it cannot be perceived as an indicator of bias towards one of the litigants; rather, it is affected by the features of provincial development.

5 Concluding remarks

One of the important factors affecting the courts’ decisions is the legal tradition of the territory in which the administrative court operates. The German tradition of law favours legal certainty, while the courts from the former Russian and Austrian partitions are more likely to refer to the principle of justice. Interestingly, the institutional factors can be identified almost one hundred years after the end of the partition period and the unification of formal and material law, corroborating the existence of path dependence. The path dependence is resistant to the harmonization attempts of the Supreme Administrative Court and to other courts’ verdicts (the courts cite other verdicts to strengthen their position). Moreover, it is long-lasting, since it remains visible one hundred years after the unification of the country. Most likely, these differences are maintained through two channels of interaction: inter-generational within-family transmission and traditions of law education at local universities.

The results of the study indicate no preference in favour of either side in the administrative courts. One possible explanation is the way in which the positions of administrative court judges in Poland are filled. Judges of the administrative courts are recruited from civil courts after several years of judicial experience, and the transfer to the administrative judiciary is voluntary (they can simply earn more by working in an administrative court than in a civil court). Finally, the administrative courts are institutionally independent of the Minister of Justice and they administer their own budgets.

The results of the study confirm the impact of institutional factors on the courts’ decisions postulated, for example, by Weinshall-Margel (2011). They indicate that administrative courts work according to the institutional norms of the court, and that these norms are linked with the law tradition prevailing in a given territory. We showed that greater experience of the courts fosters verdicts against the tax authorities and that this happens independently of other courts. This is consistent with the postulates formulated by Cheng (2008) and Curry and Miller (2015) and with the idea of dissent formulated by Epstein et al. (2011).

Based on the conducted research, one can formulate some important practical postulates. First, similar cases should be grouped together and resolved in larger courts. This extends the possibilities of exchanging opinions and obtaining relevant experience from other judges. It helps to better balance the certainty of law and justice in controversial cases which impose serious consequences on the complainants.

Second, judges should be able to exchange experiences on a national scale, to enable uniformity of jurisprudence. Otherwise, due to the existing diversity of legal traditions and diversity of experiences in consideration for a given category of cases, the case-law in individual areas of local jurisdiction will be subject to differentiation. This process will be strengthened by the preference for consistency in the adjudication of similar cases in a court along the path dependence. To overcome the existing differentiation of verdicts the binding interpretation should be implemented by higher courts. It is most likely that without such coordination, the direction of the differentiation will stay in line with the legal tradition in a given territory.

Third, courts should monitor whether their verdicts favour one of the parties to the dispute when compared with other courts. This would require the selection of similar cases and a statistical analysis of the verdicts. Such monitoring could curb potential bias in judgments in favour of one of the parties by introducing an element of early warning against the occurrence of such disproportion.
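One simple form that such a statistical comparison could take is a two-proportion z-test on taxpayer win rates in two courts. This is a minimal sketch of one possible check, not the method used in this study; it presumes the compared cases are similar, ignores case-level covariates, and uses hypothetical counts.

```python
import math


def two_proportion_z_test(wins_a: int, n_a: int, wins_b: int, n_b: int):
    """Two-sided z-test for equal taxpayer win rates in courts A and B.

    Returns (z, p_value) using the pooled-proportion standard error
    and the normal approximation.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both courts need at least one case")
    p_a = wins_a / n_a
    p_b = wins_b / n_b
    pooled = (wins_a + wins_b) / (n_a + n_b)
    variance = pooled * (1 - pooled) * (1 / n_a + 1 / n_b)
    if variance == 0:
        raise ValueError("win rates are all 0 or all 1; the test is undefined")
    z = (p_a - p_b) / math.sqrt(variance)
    # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 30 taxpayer wins in 100 cases vs 50 wins in 100 cases.
z_stat, p_val = two_proportion_z_test(30, 100, 50, 100)
```

A small p-value would only flag a disproportion worth examining; a regression with case characteristics, as in the models above, would be needed before interpreting it as bias.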