INTRODUCTION

The series of our reports (four in total, two of which consist of several parts), together with the accompanying preamble papers (five in total), is focused on the history of the emergence, development, use, relevance, and limitations of causal criteria [1–9]. What are the origins of this topic, which is important for biomedical disciplines but little covered in the Russian literature on epidemiology and evidence-based medicine?

The origins are as follows. A number of “precautions,” “points,” “viewpoints,” “guidelines,” “judgments,” “criteria,” “postulates,” etc. (essentially synonyms are listed; see the sources in [1–3, 5, 6, 8]) were developed in epidemiology (a predominantly observational discipline using an inductive approach [9]) in the 1950s–1970s to assess the causality of chronic noninfectious pathologies, that is, to validate associations, following the 19th-century Henle–Koch postulates for infectious diseases [2]. Hill’s nine criteria of causality [10] are the best known, although eight of them were merely collected together by this authoritative English medical statistician from other authors [2]. Nevertheless, the criteria of causality in the epidemiology of various fields are now almost always called “Hill’s criteria” or “Hill’s guidelines” [2, 3, 5–8] (Footnote 1).

Our publications on causality in medicine and epidemiology, as well as on methods for evaluating the truth of associations in observational disciplines per se, are based on very extensive material. Previously [3, 5, 6, 8], we noted the use of hundreds of original works devoted to the problem of causality since the 1950s, as well as of more than 40 Western manuals on epidemiology (mainly from the last decade), on statistics in biomedical disciplines, and on carcinogenesis, running to many hundreds and even thousands of pages (Oxford, Cambridge, Springer, Elsevier, etc.; the whole series, 2018 and 2019). About 30 similar Western manuals reflect the problem of causality for other disciplines: statistics and epidemiology (sic) in ecology, economics, sociology, jurisprudence, and psychology. In addition, about 30 manuals on epidemiology, evidence-based medicine, and clinical trials were in Russian or translated into Russian.

Everything possible has been analyzed with regard to the topic of causality tests and, more broadly, of proofs of causality. Therefore, it is likely that the aspects of each criterion have been set out quite exhaustively. This applies both to our reviews focused on specific criteria [3–7] and to Report 3, in two parts, which unites all the criteria [8, 9]. The present Report 4, probably in three parts, is devoted to the following problems, which have not yet been covered in detail (there were only relatively brief mentions in earlier studies [1–9]):

(1) Quantitative and qualitative modifications of, and additions to, the Hill criteria by authors “after Hill.” They have been (or are said to have been) made by a few, but authoritative, researchers in epidemiology. Looking ahead, it should be said that many of these proposals, although very well known in theory, have remained essentially unused in practice.

(2) Attempts to develop gradations of the significance of particular criteria, to determine their weights, and to rank and categorize them either from the standpoint of the philosophy of science (conceptually) or in terms of specific considerations of evidence.

(3) Criticism of methods for assessing causality on the basis of criteria and guidelines, as well as the question of how justified and, most importantly, how constructive in practical terms such criticism is.

(4) The extreme breadth of the use of the Hill criteria, in one form or another, in a variety of disciplines and by a variety of international and other reputable organizations that develop recommendations on the “Weight of evidence” (WoE) [15], assess risks, and make decisions in areas of public health and safety, despite the fact that a number of authorities (for example, K. Rothman and S. Greenland [16–22], as well as “pure” philosophers of science [23–26] and [27]) completely deny the significance of the inductive principle and, accordingly, of causality criteria for evidence in epidemiology and medicine.

(5) Other methodologies for assessing causality, based not on criteria (“guidelines,” “points,” etc.) but on other models (“causal diagram,” “causal graph,” etc.) [18, 28–35]. It is beyond our scope to go into detail about these mostly statistical, mathematical, and graphical approaches, which appear to be used infrequently, but it is advisable to become briefly acquainted with them.

The breadth of the practical use of the criteria, coupled with the parallel theoretical criticism that at times completely denies their significance, has so far been described by us only fragmentarily.

Part 1 of this report is devoted to the first point.

MODIFICATIONS OF CAUSALITY CRITERIA AFTER HILL

The classical set of Hill criteria considered by us earlier [1–9] includes nine points in the following sequence [10]:

• Strength of the association

• Consistency of the association

• Specificity of the association

• Temporality

• Biological gradient, or dose–response relationship

• Biological plausibility

• Coherence with current facts and theoretical knowledge

• Experiment

• Analogy.

Some sources contain bold assertions that dozens of authors have presented their own modifications of the causality criteria “after Hill” (1965) [10]. D.L. Weed [36] wrote not only about the “predecessors” (including the authors considered by us in Report 2 [2]) but also about the “successors” of Hill, and as many as 19 publications and ten authors are given in the latter case. However, an analysis of these works showed that the real “followers” who qualitatively modified or expanded the causal criteria after 1965 [10] can be counted on the fingers of one hand. Many of the other sources turned out to be just examples of the use of Hill’s criteria in a particular study, almost always in a truncated and shuffled form. As a result, only a few investigators, whose constructions we present below, were correctly named as Hill’s successors in the publication [36] by Douglas L. Weed, repeatedly cited by us earlier [2, 5, 6, 8], an American authority at the intersection of biomedical disciplines, law, commerce, and politics and a specialist in causal criteria [36–46].

Similarly, B. Clarke’s lecture on the philosophy of medicine [47], which we have already considered in [2], also contains other examples of “multifactorial causal schemes,” “although Hill’s [scheme] has been the most quoted and used.” An analysis of the sources cited in [47] “before” and “after” the 1965 milestone [10] once again demonstrated that some later works are not relevant to the issue, while others are only reviews and discussions, without their own modifications of the criteria. Again, in terms of “successors,” everything came down to a few well-known authors.

Hundreds of other publications analyzed did not contain any such bold statements about the multitude of authors of causal “principles” or “criteria”; at best, the same few personalities were mentioned (if they were mentioned at all). Therefore, the material presented below is probably, on the whole, exhaustive.

CRITERIA OF BRIAN MACMAHON AND THOMAS F. PUGH: 1970–1996

The World’s First Textbooks on Epidemiological Methods and Modern Epidemiology by B. MacMahon et al.

In fact, the named authors did not introduce any new causal criteria; they only truncated those known since 1965 [10], as a great many other researchers have done [38] (see also our publications [2, 5, 6, 8, 9]). However, given the great authority of B. MacMahon as a founder of epidemiological methods of the modern type (i.e., for chronic pathologies), as well as the prevalence of such statements in specialized sources [48–52], it seemed inappropriate to bypass him.

Brian MacMahon, Thomas F. Pugh, and Johannes Ipsen published the manual Epidemiologic Methods in 1960 [53]. All authors represented the Harvard School of Public Health, United States1 (list of notes comes after the main text). This manual is regarded as pioneering:

• “Of course, he [B. MacMahon] was a peerless epidemiologist. The 1960 publication of Epidemiologic Methods, the first textbook of modern epidemiology, assures that recognition” (2008) [48].

• “Probably the most influential textbook for the new post-war chronic disease epidemiology was Brian MacMahon, Thomas F. Pugh and Johannes Ipsen’s Methods, first published in 1960” (2011) [49].

• “MacMahon (with Ipsen and Pugh) imprinted the discipline [chronic disease epidemiology] with their 1960 Epidemiologic Methods” (2015) [50].

• “On the methodological side, it is sufficient to remember that no text specifically devoted to epidemiological methods was available before 1960, when the book by MacMahon and co-workers (1960) was published” (2015) [51].

• “That book—on Epidemiologic Methods—I studied more than any other in those formative years of my epidemiologic career; and I may have studied it more seriously than anyone else, ever” (2017) [52].

The last quote is from a publication by the well-known epidemiologist Olli S. Miettinen (United States), who developed, among other things, the concepts of ‘confounding by indication’ and ‘confounding by contraindication’ within the ‘temporal biases’ [5]. However, the bibliographic reference to O.S. Miettinen’s favorite bestseller in [52] contains an error.2

Thus, MacMahon et al., 1960 [53], is regarded as the first manual for the modern epidemiology of chronic diseases, which took shape after the Second World War, “when infectious pathologies were generally eliminated” [2, 63–65] (current events have revealed the prematurity of that statement). Of course, manuals on earlier epidemiology, mainly for infectious diseases, were published both abroad [61, 66–69] and in Russia [54, 70] decades before 1960, starting from the 1920s–1930s and even in the 19th century.3

As stated in note 2, MacMahon et al., 1960 [53] was reprinted further, with various coauthors4:

• 1970: Epidemiology: Principles and Methods (B. MacMahon and T.F. Pugh) [56];

• 1996: Epidemiology: Principles and Methods; second edition (B. MacMahon and D. Trichopoulos) [57];

• 1997: the same, but under the authorship of only B. MacMahon [58].

None of these publications, as already noted, were available to us.

The First Edition of B. MacMahon et al., 1960, Had No Criteria for Causality

It is believed [1, 2] that the first complete summaries of causality criteria appeared only in 1964 (five statements in the U.S. Surgeon General Report on the Effects of Smoking) [79] and in 1965 (nine points from Hill, 1965) [10]. However, almost all of these criteria had been formulated by other authors earlier, in the 1950s and early 1960s, sometimes in complexes of three to five points ([2] and the footnote above). But this concept did not enter world epidemiology until 1964–1965. Therefore, there are apparently no data (none cited by other authors) on causal inferences in the first edition of MacMahon et al., 1960 [53], but there are five “geographical” criteria that can confirm the conclusion that a disease is associated with a particular region. Specific data from [53] were reconstructed by us according to [80] and were not found elsewhere.5

A historical and bibliographic study by Zhang et al., 2004 [61], analyzed epidemiological methods and concepts in relation to manuals on this subject for the 20th century. It considered, as stated there, “MacMahon & Pugh (first edition 1960; version reviewed 1970).” The summary table shows data for the 1970 version [56], and mentions “five causal criteria in Hill’s version”: “five criteria to evaluate causal association”; “In the books of McMahon & Pugh, Susser, and Lilienfeld & Lilienfeld, we essentially find different versions of Hill’s causal criteria” [61].

As noted, not a single original edition of 1960–1997 was available to us [53, 56–58]. However, we formally found only three such Hill-type criteria in MacMahon and Pugh, 1970 [56], not five. The search for the material required a special approach.6 Thus, the criteria of causality and of “geographical” conditionality are confused in the study by Zhang et al., 2004 [61]; there are errors of both a qualitative and a quantitative (number of criteria) nature (this was also indicated in note 2). There is a similar qualitative error in Lagiou et al., 2005 [62] (italics ours): “Criteria for inferring causation from epidemiological investigations have been proposed, over the years, by several authors, including MacMahon et al., 1960.”

Three Causality Criteria in MacMahon et al., 1970 and 1996: Nothing New but One Name

The result of our reconstruction of the text (see note 6) of the 1970 manual [56] is as follows. Formally, three causal criteria are identified (the full originals are given below, since, as stated, they are difficult to find or reconstruct):

(1) “Time dependency”: “Time sequence. For a relationship to be considered causal, the events that are considered causative must precede those thought to be effects. When the sequence of events cannot be determined precisely (a frequent situation in chronic disease), at least the possibility of such a sequence must exist.”

(2) “Strength of the association” + “Biological gradient”: “Strength of the association. The stronger the association between two categories of events (for example, the higher the ratio of the incidence of B following A to the incidence of B without A), the more likely it is that the association is causal. If exposure to the suspected cause is a quantitative variable, the existence of a dose–response relationship, that is, an association in which the frequency of the effect increases as exposure to the cause increases, is usually considered to favor a causal relationship, although even in a causal relationship, such an association may not exist over the entire range of exposures to the cause.”

(3) “Biological plausibility,” although assigned by some other authors [6, 9] to “Coherence with current facts and theoretical knowledge”: “Consonance with existing knowledge. Here some considerations come into play: (a) A causal hypothesis based on epidemiologic evidence is supported by knowledge of a cellular or subcellular mechanism that makes it reasonable in the light of existing knowledge in relevant sciences. In the absence of this support, there should at least be the belief that such mechanisms are possible. (b) Evidence that the distribution of the disease in the populations follows the distribution of the supposed causal factor supports a causal hypothesis. Major discrepancies between the two patterns, not reconcilable in terms of other causal factors or explanations, tend to weaken a causal hypothesis.”

The allowance in the first point that at least the possibility of the temporal order may suffice should be recognized as original (this point and the reference to [56] were omitted in our publication on “Temporality” [5]). Point (b) of the third criterion is rather the mentioned criterion “Temporality,” or Susser’s criterion “Direction” [85–91] (more on this below). The term “consonance” in relation to the corresponding criterion was not found anywhere else in hundreds of sources.

Thus, the points from MacMahon and Pugh, 1970 [56], in fact contain five criteria, including one that is “not Hill’s.” But Zhang et al., 2004 [61], did not delve into such subtleties (nothing was stated) and clearly confused the causality criteria with the mentioned five “geographical” principles from the first edition of 1960 [53].

Only three causal criteria are also present in MacMahon and Trichopoulos, 1996 [57]. They are reproduced in the review by Scheutz and Poulsen, 1999 [59], with the emphasis that there are “only three” and that the set is therefore “simplified”:

(1) “The cause must come before the disease.”

(2) “The strength of the association.”

(3) “Concordance with current knowledge.”

Thus, our reconstruction of the criteria from the 1970 edition [56] is confirmed by the 1996 edition [57], although Scheutz and Poulsen, 1999 [59], did not pay attention to the inclusion of the “dose–effect” relationship and the concept of “direction.”

It is difficult to say who replaced the term “consonance” with “concordance” by 1996–1999; the latter term was later used in the complex of modified Hill criteria by environmental organizations [92–94]. The term “concordance” is used precisely with regard to the causality of effects, and only by the mentioned organizations, in various references [95].

Obviously, the significance of what has been considered within this section does not match the declared importance and the effort expended. MacMahon et al. did not offer any criteria of their own. But, as stated, the eulogizing tone in epidemiological sources and the constant appearance of this researcher’s name in discussions of causality forced us to conduct almost detective-like bibliographic research and to pay disproportionate attention to the issue. At least, the data presented above are the most complete among all the sources known to us and cover all the key points.

THE CRITERIA OF M.W. SUSSER: 1973–1991

Susser’s Credo for Social Epidemiology and Reliance on Public Health Goals

Mervyn Wilfred Susser (1921–2014) is regarded by the historian of epidemiology A. Morabia [50] as one of the three leading researchers who laid the foundations of a socially oriented, public-health-oriented epidemiology of chronic diseases.7 The other two figures, MacMahon and Abraham Morris Lilienfeld (1920–1984),8 are already known to us [50].

According to [50], Lilienfeld and MacMahon adhered to similar ideas when formulating a causal hypothesis for chronic pathologies. They based the latter on the triad “People, place, and time” and developed a two-stage research strategy: creating hypotheses based on the analysis of vital statistics, comparing data by demographic, geographical, or chronological parameters (remember the five “geographical” criteria in MacMahon et al., 1960 [53]), and testing hypotheses in analytical studies, especially by controlling the situation [50] (that is, according to the counterfactual scenario [7, 8]).

In the opinion of Morabia [50], Susser had in mind a different causal structure, which he outlined in lectures in 1966, as head of the department of epidemiology at Columbia University, New York, although he agreed with the two-stage strategy [50]. The premise was “the challenge to articulate to skeptical educators the need to transition to an era of chronic disease epidemiology” [97, 98].

The main thing was not vital statistics but the architecture of the epidemiological association and related concepts: confounders (interfering factors), mediation, and interaction. An “ecological model” built on the triad “agent, host, and environment” was proposed for chronic diseases. It is similar to that previously used in the epidemiology of infectious pathologies by W.H. Frost and J. Gordon [50]. But the causal model, based on the continuous interaction of the agent and the host with the environment, turned out to be too complex to study and use (a clumsy model of causality, as Susser himself pointed out [85]). As a result, this author formulated a standard set of epidemiological designs (cohort, case–control, and cross-sectional studies) and concepts (confounding, interaction, mediation, and causal inference, in addition to bias) to assess the validity of causal hypotheses [50].

Susser “imported” the use of arrow diagrams (“acyclic graphs”) from sociology in 1973 [85] to display the relationships between several variables, causal paths, and the potential effects of confounders, distorters, suppressors, clarifying variables, mediators, moderators, and component variables [50].

In short, the complex interaction of epidemiological factors with social and environmental ones, as well as the orientation of scientific findings toward public health practice, formed the basis of Susser’s causal thinking and distinguished him from other researchers [50, 97, 107]: “an epidemiologist’s responsibility for translating epidemiological data into action” [108].9

Attention was paid to both genetic and neurological factors (“different levels of causation”) [97].

The Monograph by Susser, 1973, Is Considered the First Book-Length Publication on Causality in Epidemiology and Medicine

The content of the 1966 lectures was published by Susser in 1973 as a monograph entitled Causal Thinking in the Health Sciences: Concepts and Strategies of Epidemiology [85]. This book (better known as “Causal Thinking”) quickly became recommended reading in epidemiology programs in the United States and probably worldwide; it has been translated into Chinese, Japanese, and Spanish [50].10 The manual is considered [107] (including by Susser himself [102]) to be the first book-length study on the methods of establishing causality and on causal inference. It should be clarified that, first, this refers to biomedical disciplines (since David Hume did something similar in philosophical terms as early as the 18th century [1]) and, second, to Western authors. After all, I.V. Davydovskii (see note 8) had already published the monograph “The Problem of Causality in Medicine (Etiology)” in 1962 [104]. Although it offered no methodology for solving this problem (the book is more descriptive, only raising problems), there is a certain Russian priority here. The work of Davydovskii is still a reference book for inquisitive doctors of the older and middle generations (private communications), who probably have not seen anything better.

In addition, there is at least a monograph by the sociologist H.M. Blalock, 1964 [109], entitled Causal Inference in Non-Experimental Research, although it does not deal with biomedical disciplines, judging by a review of this publication [110].

Despite its declared importance and relevance (it is still cited), the monograph by Susser, 1973 [85], apparently has not been published in an electronic version: not even traces of one can be detected. Therefore, the material from it had to be reconstructed from other publications (including those of M. Susser himself [86–88, 99–102]). With regard to the topic of our communication, this is not so important, because the main material on the criteria of causality is presented by this author in later, accessible papers of 1977–1991 and, to a small extent, in the 2001 dictionary [86–88, 99–102].

Susser’s Criteria: An Independent Refinement, in Parallel with A.B. Hill, of the Criteria from the 1964 US Surgeon General (Chief Medical Officer) Report on Smoking and Health

Such criteria were first included in the mentioned monograph of 1973 [85], and Susser later noted that he had developed them on the basis of the five criteria from the US Surgeon General’s report on the consequences of smoking [79] (formulated earlier by R.A. Stallones in the 1963 draft [111]; see also the footnote at the beginning), but independently of the nine points of Hill, 1965 [10]:

“In addressing the evidence on smoking, the report listed and described (if not very adequately and without citing the literature) five criteria for judging causality in a given association… This codification gave rise to two independent elaborations, one by Hill [10] and the other by myself [85]” [87].11

“In ignorance of Hill’s paper, I developed my own discussion of causality in order to meet the burgeoning tasks set by the multivariate age of epidemiology then emerging” [87].12

However, ignorance of the paper [10] is plausible for the 1966 lectures at Columbia University, but hardly for the 1973 monograph. It should be noted that the above clarifications about “independence” appeared only in the last work, dated 1991, of the almost 70-year-old Susser. His other publications on causality criteria (1977–1988) [86, 99–102] contain nothing of the kind, although the data from the 1973 monograph are discussed there.

Later, statements about the independence of Susser’s criteria from those of Hill were repeated by other authors [90].

Susser’s Eight Criteria for Judging Causality: Three Are Original

The first three criteria of causality listed below, which Susser referred to as absolute requirements, were taken by him from sociology. The list of points did not immediately acquire its final form; the criteria were supplemented in the 1970s–1980s [86, 99–101].

“Association” (or “Probability”). The presence of a statistically significant association (its probability) was not discussed in the 1973 monograph as an a priori criterion: after all, an association is a prerequisite for assessing causality [107]13 (sometimes there is an erroneous attribution of the point “Association” to the publication of 1973 [90]). There was no such guiding principle in the thematic publication of Susser, 1977 [99], or in his paper on causality of June 1986 [86], but the point had appeared by November 1986 [100]. The criterion is called “Association” by Susser, 1991 [87], and by other authors [89, 91, 107, 113, 114], but Susser singled it out under “Probability” in 1986 [100] and in 1988 [101]. According to the work of one of the founders of causality criteria in ecoepidemiology, G.A. Fox, 1991 [115], Susser was the first to introduce probability into the causality criteria. In our opinion, he was apparently also the last, since the effect size in epidemiology and other observational disciplines does not prove causality, especially as regards the magnitude of correlation coefficients [4, 8].

M. Susser himself also pointed out [100] the relative weakness of the criterion of statistical significance. Although the approach makes it possible to draw an approximate conclusion about how much attention to pay to a given association [87, 100], the point is not that important in decision making. Lack of significance provides quantitative, but not logical, grounds for rejecting an epidemiological hypothesis, since at least an assessment of statistical power must also be added [100, 101]:

• lack of statistical significance with sufficient power: one can falsify and reject the hypothesis according to the provisions of K. Popper;

• lack of statistical significance with insufficient power: the test is uncertain;

• the presence of statistical significance with insufficient power: the result is positive.
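This trichotomy can be made concrete with a minimal numerical sketch (ours, not Susser’s; the 10% and 15% incidences, the 200 subjects per group, and the 0.8 power threshold are hypothetical, chosen only for illustration). The fragment computes the approximate power of a two-sided two-sample test for proportions and then reads a non-significant result the way the list above does:

```python
# A minimal sketch (ours, not Susser's) of the power/significance trichotomy above.
# All numbers are hypothetical and chosen only for illustration.
from math import sqrt
from scipy.stats import norm

def approx_power(p1, p2, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test for proportions."""
    p_bar = (p1 + p2) / 2
    se0 = sqrt(2 * p_bar * (1 - p_bar) / n_per_group)                       # SE under H0
    se1 = sqrt(p1 * (1 - p1) / n_per_group + p2 * (1 - p2) / n_per_group)   # SE under H1
    z_crit = norm.ppf(1 - alpha / 2)
    return float(norm.cdf((abs(p1 - p2) - z_crit * se0) / se1))

def susser_reading(significant, power, threshold=0.8):
    """Classify a test result according to the three scenarios listed above."""
    if not significant and power >= threshold:
        return "no significance, sufficient power: falsify/reject the hypothesis (Popper)"
    if not significant and power < threshold:
        return "no significance, insufficient power: the test is uncertain"
    return "significance present: the result is positive (even with insufficient power)"

# Hypothetical example: incidences of 10% vs. 15% with 200 subjects per group.
power = approx_power(0.10, 0.15, 200)
print(f"approximate power = {power:.2f}")
print(susser_reading(significant=False, power=power))
```

With these illustrative numbers the power is only about 0.33, so a non-significant finding would fall into the second, “uncertain” category rather than count as a Popperian refutation.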

What has been said, however, is already obvious to all researchers; therefore, probably, no one except Susser introduced such an a priori obvious point into a complex of causal criteria.

Time order. This criterion clearly already had a place in the 1973 monograph [87, 101]. This name for the point “Temporality,” among a great many different synonyms and terms, was considered by us earlier as the most apt, although no one except Susser himself and those citing his works has used it [5]. The criterion, which states philosophically that the exposure must precede the effect and, epidemiologically, that appropriate latent periods must be observed, was discussed in detail by us in a separate review [5] and more briefly in [8]. All of Susser’s constructions are included in [5, 8]. Susser was apparently a pioneer in logically putting this criterion in first place [5, 8] (apart from the fact of the association itself), which was not the case in 1964 (US Surgeon General (Chief Medical Officer) Report on Smoking and Health) [79] or in 1965 (A.B. Hill) [10].

Direction. The term “Direction,” taken by Susser from the aforementioned 1964 monograph by the sociologist H.M. Blalock [109, 110], is a synonym for D. Hume’s causal property of “connection”: “repeatedly demonstrable, hence predictable, linkages existing between cause and effect” [88].

This criterion is not singled out separately by Susser in his lists of 1986–1988, where it is considered together with “Time order” [86, 101]. But “Direction” is included in the list of “Susser’s criteria” on an equal footing by Susser, 1991 [87], and in a number of publications by other authors (1994–2011) [89–91, 108].

According to S.S. Coughlin, 2010 [113], the presence of an association, time order, and direction are essential integral properties of causes, and not criteria for identifying causal associations (according to Susser). After analyzing the original publications of Susser [86, 87, 99–101], we did not find such statements in them; moreover, the association was named a criterion in 1988 [101],14 although it was included among the properties of causes. In the last relevant work known to us, Susser’s dictionary of causation from 2001 [88], “Association,” “Time order,” and “Direction” are indeed considered separately from the causal criteria, under the heading “Properties of the cause” and with reference to the work of D. Hume (1739) [88]. But his students said [our italics, A.K.] in a memorial article on the death of Susser in 2014: “Susser’s set of causal criteria prioritizes 3 elements as sine qua non [obligatory and necessary conditions]—association, time order, and direction—and then follows with 5 additional elements...” [108]15. The same is stated in the review by J.S. Kaufman and C. Poole, 2000 [107], about the causal principles of Susser (“…three criteria to the status of absolute requirements…”).16

All this vagueness about which point is a criterion, which is a “subcriterion,” and which is a “supercriterion” (an a priori condition) testifies to Susser’s continual attempts to rethink and refine his causal complex. We will consider the first three of Susser’s points as mandatory criteria, although they are included among the “properties of causes.”

According to Susser’s own account [101], he noted the inseparability of two properties of causality as early as 1973: time order and direction (that is, X leads to Y). There is no such criterion yet in the paper dated 1977 [99]; it appears in 1986 [86].

Direction emphasizes the asymmetry between cause and effect [86, 87]. This point is analyzed in detail in Susser, 1991 [87], in relation to the symmetry and asymmetry of effects, and it turns out that the criterion is intended to include checks, by all indications (see [5, 8]), for reverse causality, for the influence of a third factor, i.e., a confounder [1], and for the plausibility or improbability of the time dependence (these constructions are also reproduced in the memorial article by his students [108]).17 In other papers [86, 101], it is stated that “the direction is best seen in interventions: something added or removed, as in the results of a randomized experiment.”

Thus, in our opinion, this obvious point is unlikely to be of significant value; in any case, Susser and others give no examples of the expediency of its special application [87, 101], and it does not seem appropriate to lump third factors, reverse causation, etc., into “Direction.”

If two of the first three points (“Association” and “Direction”) were original to Susser, then the next five appeared in the 1964 Report of the US Surgeon General (Chief Medical Officer) on the consequences of smoking [79] and in the work of A.B. Hill, 1965 [10], although Susser added some provisions.

“Strength of association.” The criterion was apparently already included in the monograph of 1973 [85], and it was then repeated in all subsequent works by Susser on the causality of effects. In 1986, 1988, and 2001, it was in first place after the three obligatory points [86, 88, 101], and, for some reason, in second place, after “Consistency,” in 1977 and 1991 [87, 99]. This criterion is considered in detail in our reviews [3, 4] and more briefly in [8].

Specificity. This point was also, by all indications, considered in the 1973 monograph [85], and it was then repeated in all the works on causal approaches [86–88, 99–101]. Comprehensive information about this criterion was presented by us earlier [8].

Consistency of association (Consistency + Survivability). This criterion is considered in detail in our review [8]; its philosophical essence rests on the inductive approach: data replication is the basis of this approach to proving causality, but it does not provide evidence according to the deductive methodology of Popper. The epidemiological point is that the reproduction of data, both exact and with qualitative and quantitative modifications, reduces the likelihood of outside interference. Obtaining homogeneous results by different methods is a particularly important reinforcement of causality, since the same errors, biases, or confounders are possible in works with the same methodology [8].

The criterion already existed in the monograph dated 1973 [85]; it was called ‘Consistency on replication’ in 1977 [99] and occurred mainly in this form until 2001 [86, 88, 100, 101]. Realizing, however, that this inductive approach does not fully reflect the essence of the situation, Susser in 1991 introduced an additional deductive “subcriterion,” “Survivability” (as a subclass of “Consistency” [87]), which refers to the testing of the hypothesis by various tests. The criterion reflects changes in the design of studies and the “survival” of the hypothesis under such changes [87, 88].

“Predictive performance.” This original criterion (as applied to epidemiology, of course) was introduced by Susser in 1986 [100] in a work about the use of Popper’s approach in epidemiology (for more details about “Popperian epidemiology,” see [9]) and remained on the list in 2001 [87, 88, 102]. The approach introduces the principle of deduction into the complex of causal rules, although M. Susser criticized the attempts of “Popper’s epidemiology” to eliminate the inductive principle from this discipline in 1988 [102] (for more details, see our review [9]).

The predictive efficiency, of course, is determined deductively: by the ability of a causal hypothesis derived from an observed association to predict an unknown fact that is a consequence of this association [87, 100, 101] (exact formulation, according to [87]18).

The effectiveness of the prediction follows only from the testing and evaluation of hypotheses, which can be extracted, as stated, from the original association. Susser pointed out that J. Mill (John Stuart Mill) in 1843 did not support the “predictive idea,” since a consequence predicted from a theory carries no more proof than knowledge already known, although the methodology for establishing causality for Mill was the search for evidence, and not refutation, as it was for Popper [100, 101].

The essence of these philosophical constructions is that, if the prediction from the association is falsified (not confirmed), the fault may lie not with the untruth of the original association but with the incorrectness of the confirmatory test or with the insufficient quality of the predictive approach. But even if the prediction is confirmed, the conclusion that new data have been obtained should be made with caution (probably for the same reason) and supported by other causal criteria [100, 101].

Susser pointed out that the principle of predictive power was supported by the famous philosopher of science Imre Lakatos, who called it “excessive confirmation” and “sophisticated methodological falsification” within Popper’s approach to rejecting hypotheses [100, 101].

Here it is appropriate to end the philosophical discussion; some practical examples of “predictive efficiency” seem more important. Susser gives [100, 101] only one, anecdotal, example, which we have already analyzed earlier in [1, 2] (see note 8 in [1]). At the beginning of the studies on the association of lung cancer with smoking, there was a presumption that females were immune to this disease, since it occurred less often in women (attempts were even made to treat this cancer by administering female sex hormones [1]). But some authors made predictions based on the latent period of lung cancer and the fact that women began to smoke en masse later than men. That is, the counter-hypothesis predicted that the incidence of lung cancer in later female cohorts would increase, which was indeed observed [100, 101].

Another, unsuccessful, example is Susser’s long-standing study of the causes of peptic ulcer, which concluded that early social and environmental factors cause the pathology [85, 98]. Data on stress levels in different populations/nationalities were presented, coupled with the prevalence of peptic ulcer disease, and on these, in principle, the predictive ability was based. Time, as is known, has shown the etiology of this disease to be infectious [98].

In our reviews on the “Biological plausibility” criterion (in the thematic paper [6] and, more briefly, in [9]), one can find examples where “predictive efficiency” and predictive hypotheses, despite their attractiveness, were so lame that this led to tens of thousands of victims. It is not entirely clear why M. Susser included this “criterion” in his list and, apparently, adhered to it to the end. This is neither a criterion nor a methodological approach. How is it possible, at the particular moment when it is necessary to evaluate the truth of an association for practical steps, to use a method that gives results only somewhere and sometime in the future? Most likely, such a criterion can be applied retrospectively, by analyzing already available data of a different kind. G.A. Fox, 1991 [115], introduced Susser’s “Prognostic efficiency” into ecoepidemiology; this point was included in the list of causal criteria of the US Environmental Protection Agency (US EPA or USEPA) in 1998–2000 [117, 118], but it is no longer found in later versions [33, 93, 95] (“environmental” causality criteria are discussed below).

Consistency with current facts and theoretical knowledge (“Coherence” or “Coherence or plausibility”). The criterion appears in the 1973 monograph [85] as simple “Coherence,” with a meaning similar to the same point in the 1964 US Surgeon General (Chief Medical Officer) Report on Smoking and Health (“The coherence of the association”) [79]: “Coherence with known facts in the natural history and biology of the disease.” This is also a separate item in the set of Hill’s criteria [10].

The simple “Coherence” criterion occurred further in the works of Susser dated 1977–1991 [86, 87, 99–101], but the 2001 glossary of causality [88] expanded this guideline: “Coherence or plausibility (theoretical, factual, biological, statistical).”

Thus, there are four levels of consistency. They were first formulated in 1986 [86, 100], and then repeated in 1988 and 1991 [87, 101] (also listed in the dictionary of 2001 [88]). These points are analyzed in [6] and then expanded in [8]. In short, we can say that, according to Susser, the criterion includes the following elements of consistency: (1) with theoretical plausibility (the data must be plausible from the standpoint of the existing theory), (2) with facts, (3) with biological knowledge (i.e., “Biological plausibility”), and (4) with statistical regularities, including the dose–response dependence [86, 87, 100, 101].

As we can see, the criterion absorbed both “Biological plausibility” and “Biological gradient” (dose–effect relationship), which had a separate status in the Hill complex [10]. Thus, in fact, Susser gets ten criteria (even 11 with the subparagraph “Survivability”) against nine criteria from Hill. Seven of the latter overlap with Susser’s complex. This author lacks Hill’s “Experiment” and “Analogy” criteria [9], but adds “Association,” “Direction,” the sub-point “Survivability” (hypotheses), and “Predictive Performance.”

Was all this really necessary for a practical assessment of the truth of epidemiological associations?

The Value of the Set of Susser’s Criteria Is Questionable

Susser is regarded as an authority in Western epidemiology; his name is nearly as great as (though not equal to) that of Hill. For example, a 2498-page epidemiology manual dated 2014 states that “Hill’s [1965] paper was not replaced by later attempts to enrich it (Susser, 1977) or supplanted by attempts to limit causal inferences to deduction rather than induction (Buck, 1975; Rothman, 1988)” [119]19 (the last point, about the virtual “epidemiology of Popper,” was considered by us earlier [9]). It can be seen from the quotation that Susser’s efforts in the field of causal criteria are regarded as “attempts to enrich” them.

The same, but already in an affirmative tone, can be seen in some other sources, for example, in the 2004 Report of the US Surgeon General (Chief Medical Officer) on the consequences of smoking [120]: “Susser has significantly (extensively) improved the criteria…”. And again, “Susser’s historical analysis argues against ossified causal criteria…” [31].20

The name of Susser in relation to the criteria of causality is mentioned in the majority of Western textbooks on epidemiology; among those cited above, for example, in [18, 31, 51, 113, 114]; it is also widely represented in the two main Oxford dictionaries of epidemiology [121, 122].

The following is stated in an apologetic publication by Kaufman and Poole, 2000 [107], which is often cited:

“In this area, Susser has worked essentially alone to lengthen the list of criteria for judging causality, to arrange the criteria into hierarchical categories, to distinguish their roles in affirming and refuting causality, to explore their interrelations, and to begin to quantify their contributions to causal judgments. As his system of causal criteria becomes more elaborate, however, it has raised questions pertaining to Kuhn’s distinction between the function of scientific criteria as values or as rules.” 21

“Susser’s discussion of causal criteria occupies only a brief 22 pages in the original text [1973 monograph], but it helped spur a vigorous discussion of the use of such criteria, which has persisted unabated to the present day, including substantial refinements by Susser himself”. “Susser’s elaboration and expansion of this list over the ensuing years [1977–1991] forms the most detailed and prolonged attempt to develop criteria for causality in the field of epidemiology” [107].22

It was also pointed out [107] that the use of causal criteria is only one of the five strategies of evidence in epidemiological studies, which were included in the monograph by M. Susser, 1973 [85].23

A number of authors [107, 125] consider it important that Susser tried to introduce a hierarchy of criteria [87, 100, 101] and their weighting (a section on the hierarchy and weighting of criteria is planned for the next publication). A ‘scoring system’ [33, 125] and a point scoring system [33, 107] are mentioned, but we did not find such data in the relevant publications by Susser [86–88, 99–101]. The hierarchy consists, perhaps, in the ordered sequence of criteria in the list above.

It seems to us that the significance of M. Susser’s developments in terms of causal criteria is exaggerated; this conclusion was reached only during the preparation of this report, with intensive delving into the relevant material, and not in the earlier reviews [1–9]. Susser indeed paid a lot of attention to attempts to turn a set of criteria from a basis for judgment into certain rules, but all this does not appear very vital. The criterion “Direction,” which is a priori clear from “Temporality,” is clearly superfluous, as is the separation of the original association into a separate item. The “viability” of the hypothesis when tested by studies with other designs is also not very practical, since it is not clear how to use it in practice, how many designs are needed, and whether it is possible to draw conclusions when the results are interpreted differently. It was already mentioned above that the deductive criterion “Predictive performance,” a methodology largely focused on facts from the future, is hardly applicable when a prompt answer is needed, for example, for urgent measures in health care and in social epidemiology, of which Susser, as noted, is considered one of the founders.

As a result, we are not aware of any epidemiological work in which “Susser’s criteria” have been used for evidence. Meanwhile, there are a great many such publications for the “Hill criteria,” and some of them were analyzed by us earlier [6, 8, 9]. On the other hand, as mentioned, some of M. Susser’s guiding principles formed the basis of a special set of causal rules for ecoepidemiology [115, 117, 118, 125] (and other fields; detailed below).

UNIFIED POSTULATES OF A.S. EVANS FOR INFECTIOUS AND CHRONIC PATHOLOGIES: 1976–1993

Although the Postulates Are Known, They Are Published in Few Places

If it can be said of MacMahon and his collaborators and of Susser that their wide popularity in terms of causality criteria hardly corresponds to their real contribution, then for Alfred Spring Evans (1917–1996; United States) the situation is, in our opinion, rather the opposite. This researcher, an authoritative specialist in the causality of infectious pathologies (including viral forms of cancer, etc.), developed unified “postulates” of causality for infectious and chronic diseases combined.24

We know of three lists of the postulates, all somewhat different and all authored by A.S. Evans, much as happens with ancient chronicles or sagas. One list, complete (ten points), is included in the paper of Evans, 1976 [127]; another, shortened (eight points), is included in the paper of Evans, 1978 [128]; and the third, logically the most complete (again ten points, but more detailed), is included in the monograph of Evans, 1993 [129].

Despite the relative prominence of both Evans and his combined postulates, they are not even mentioned in most epidemiology textbooks, including some of those cited here earlier [18, 35, 51, 65, 69, 114]. There are separate references to this author (more often to his historical reviews on the Henle–Koch postulates [127, 128]) in a few sources [31, 82, 113, 120, 130, 131] (only the references already used above are listed again). But a complete list, and only in the 1976 version [127], was found by us only in the online dictionary of clinical epidemiology and evidence-based medicine by J. Gay, 2005 [132], and in only one of the two [121, 122] Oxford dictionaries of epidemiology, the one edited by J. Last [121]. The latter dictionary has been translated into Russian, and thus its translation of the postulates of Evans can be called “synodal” [121].25 We found it cited in a Russian paper on the etiology of infectious diseases [133], but not in Russian manuals on epidemiology [65, 134, 135] (and others). The 1976 list of postulates is also found on quasi-medical websites on the Internet.

Since the second list, from Evans, 1978 [128], is for some reason shortened and gives the impression of having been written “from memory,” the first and last versions are presented below.

Postulates of Evans from 1976: Only They Are Known in Other Sources

From the Oxford Dictionary of Epidemiology edited by J. Last, 2009 [121], citing [127]:

(1) Prevalence of the disease should be significantly higher in those exposed to the putative cause than in cases controls not so exposed [criterion “Association”].

(2) Exposure to the putative cause should be present more commonly in those with the disease than in controls without the disease when all risk factors are held constant [“Case–control study”].

(3) Incidence of the disease should be significantly higher in those exposed to the putative cause than in those not so exposed as shown in prospective studies [cohort study].

(4) Temporally, the disease should follow exposure to the putative agent with a distribution of incubation periods on a bell shaped curve [criterion “Temporality”].

(5) A spectrum of host responses should follow exposure to the putative agent along a logical biologic gradient from mild to severe [criterion “Biological Gradient”].

(6) A measurable host response following exposure to the putative cause should regularly appear in those lacking this before exposure (i.e., antibody, cancer cells) or should increase in magnitude if present before exposure; this pattern should not occur in persons not so exposed [“surrogate endpoints” and “Biological gradient”].

(7) Experimental reproduction of the disease should occur in higher incidence in animals or man appropriately exposed to the putative cause than in those not so exposed; this exposure may be deliberate in volunteers, experimentally induced in the laboratory, or demonstrated in a controlled regulation of natural exposure [criteria “Biological plausibility” and “Experiment”].

(8) Elimination or modification of the putative cause or of the vector carrying it should decrease the incidence of the disease (control of polluted water or smoke or removal of the specific agent).

(9) Prevention or modification of the host’s response on exposure to the putative cause should decrease or eliminate the disease (immunization, drug to lower cholesterol, specific lymphocyte transfer factor in cancer) [criterion “Counterfactual experiment”].

(10) The whole thing should make biologic and epidemiologic sense [criteria “Biological plausibility” and “Coherence with current facts and theoretical knowledge”]26.

As we can see, studies of two epidemiological designs at once, “case–control” and cohort, are added here to the list of Hill’s and Susser’s criteria. Such an extended approach would clearly enhance the evidence for causality greatly. The only question is why the two named standard types of study, which may or may not be carried out (especially together), are called “postulates” of causality.

Postulates of Evans Dated 1993: “Henle–Koch–Evans Postulates”

Evans published a monograph titled Causation and Disease: A Chronological Journey in 1993 [129]. In this work, the 76-year-old author conducted, among other things, a historical digression into the development of the principles of establishing causality, including in occupational medicine. The stages of the formation of causality criteria are considered, partly before Hill and, in detail, from his 1965 paper onward [10]. It should be noted that Evans did not name Hill (unlike Susser) in any of his earlier publications on causality [127, 128]. We assumed that the mention of Hill only in the 1993 monograph is perhaps explained by the fact that by then it was posthumous (Hill died in 1991) [2]. In any case, the reviewer of the 1993 monograph [129], M.E. Wegman, expressed clear satisfaction that “Evans quotes the great biostatistician, Sir Austen Bradford Hill, at length” [136]. Apparently, “finally” was meant.

The monograph of Evans, 1993 [129], is not freely available, but its partial display in Google Books made it possible to reconstruct the necessary material. We are not aware of sources that quote or reproduce it.

Evans himself, presenting his postulates in the 1993 monograph [129] (and specifying that he developed them in parallel with “…Hill’s (1965) paper [10], although I was not aware of his publication at the time I wrote mine”), indicates that it was not he but the lawyer Bert Black and the epidemiologist David Lilienfeld who “have proposed another set of guidelines that need to be fulfilled for epidemiological proof in toxic tort litigation”: “They term the guidelines the ‘Henle–Koch–Evans’ postulates. They represent what I termed, with tongue-in-cheek, a ‘Unified Concept of Causation’ (Evans, 1976)” [129].27

Why did 76-year-old Evans suddenly begin to speak with irony about his postulates of 1976–1978? Probably because they were not widely introduced into the causal practice of medicine from 1976 to 1993. For example, the reissue of Foundations of Epidemiology by A.M. Lilienfeld in 2015 [137] states the following: “In 1976, Evans synthesized a framework applicable to both infectious and noninfectious diseases—the Unified Concept of Causation. Epidemiologists did not, however, adopt this conceptualization.”

The original publication by B. Black and D. Lilienfeld, 1984 [138], with the postulates, published in the New York Law Journal, was found, and it does contain ten “Henle–Koch–Evans” points, but again they do not coincide with those presented as such in the monograph by Evans, 1993 [129]. Most of all, the points from [138] resemble those from Evans, 1976 [127], although the wording partly differs. This is, in effect, a fourth list of Evans’s postulates.

It remains only to present below the material given by Evans in the 1993 monograph [129].

Thus, the “Postulates of causation for occupational diseases” from Evans, 1993 [129], where it is indicated that the material was taken “from A.S. Evans, 1986” (a reference we did not find), “with permission”:

(1) Prevalence of the disease should be higher in those exposed to the putative causes in an occupational setting than in those not so exposed either in the same setting or other similar settings; if possible, this should be shown in matched controls [“Association” and “case-control” study criteria].

(2) Exposure to the putative cause should be clearly demonstrated by historical and/or laboratory data to have occurred more often in those with the disease than in those without the disease when all other factors are held constant and be shown more likely than not to have caused the disease [“Case-control” study].

(3) Risk of developing the disease should increase with the duration and intensity of exposure to the putative cause [“Biological Gradient” criterion].

(4) Incidence of the disease should be higher in those exposed to the putative cause than in those not so exposed as shown in prospective studies [cohort study].

(5) Temporally the disease should follow exposure to the putative cause in that workplace and both exposure and disease should be absent prior to starting work in that workplace [“Temporality” criterion].

(6) Other causes of the same disease outside the workplace should be excluded or, if present, the attributable risk of each exposure assessed [search for interfering factors (confounder); partly the “Specificity” criterion].

(7) A biological gradient of response to the putative cause should regularly appear or should increase following exposure to the putative causes as shown by objective evidence [“Biological Gradient” criterion].

(8) Elimination or modification of the putative cause, or the vehicle carrying it, or protection of the worker against it, should decrease the incidence of the disease [“Counterfactual experiment” criterion].

(9) Experimental reproduction of the disease should be demonstrated, if possible, in susceptible animals or humans exposed accidentally or deliberately to the putative cause [“Biological plausibility,” “Experiment,” and “Consistency with current facts and theoretical knowledge” criteria].

(10) The relationship between cause and effect should be shown in several studies, make biological and epidemiological sense, and be consistent with the natural history of the disease [“Consistency of association” criterion; “Biological plausibility,” “Experiment,” and “Coherence with current facts and theoretical knowledge” criteria].28

Still, the above list seems to be somewhat cumbersome and, in some places, excessive in terms of repetitions for practical use. On the other hand, the universality of the criteria for all pathologies makes them unique.

Evans and Mueller Guidelines: The Causality of Cancer of Viral Etiology

The following ‘guidelines’ for recognizing a virus as a putative cause of cancer are given in a 1990 paper by these authors [139]29:

Epidemiological principles.

(1) The geographic distribution of infection with the virus should be similar to that of the tumor with which it is associated when adjusted for the age of infection and the presence of cofactors known to be important in tumor development [“Association”].

(2) The presence of the viral marker (high antibody titers or antigenemia) should be higher in cases than in matched controls in the same geographic setting, as shown in case–control studies [“Case–control” study].

(3) The viral marker should precede the tumor, and a significantly higher incidence of the tumor should follow in persons with the marker than in those without it [“Temporality” and “Specificity”].

(4) Prevention of infection with the virus (vaccination) or control of the host’s response to it (such as delaying the time of infection) should decrease the incidence of the tumor [“Counterfactual experiment”].

Virological principles.

(1) The virus should be able to transform human cells in vitro into malignant ones [“Biological Plausibility,” “Coherence”].

(2) The viral genome or DNA should be demonstrated in tumor cells and not in normal cells [“Association”].

(3) The virus should be able to induce the tumor in a susceptible experimental animal and neutralization of the virus prior to injection should prevent development of the tumor [“Biological Plausibility,” “Counterfactual Experiment”].

As we can see, these guidelines cover five of Hill’s criteria, Susser’s “Association” criterion, and, as was also the case with the previous list, the mandatory epidemiological “case-control” study. This complex (as well as Hill’s criteria) is used by the IARC and the US National Cancer Institute [140].

Analogies of Evans and Katz et al.: Causality Criteria in Forensics and Epidemiology

These constructions were formulated by Evans, 1978 [128], and authentically reproduced in the epidemiology manual of R.H. Friis and T.A. Sellers, 2014 [130].

As stated in [128], “another view of the causal relationship of an agent to disease might be framed in legal terms,” and comparative material is presented (Table 1).

Table 1. Rules of evidence: criminality and causality [128] (authentically presented in textbook [130])

There are similar thoughts in the epidemiology manual by D.L. Katz et al., 2014 [141], given without references and probably independent (Table 2), although, since the first edition of this manual dates back to 1996, its authors might have been familiar with the constructions of A.S. Evans, 1978 [128].

Table 2. Analogy between the stages of a murder investigation and an epidemiological investigation (compiled from data from [141])

Tables 1 and 2 (especially Table 2) may give a somewhat fanciful impression. But, as we see, such questions and analogies are touched upon in very serious sources.

The results of epidemiological studies are often used in court hearings [142–149]. In part, these issues were considered by us earlier: the Daubert rule for a relative risk (RR) of more than two [3, 8]. The difficulty, however, is that epidemiological risks concern groups and populations, not individuals [35, 65, 81, 82, 114, 116, 119, 121, 122, 131, 137, 141] (see also [9]).

The first (2002) and third (2016) editions of R.S. Bhopal [131] (we do not have the second one) contain a discussion on this subject with a quote attributed to some work by Evans, 1978, and even a page number for the quote is given. This, however, is not the Evans, 1978, publication [128] considered above (neither the material nor the page numbers match), although both editions of Bhopal’s manual [131] contain only one such reference, namely [128]. Given this error in the source, which has been replicated for decades, we have to take R.S. Bhopal’s word that the material below is indeed from A.S. Evans, 1978 (no such publication is found in PubMed or cited in Google).

It is stated like this [131]:

“Epidemiological data are, therefore, difficult (possibly impossible) to apply in legal cases about individuals. To quote Evans discussing the issue in the United States of America: “Legal requirements are concerned with the risk in the individual, the plaintiff, and whether the preponderance of evidence supports the conclusion that that exposure ‘more likely than not’ resulted in that illness or injury in that person (1978, p. 194)”.

Evans contests that a higher order of proof and specificity is required in legal proof than in epidemiological proof, concluding that epidemiological evidence is often inapplicable in this context. Epidemiology is a science based on studies of groups and cannot be directly applicable to individuals, and this is an inherent limitation. Equally, a factor demonstrated to cause a disease in an individual, by a science of individuals, say toxicology or pathology, may not be demonstrable as harmful in the population, possibly because harmful effects are balanced by beneficial ones.30 This is an inherent limitation of a science of individuals. The problem lies not with epidemiology itself, but with those who apply epidemiology in these circumstances. The law also extrapolates from population data to the individual. The standard of proof in epidemiology is not of a lower order than in law, but it is of a different order and for a different purpose. The problem is that so often the best we can offer the individual is average risk derived from the study of groups similar to that individual. That is a limitation of medical sciences collectively. We now consider how epidemiological guidelines for causality help to analyse the causal basis of associations observed at the population level.” 31

It would seem that everything has been said and there is no way out. But here we have an example (as was the case with animal experiments [9]) of Hill’s causality criteria reaching where, logically, they should not be able to reach: from the population to the individual, i.e., epidemiological rules of causality applied to a particular person.

CRITERIA FOR CAUSATION OF EFFECTS FROM A SINGLE EPIDEMIOLOGICAL STUDY TO A SINGLE INDIVIDUAL BY P. COLE: 1997

Philip Cole (the name has one “l,” which is important, because there are many namesakes) is a researcher from the United States who, at least as of 1991, worked at the Department of Epidemiology, School of Public Health and Comprehensive Cancer Center, University of Alabama at Birmingham [150]. Since at least 1997, he has been a professor of epidemiology in that department [151]. No other information about this author could be found; the Internet turns up a number of namesakes of various ages in biochemistry and other fields. The only data found for 2020 are from the website of the already-mentioned School of Public Health, according to which Philip Cole was professor emeritus in March of that year [152]. It is impossible to establish whether this is the same P. Cole in 2020. He is at least 60 years old in the video of this author’s prophetic speech at a conference in 1996 [153].32

The area of research and activity of P. Cole includes the use of epidemiological data (expertise) in sociology and jurisprudence [142, 150, 151], in connection with which the conclusion about the individual causality of an effect, or about individual risk, is of particular importance. It is in this context that a number of authors [160–163] cite the conceptual and important publication of P. Cole, 1997 [142], titled “Causality in epidemiology, health policy and law.” This paper turned out to be inaccessible to us, but the material on the criteria of causality from it was completely reconstructed from other, albeit scattered, publications [62, 164, 165]. Some other sources contained references to the work [142] in relation to the criteria of causality [166, 167].

Different levels of evidence, up to individual, developed by P. Cole, 1997 [142], are exhaustively presented in two publications by P. Lagiou et al., 2005; 2008 [62, 164], of which the second is a chapter in a manual on the epidemiology of carcinogenesis. The recipes (“lists”) are again almost the same, differing in some details. Below are the latest data for 2008 [164], which, according to signs, are the most complete.

The gist of [62, 164] is as follows. It is indicated that the criteria for a causal relationship can be used, explicitly or implicitly, when evaluating the results of a separate (single) epidemiological population study, although in this case it is almost impossible to draw a firm conclusion. In the approach presented in P. Cole, 1997 [142], this situation is referred to as the individual study level, or level I.

More commonly, causality criteria are used to assess evidence gathered from multiple epidemiological studies and other forms of biomedical research, including experimentation. At this stage, the analysis process is inductive, moving from specifics to generalizations (multi-study level, or level II). [We called it the level of comprehensive studies, which is more accurate.]

Finally, when a causal relationship is established at level II, then and only then can the cause of the pathology in a particular person be considered (the personal, or specific-person, level; level III). At this level, the analysis process is deductive, moving from the general concept of causality to the study of what may have caused the pathology [or consequence] in a particular person.

Causality Criteria Used at the Individual Study Level (Level I) [62, 164]

A causal relationship can never be inferred from a single epidemiological study, but the likelihood that an observed relationship is causal increases if some of the following criteria are met:

(1) the minimum contribution of confounding;

(2) the minimum contribution of bias;

(3) limitation of chance variations;

(4) relatively strong association;

(5) monotonic exposure–disease association, otherwise referred to as exposure–response or dose–response association [i.e., “dose–response” relationship];

(6) internal consistency [of association] demonstrated by similarity of exposure—response patterns among subgroups of study subjects;

(7) compatibility of the temporal sequence of exposure and outcome with the presumed latency of the disease;

(8) biological plausibility, i.e., a causal relationship between exposure and disease must be at least biologically possible (cannot contradict physical theory or biological principle).

Thus, at level I, we see the use of the criteria “Association,” “Strength of Association,” “Consistency of association,” “Temporality,” “Biological Plausibility,” “Coherence with Current Facts and Theoretical Knowledge,” and “Biological Gradient.” These are six of Hill’s nine criteria plus one of Susser’s criteria. It is important that the “Consistency of association” test is applied at the level of a single study after group stratification. In this respect, there is a resemblance to the first two “geographical” criteria mentioned above (see note 5) from MacMahon et al., 1960 [53]: the presence/absence of the effect in different ethnic and occupational groups, social classes, sexes, etc.

P. Cole’s level I prescription instructs the researcher on how to handle the results of even a single study so as to make them as compelling as possible.

Causality Criteria Used at the Level of Integrated Studies (Level II) [62, 164]

Establishment of the etiologic role of a particular exposure on the occurrence of a disease ideally requires strong epidemiologic evidence, an appropriate and reproducible animal model, and documentation at the molecular or cellular level of the morphological or functional pathogenetic process. Sometimes, an intended or unintended change, or natural experiment, greatly facilitates etiologic inference: this happens when, for example, an occupational group is exposed to high levels of compounds rarely encountered in other settings,33 a religious group avoids an exposure that is otherwise widespread, or a vaccine that creates herd immunity against a particular virus turns out to reduce the incidence of a certain form of cancer.

However, these experiments, approaches, and observations are rarely performed all together. Instead, the best available biomedical evidence should be used to correctly interpret the results of several epidemiological studies. The following criteria must be taken into account here:

(1) consistency, that is, similarity of results obtained by different investigators using different study designs in different populations;

(2) for weak associations the biomedical evidence [“Biological Plausibility”] must be overwhelming, whereas for very strong associations reliance on powerful biomedical knowledge is less critical;

(3) compatibility of exposure-response patterns across different studies exploring the exposure–disease association in different exposure ranges;

(4) coherence, which requires results from analytic epidemiologic studies to be compatible with ecologic patterns and time trends, such as the increasing incidence of lung cancer over time, following the increasing use of tobacco products by the population;

(5) specificity, which exists when one type of disease is consistently linked with one type of exposure rather than several exposures all being associated with a certain disease, or one type of exposure being associated with several diseases;

(6) biological analogy, which exists when a similar exposure has been shown to cause a similar disease in another species or a different form of the disease in humans. For example, viruses have been shown to cause leukemia in several animal species and at least one rare form of leukemia in humans.

It is noted that none of these criteria can be considered as absolutely necessary for a causal inference (sine qua non). But the evidence for causation is strengthened when most of them are fulfilled [62, 164].

Therefore, almost all of Hill’s criteria are used at the second level. Only two are missing: “Strength of association” and “Temporality,” which should have been analyzed at level I for each individual study.

Causality Criteria Used at the Personal Level of a Specific Person (Level III) [62, 164]

Causality can be conclusively established between a particular exposure as an entity and a particular disease as an entity. In contrast, it is not possible to establish such a link conclusively between an exposure and a particular disease of a given individual—for example, smoking in a patient with lung cancer. It is possible, however, to infer deductively that the specific individual’s illness was more likely than not caused by the specified exposure.

To draw this conclusion, all of the following criteria must be met [142]:

(1) The exposure under consideration, as an entity, must be an established cause of the disease under consideration, as an entity (level II).

(2) The relevant exposure of the particular individual must have properties comparable (in terms of intensity, duration, associated latency, etc.) to those that have been shown to cause the disease under consideration.

(3) The disease of the specified person must be identical to, or within the symptomatological spectrum of, the disease that, as an entity, has been etiologically linked to the exposure.

(4) The patient must not have been exposed to another established or likely cause of this disease. If the patient has been exposed to both the factor under consideration (for example, smoking) and to another causal factor (for example, asbestos), individual attribution becomes a function of several relative risks, all versus the completely unexposed:

(a) RR of those who only had the exposure under consideration;

(b) RR of those who had only been exposed to the other causal factor(s);

(c) RR of those who have had a combination of these exposures;

(5) RR should be reasonably elevated (e.g., 2 or more).

The last criterion stems from the fact that RR includes a base component “1” that characterizes the unexposed plus another component that applies only to the exposed. When 1 < RR < 2, an exposed person who develops the disease is more likely to have become ill for reasons other than the exposure. For example, if a 55-year-old male smoker has a 6% risk of a first heart attack in the next five years, and a nonsmoker of the same age has a 4% risk (RR = 1.5), then only 33% of the smoker’s risk (i.e., 1/3 of the total 6%) can be attributed to his smoking. But when RR > 2, the particular person who was exposed and acquired the pathology in question is more likely to have become ill from the exposure than from other causes [62, 164].
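The arithmetic behind this threshold can be written out explicitly. The following is a minimal sketch of our own (not taken from [62, 142, 164]); the function name is hypothetical, and the figures simply reproduce the example above.

```python
# Minimal sketch of the attributable-fraction arithmetic behind criterion (5):
# AF = (RR - 1) / RR, which exceeds 50% ("more likely than not") only when RR > 2.

def attributable_fraction(rr: float) -> float:
    """Fraction of an exposed case's risk attributable to the exposure."""
    if rr <= 1.0:
        raise ValueError("RR must exceed 1 for a positive attributable fraction")
    return (rr - 1.0) / rr

rr_smoker = 0.06 / 0.04                     # 6% vs. 4% five-year risk, RR = 1.5
print(attributable_fraction(rr_smoker))     # ~0.33, i.e., 1/3 of the smoker's risk
print(attributable_fraction(2.0))           # 0.50, the "more likely than not" threshold
```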

Constructions with the value of RR, on which the “Daubert rule” used in the US courts is also based, were considered in more detail earlier [3, 8].

In our opinion, the little-known and little-cited guidance of P. Cole, 1997 [142], is of great importance for the various expert councils that establish causal relationships between occupational exposures and pathologies. Everything is broken down into points. The name of P. Cole (the epidemiologist) was not found in the Russian literature, in most Western textbooks on epidemiology, or in the two previously mentioned Oxford dictionaries of this discipline [121, 122].

So, to reiterate, Hill’s ubiquitous criteria were able to reach even the level of a particular individual.

PRINCIPLES FOR ESTABLISHING INDIVIDUAL CAUSALITY IN MEDICAL EXPERTISE: R.E. GOTS (1986)

In principle, it would be more appropriate to consider these principles [169] in the previous subsection, alongside the similar constructions of P. Cole from epidemiology. Moreover, the formal chronology of presentation requires that R.E. Gots, 1986 [169], come before P. Cole, 1997 [142]. But the singularity and exclusivity of guidelines for establishing probabilistic causation for an individual personally (medicine, expert councils, and jurisprudence) made a separate subsection appropriate. At the same time, the constructions of R.E. Gots are still inferior to those of P. Cole.

The publication by R.E. Gots, 1986 [169], discusses the issues of causality in medicine as applied to forensic practice. The frequent absurdity of the claimed chemical agent–pathology (cancer, etc.) links is pointed out. It is noted that “It takes very little in the way of ill-founded testimony by a physician to support a jury’s belief, despite the lack of any scientific validity for that belief” [169].34 In this regard, the author lists the following “proper principles of the methodology of causation analysis” for an individual, which are in many ways similar to the criteria of Hill and the criteria of Susser. Thus [169]:

“Can the agent in question produce the disease at issue [in an individual]?

(1) Is there substantial and properly relevant animal data? [“Biological plausibility” criterion.]

(2) Is there human evidence, particularly epidemiological support? [Partly the “Association” criterion.]

Did it cause it in this case?

(1) Have other causes been properly considered and ruled out? [The ‘Lack of alternative explanations’ criterion, which is discussed below.]

(2) Has the exposure been confirmed?

(3) Was the exposure sufficient in duration and concentration? [“Strength of association” criterion.]

(4) Was the clinical pattern appropriate? [According to Hill [10], “Coherence with known facts from the natural history and biology of the disease” criterion.]

(5) Is the morphological pattern appropriate? [According to Hill [10], “Coherence with known facts from the natural history and biology of the disease” criterion.]

(6) Is the temporal relationship appropriate? [“Temporality” criterion in philosophical terms.]

(7) Does the latent period of the disease correspond? Is the latency appropriate? [“Temporality” criterion in epidemiological terms.]35

Paper [169] does not mention either Hill or Susser, but there is a reference to the postulates of Henle–Koch (see [2]) for infectious diseases and to the above postulates of causality from Evans [127] for all pathologies. It is clear that the list of Gots [169] implies four of Hill’s criteria.

The study by Gots, 1986 [169], is included in USEPA’s extensive historical online data summary of personalities who developed the rules of causality in epidemiology, toxicology, and ecology [170] (part of this summary is also presented in the monograph on causality in ecology [33]).

CAUSALITY CRITERIA “BY K. POPPER” AND HELP FROM META-ANALYSIS: D.L. WEED (1985–2008)

Douglas L. Weed, an epidemiologist and expert on the causation of effects in various fields, constantly cited by us in almost all previous reports and reviews on the topic [1–3, 5–9], is our contemporary (his latest publications in PubMed are dated 2018).36

The first paper by Weed registered in PubMed appeared in 1983 and was focused on ethics in preventive medicine, but this author soon delved deeply into the problems of causality in epidemiology and, in the 1980s, became deeply interested in, let us say, “the epidemiology of K. Popper” (1985–1988) [36, 37, 173–175], rare echoes of which occurred in his works in 1997 [39] and even in 2008 [176]. But, as we assessed earlier [9], ‘Popperian Epidemiology’, initially fashionable and certainly sound in terms of the philosophy of evidence but virtual and far-fetched in practice, had almost vanished by the 2000s–2010s. In any case, only a few lines, with all the same references from the 1970s–1980s, are devoted to this issue in many voluminous manuals on epidemiology [9].

The essence of ‘Popperian Epidemiology’ has already been described by us [9]: the main thing is to have an initial hypothesis before starting an observational study, which must then be subjected to refutation, successful or not. If the hypothesis is not refuted, a new approach must be found to try to refute it, or a new hypothesis must be put forward, which in turn must also be subjected to refutation, and so on. Confirmation is impossible, because it would be unscientific and untrue (see, for example, ‘Popper–non-Popper Epidemiology’ in the collection of materials of the 1988 symposium [177]).37

Hill’s predominantly inductive criteria do not support the approaches of K. Popper (only “Temporal dependence” is uniquely deductive [36]), and, therefore, in the second half of the 1980s D.L. Weed regarded epidemiology as “neglecting deductive logic,” which logically “is the central [element] of all scientific progress,” adding that “induction is a logic whose foundation is shaky; therefore, it is weak logic for science” [37]. Along the way, the views of the inductivists, who argue that the probability of a theory being true increases as evidence accumulates in its favor and that science is interested not in refuting existing hypotheses (we add: according to K. Popper, it is unclear how they arose in the mind in the first place) but in creating new ones, were rejected [178].38

In 1988, Weed derived from his constructions two additional criteria, a “Popperian alternative to Hill’s criteria” [36, 175]: “Predictability” and “Testability.”

“Predictability” means that once a causal hypothesis has been proposed, certain predictions can be deduced from it in preparation for comparing them with empirical observations. This criterion does not depend on the particular form of the causal hypothesis, and the basic methodology will always be the same: one must propose a hypothesis and make predictions from it. However, this criterion alone is not enough, since the hypotheses must be testable. The strategy for improving the testability of a hypothesis is to increase the accuracy of its predictions [36].

We should note that this proposal of Weed’s repeats Susser’s “Predictive performance” criterion discussed above [87, 88, 100, 102].

“Testability” means that more accurate predictions show not only which observations are compatible with the hypothesis, but also which observations are inconsistent with it, i.e., which observations test the hypothesis [36].

Weed notes that all of Hill’s criteria, with the exception of “Analogy,” fall under the two criteria “by K. Popper” proposed above (“Analogy” is a way of coming up with a hypothesis) [36]. “The criteria for predictability and testability are better at explaining and correcting things. Therefore, they represent progress in methodological knowledge” [175].

However, the author himself sensed a certain artificiality in his constructions when applied in practice to assessing the causality of an effect for a subsequent response. (“Whether the criteria of predictability and testability make our lives more complex. In my opinion, the answer is both no and yes.”) [175].

Despite Weed’s attempts, at some length, to defend the practicality and applicability of the above criteria [36, 174, 175], their use and dissemination in epidemiology has clearly been called into question. We no longer found the criteria “Predictability” and “Testability” in this researcher’s later works on the causality of effects, with the exception of a brief mention in a 2008 paper [176].

But Weed has not stopped [46] his research on developing and improving the set of Hill’s criteria, apparently attaching great importance to them. The latter matters, given this author’s authority in the practice of establishing causation not only in epidemiology, but also in oncology, sociology, jurisprudence, etc. (see note 33). In 2000, Weed proposed adding meta-analysis and systematic review to Hill’s criteria [41], and a paper of his was then published on the prospects of using meta-analysis to strengthen the evidentiary value of the criteria [42]. In [42], the author goes through all of Hill’s criteria in turn.

The function of meta-analysis for “Strength of association” is trivial: it provides a pooled risk estimate across a number of studies that is closest to the truth. In addition, D.L. Weed reiterates that it is appropriate to consider an association causal only for RRs of 2.0 and above [42]; such constructions were discussed in detail earlier [3, 8].

The reproducibility of the effect for “Consistency” is usually assessed (under different conditions, by different authors, with different designs, etc.; see [8]) by calculating the percentage of studies with positive and negative results, using either a simple majority or a certain threshold. The result may thus depend on the chosen evaluation principle. Meta-analysis, by contrast, has a standardized methodology and well-established pooling models chosen according to the degree of heterogeneity of the samples [179].
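To illustrate the difference, here is a minimal sketch of our own (not Weed’s [42]) contrasting simple “vote counting” with a fixed-effect, inverse-variance pooled estimate; the study values are hypothetical.

```python
# Sketch: "vote counting" vs. an inverse-variance (fixed-effect) pooled log RR.
import math

studies = [  # (RR, 95% CI lower, 95% CI upper) -- hypothetical values
    (1.8, 1.1, 2.9),
    (1.3, 0.9, 1.9),
    (2.4, 1.4, 4.1),
]

weights, weighted_sum = [], 0.0
for rr, lo, hi in studies:
    log_rr = math.log(rr)
    se = (math.log(hi) - math.log(lo)) / (2 * 1.96)  # SE recovered from the CI width
    w = 1.0 / se ** 2                                # inverse-variance weight
    weights.append(w)
    weighted_sum += w * log_rr

pooled_rr = math.exp(weighted_sum / sum(weights))
positive = sum(1 for _, lo, _ in studies if lo > 1.0)  # crude "vote counting"
print(f"Studies 'positive' by vote counting: {positive}/{len(studies)}")
print(f"Fixed-effect pooled RR: {pooled_rr:.2f}")
```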

Weed also finds a trivial function of meta-analysis for “Biological plausibility,” which rests on various experimental and observational confirmations at all levels of biological organization [6, 9]: it helps obtain the best pooled estimate from a set of similar studies. The author notes that it “seems unlikely that meta-analysis in its current quantitative form will be useful for summarizing different kinds of studies from different levels of biological knowledge” [42]. This, however, overlooks such an approach as “Bayesian meta-analysis,” which is based on integrating data from different disciplines [180, 181] (see also our reviews [6, 9]).

The “Biological gradient” criterion, i.e., the “dose–response” relationship, can be directly related to a meta-analysis that combines studies with different exposure levels [42]. Indeed, there is a dedicated approach: dose–response meta-analysis [35, 119, 182, 183].
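As an illustration only (not taken from [42] or [182, 183]), the core idea can be sketched as an inverse-variance-weighted trend of log RR against exposure level; real dose–response meta-analysis additionally handles within-study correlation of estimates, and all values below are hypothetical.

```python
# Sketch: weighted linear trend of log RR on exposure level across study strata.
import math

# (exposure level, RR, standard error of log RR) -- hypothetical strata
data = [(0.0, 1.0, 0.10), (10.0, 1.2, 0.12), (20.0, 1.5, 0.15), (40.0, 2.1, 0.20)]

w = [1.0 / se ** 2 for _, _, se in data]      # inverse-variance weights
x = [dose for dose, _, _ in data]
y = [math.log(rr) for _, rr, _ in data]

sw = sum(w)
xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
num = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
den = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
slope = num / den

print(f"Estimated increase in log RR per unit of exposure: {slope:.4f}")
print(f"Implied RR per 10 units of exposure: {math.exp(10 * slope):.2f}")
```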

For the remaining criteria, even the author’s inventiveness in [42] did not extend to applying meta-analysis to them.

“THE CONSEQUENCE [OF CHOOSING]” CRITERION: J. OLSEN AND U.J. JENSEN, 2019

A recent Danish review of Hill’s criteria is given by Jorn Olsen and Uffe Juul Jensen [184], respectively a seemingly respected epidemiologist [51, 55] and a philosopher of science from Aarhus (see online). It argues that some points are outdated and need revision, but gives no indication of which ones, so that, as in the classics, “for some reason there is no address.” In [184] we once again encounter the philosophical absurdity of considering randomized controlled trials within the framework of observational criteria, i.e., of what constitutes the ultimate proof of an effect and requires no other criteria (we discussed this in detail in [9]).

Finally, a kind of social criterion, “Consequence” (or “Consequence”–‘Sequence’), is proposed. It is said that “acting in accordance with adequate procedural criteria does not always secure that adopted consequences will be accepted (‘in real life’) as appropriate (right, just, fair etc.)” and that it is necessary to take into account the consequences of a decision, i.e., whether society and its institutions will or will not act on its basis [184].

The authors of [184] gave only two examples as attempts to explain what is meant (although it is clear enough anyway). One is from Hill, 1965 [10], according to which “very strong evidence” was needed in order to have “made people burn a fuel in their homes that they do not like or stop smoking the cigarettes and eating the fats and sugar that they do like.” The second example concerns the decision to introduce a new vaccine, which, despite limited data on it, should be taken if the risk of inaction is considered to outweigh the risk of using the vaccine [184] (relevant now).

In short, the authors of [184] propose to include the researcher’s responsibility for this assessment in the set of criteria for assessing the causality of an effect. This approach takes the process of determining causality beyond not only the scientific sphere but even the previously considered “precautionary principle” for sociology and healthcare [1]. The result is some kind of reinforced, militant “precautionary principle.”

It is unlikely that such developments are of practical importance, since it is not clear how to perform them in reality, as in the case of the guidelines of Susser and Weed.

CRITERIA OF D.L. SACKETT “BIOLOGICAL SENSE” AND “EPIDEMIOLOGICAL SENSE” (1978)

A.R. Feinstein, 1979 [185], D.L. Weed, 1988 [36], and D.L. Weed and S.D. Hursting, 1998 [40], indicate that David L. Sackett proposed special causal criteria in a paper included in a 1978 collection on epidemiology [186] (the publication is not available to us). D.L. Sackett is considered a founder of both clinical epidemiology and evidence-based medicine [119]. According to [36], D.L. Sackett [186] proposed replacing the criterion “Biological plausibility” with “Biologic sense” and the criterion “Consistency”–“Coherence” [with current facts and theoretical knowledge] with “Epidemiologic sense.” We missed this point in previous publications discussing these criteria [6, 9], although it should be said that this proposal did not lead to any “change of milestones” and hardly ever appears anywhere else. Indeed, it is difficult to reduce either criterion to a purely biological or purely epidemiological meaning.

However, those wishing to use such causality criteria in their research can always refer to an authoritative publication [186].

NO ALTERNATIVE EXPLANATIONS AND THE BRISTOL CRITERIA: 1980–2019

Meaning of the Term

The item “alternative explanations” is usually not included among the causality criteria in epidemiology textbooks (e.g., [114, 187]); instead, it is specified that there are “three alternatives” explaining an association: biases, confounding factors (confounders), and random error (chance). In the Oxford Dictionary of Epidemiology, the term is associated only with confounders [122], which is a flaw. In general, the construction “alternative explanations” without clarification (probably because it is considered self-evident) is found in many epidemiological sources (for example, in manuals [119, 131, 141]). An example of the use of this principle in a list of principles of individual causality for medicine and forensic practice was discussed above [169].

The Point “No Alternative Explanations” as Equivalent among Hill’s Criteria

Over the past few decades, the causality criteria used by the IARC have essentially amounted to the traditional Hill complex (1987–2012) [140, 188, 189] (and others). But the following set of criteria is presented, in a particular sequence, in a 1980 publication by N.E. Breslow and N.E. Day [190]:

(1) ‘Dose response’;

(2) ‘Specificity of risk to disease subgroups’;

(3) ‘Specificity of risk to exposure subcategories’;

(4) ‘Strength of association’;

(5) ‘Temporal relation of risk to exposure’;

(6) ‘Lack of alternative explanations’—that is, confounders and biases that affect the association.

(7) “Points that cannot be considered on the basis of a single study” (‘Considerations external to the study’)—“Consistency of association,” “Counterfactual experiment,” “Biological plausibility” + “Coherence with current facts and theoretical knowledge.”

It can be seen that all of Hill’s criteria are included except “Analogy,” but another point, “Alternative explanations,” is added, requiring proof of the absence of interfering factors (confounders) and biases.

The named complex is given in the IARC-1980 document [190] with reference to the 1964 Report on Smoking and Health [79], the publication of Hill, 1965 [10], and the work of J. Cornfield et al., 1959 [191]. The last author is one of the founders of the theory of confounding factors (“the third factor” [191]) with the corresponding gradation of RR, which we considered earlier [3, 8]. It appears, then, that the introduction of the point about “alternative explanations” of the association is probably an attempt to add J. Cornfield’s constructions to Hill’s criteria.

We found a paper by the same authors with a similar title, N.E. Breslow et al., 1978 [192], but it does not contain any criteria for causality. Thus, according to our data, the IARC-1980 publication [190] is the first to introduce the criterion “Absence of alternative explanations.”

Meanwhile, this point is superfluous according to the logic and philosophy of causality. It resembles the fairy tale about “soup/porridge from an ax”: after all, all the causality criteria for observational studies (Hill’s, etc.) are aimed precisely at assessing the probability that chance, confounders, and biases are absent from the association [58]. Correct arguments that it is Hill’s criteria themselves that serve to eliminate “alternative explanations” were found by us only in the epidemiology manual by M. Szklo and F.J. Nieto, 2019 [35]. After all, it is impossible to eliminate completely unknown confounders under any approach, even in a randomized controlled trial (randomization is aimed, among other things, at an attempt to “balance” unknown confounders and other uncertainties) [35, 82, 119, 168], not to mention observational designs [193, 194].

Document [190] would have remained the only one with this superfluous criterion had similar constructions not appeared later, been included in a very popular manual, and even received a magnificent name.

In 1991, this set of criteria from IARC-1980 [190] was reproduced in a document of the Committee on Carcinogenicity of Chemicals in Food, Consumer Products and the Environment of the UK Department of Health [195].

In the causality criteria given in the fifth [82] and sixth (posthumous) [196] editions of the popular epidemiology textbook by Leon Gordis, the sixth point, in addition to Hill’s eight statements (excluding, as usual, “Analogy”), is “Consideration of alternate explanations,” again on an equal footing with the rest of the criteria. The point is once more to eliminate the influence of “third” factors [82, 196], although, as mentioned above, this is like a “fifth wheel on the cart” for Hill’s criteria.

In the fifth edition of the manual [82], L. Gordis himself, then still alive, cited the origins: in 1986, a commission of the US Department of Health proposed the named complex for assessing the effect of prenatal measures, the criteria for which were published in L. Gordis et al., 1990 (a chapter in a monograph) [197]. The authors of [82, 196, 197] do not overlap with the authors of IARC-1980 [190], and in the 1980s L. Gordis had not yet published anything of the kind in his relevant works on the subject [198, 199]. It is unlikely that the coincidence of the provisions in [190] and [82, 196, 197] is the result of parallel insights; most likely, “someone once gleaned something from someone,” as was the case with the criteria from R.A. Stallones [111] (see [2]).

However, the inclusion of the “Lack of alternative explanations” point in Hill’s criteria, probably thanks to L. Gordis, can also be found in relatively recent works on causation, for example, those dated 2008 [200] and 2015 [201].39 There is a similar point in some versions of the modified Hill criteria for ecology and ecotoxicology [33, 145, 202–205] (details below).

Point “No Alternative Explanations” as Additional to Hill’s Criteria; “Bristol Criteria”

The point “Lack of alternative explanations,” together with a few more points (below), was included as a set of additional criteria after Hill’s complex and was called the “Bristol criteria” through the efforts, apparently, of Andrew G. Renehan from Manchester, who investigated (with coauthors) the relationship between overweight and cancer incidence [194, 206–208]. These papers cited Lawlor et al., 2004 [209], where the corresponding constructions appeared, although not under that term. Why the Bristol criteria? Because the authors of publication [209] were affiliated primarily with the University of Bristol, United Kingdom; they themselves did not use such a name.

Thus, the “Bristol criteria” of causation, in addition to Hill’s nine criteria [194, 206–208], are the following:

• “Appropriate adjustment for key confounding factors”;

• “Measurement error”;

• “Assessment of residual confounding”;

• “Lack of alternative explanations.”

In the presentation by A.G. Renehan, 2016 [206], three of Hill’s criteria, i.e., “Temporality,” “Strength of association,” and “Specificity,” as well as two of the “Bristol criteria,” the first and the last, are noted as the most important.

The Bristol causation criteria, in fact, are limited to the material cited. Other “Bristol criteria,” related not to causation but to the diagnosis of osteomyelitis [210] (and others; three references) or to surgery [211], can be identified through PubMed. A search of Russian publications identified only the “Bristol stool form scale” (in Russian, the word for “stool” is a homonym also meaning “chair”) coupled with the “Rome criteria,” where the “stool” is again not a piece of furniture; these criteria are likewise unrelated to causation [212] (and others). All such “mimicry” only confuses matters.

Summing up, we can say that these “fifth wheels on the cart,” which are for some reason considered additional to Hill’s criteria, are not criteria at all: they are either necessary methodological elements of any observational study (the first three points) or the very purpose of applying Hill’s criteria themselves (“Lack of alternative explanations”). Perhaps this piling up of points is justified for some special studies [194, 206–209], but it is hardly correct from the standpoint of the philosophical concept of causality.

THE ENVIRONMENTAL CRITERIA OF CAUSATION: POSTULATES OF HENLE–KOCH AND THE CRITERIA OF A.B. HILL AND M. SUSSER (1979–2020)

The collected relevant material is large, clearly of interest for radioecology, and deserves a separate review. But our tasks do not include a complete consideration of the intricacies of evidence and the specifics of the relevant methodologies in the field of biota ecology, as well as in human ecoepidemiology and ecotoxicology. At the same time, we can present the main historical milestones for these disciplines, as well as briefly summarize the results and key points of almost all work related to the evidence-based approach based on causal guidelines/criteria.

Historically, an online publication was developed by USEPA on its website within the framework of the Causal Analysis/Diagnosis Decision Information System (CADDIS); it lists the authors and stages in the formation of causal approaches not only for epidemiology and ecology but also in general scientific and philosophical terms [170]. According to a reference in the monograph [33] (which briefly reproduces this historical outline), the author of this material [170] is Glenn W. Suter II of USEPA. The second important source of this kind is the monograph [33] itself.

A summary of the collected data is presented in Table 3; the coverage of sources appears to be exhaustive. Our task was only to reflect the use of causal criteria at different times in these environmental disciplines. We believe that ecologists and radiation ecologists will themselves be able to assess which data and sources are relevant to them, which, in our opinion, justifies the large size of Table 3.

Table 3. Summary of data on the application of causation principles and criteria in Ecology, Ecoepidemiology, and Ecotoxicology

Table 3 shows, as was previously shown for epidemiology [2], the overwhelming contribution of authors from the United States to the development of causal criteria: 69% of the sources. Second place went to Canada: 14% of works (total 83% for North America over more than 40 years). In the historical review [2], we noted the absurdity associated with the fact that almost all known criteria attributed to the British A.B. Hill were, in fact, proposed by authors from the United States. Now again one can see the complete predominance of American developments for causality in ecology, ecoepidemiology, and ecotoxicology.

It also follows from Table 3 that inductive causal criteria, both in their initial form and as modified, with additions reflecting the specifics of the disciplines, totally dominate in the ecological disciplines. It can be seen that researchers have tried to adapt almost all known causal principles, starting from the Henle–Koch postulates, which at first seemed the most adequate owing to the analogy between an infectious agent in the body and a pollutant in the body. Hill’s and Susser’s criteria were used later, the latter on a much larger scale than in the epidemiology for which they were actually developed. From 1979 (the Henle–Koch postulates) [214] to 2020 (Hill’s criteria) [264], there has been a continuous expansion and improvement of the guiding principles of causality as applied to environmental disciplines. Table 3 shows that complexes of criteria and evidence-based principles can be found, as it were, for almost any desired purpose, both quantitatively and qualitatively. The predominance of evidence-based methods built on Hill’s criteria has been observed for such an authoritative organization as USEPA from 1992 [217] to the present day [264], and also for the large-scale WHO international program on chemical toxicants in the environment (the International Programme on Chemical Safety, IPCS) [202, 229–231].

Thus, the criteria of causality appear to be both ubiquitous and, as it were, immortal, and the latter applies even to the Henle–Koch postulates [2]. This is so despite all the criticism of inductive approaches to causality [6, 9] (we plan to consider it in more detail in the second part of this report): despite the critical questions to each item in the spirit of “And what if he was carrying [weapon] cartridges?” from such authorities as K.J. Rothman and S. Greenland [16–22], which have found their way into epidemiology textbooks (see [6]); despite similar constructions by some authors even from USEPA (Cox, 2018 [265]40); and despite the “truly scientific” ‘Popperian Epidemiology’, which denies induction and is based on the hypothetical-deductive method [9].

Our attempt to identify relevant domestic environmental studies dealing with the rules/criteria of causation yielded only one source: a 2015 manual on environmental epidemiology from a university in Saratov [262].

It was not possible to find anything on the topic for radiation ecology (UNSCEAR-2008 [268], etc.).

It will be useful to finish the section with an example of the modification of Hill’s criteria developed for the mentioned international program on chemical toxicants in the environment (IPCS), launched in 1980 under the auspices of the WHO, the International Labour Organization (ILO), and the United Nations Environment Programme (UNEP). In 2001, the WHO/ILO/UNEP IPCS published a framework for assessing the MOA (“Mode of Action”; see Table 3) of carcinogenic agents in laboratory animals, followed by extrapolation of the risk to humans [205, 241]. This methodology was later improved (or simply changed; we cannot say for sure) [93, 202, 204, 205, 239–241, 260].

In the last prescription known to us (WHO/IPCS), Meek et al., 2014a [260], provide the following steps based on “modified Hill’s criteria” for Weight of Evidence applied to hypothetical Modes of Action (MOA):

(1) Concordance of dose-response relationship between key and end events:

(a) dose–response relationships for key events would be compared with one another and with those for the endpoints of concern;

(b) are the key events always observed at doses lower or similar to those associated with toxic outcome?

(2) Temporal association:

(a) key events and adverse outcomes would be evaluated to determine if they occur in expected order.

(3) Consistency of association and Specificity [the meaning of the latter is not similar to Hill’s criterion]:

(a) Is the incidence of the toxic effect consistent with (or less than) that of the key events?

(b) Is the sequence of events reversible if dosing is stopped or the key event is prevented? [The Counterfactual experiment.]

(4) Biological Plausibility [and ‘Coherence’]:

(a) Is the pattern of effects across species/strains/systems consistent with the hypothesized mode of action (MOA)?

(b) Does the hypothesized mode of action (MOA) make sense based on broader knowledge (e.g., biology, established modes of action)?

All stages, which, as we see, include six of Hill’s criteria (“Specificity,” as mentioned, has a different meaning here), involve animal experiments, the results of which are then extrapolated to humans (IPCS Human Relevance Framework) [93, 204, 205, 239–241, 260] according to the following points [240]:

(1) Is the weight of evidence sufficient to establish a MOA in animals?

(2) Can human relevance of the MOA be reasonably excluded on the basis of fundamental, qualitative differences in key events between experimental animals and humans?

(3) Can human relevance of the MOA be reasonably excluded on the basis of quantitative differences in either kinetic or dynamic factors between experimental animals and humans?

(4) Conclusion: statement of confidence, analysis, and implications.

In general, it should be said that the IPCS and USEPA publications come across as verbose, loose, poorly structured, constantly changing in important and minor points (without explanation), and obscured by the many abbreviations for which the authors have an excessive penchant. The mass of publications on the topic (see Table 3) has not led to a unified view or a standardized methodology, although we have tried to give an excerpt above. But the very questions of applying Hill’s criteria to animal experiments, with subsequent transfer of the revealed patterns to humans, seem very important and fundamental, in particular, for radiation-related disciplines.

CONCLUSIONS

As a rule, references are not given in this section; they can be found above.

The topic of Part 1 of Report 4 follows logically from all the previous material in the series [1–9]. Having considered the general models and definitions of causality in philosophy, medicine, and epidemiology [1], as well as each of Hill’s causal criteria separately [3–9], earlier in a historical review [2] we outlined the origins of their appearance in epidemiology, listing the true pioneers, the authors “before Hill.” But even “after Hill,” a number of authors and organizations continued their attempts to improve the methods for assessing the causality of associations and effects in biomedical disciplines. If the review of the materials “before” [2] can indeed be called historical, the materials “after” have an entirely modern practical meaning, and the constructions and criteria were at times proposed by our contemporaries and remain significant for the present moment.

An analysis of the vast literature of the past 55 years on interpretations, assessments, and other aspects of causal criteria (as it was put in [107], a veritably “Talmudic literature”) revealed frequent references to dozens of authors other than A.B. Hill. However, almost all of them turned out to be not creators or improvers/modifiers of the guidelines, but simply users or, let us say, idiosyncratic eliminators of certain items without explanation. This applies even to B. MacMahon (with co-authors), “the founder of manuals on modern epidemiology” of chronic diseases.

And it turned out that there were very few “creator” authors after A.B. Hill: M. Susser, A.S. Evans, D.L. Weed (quite known), P. Cole (little known), and some other researchers, whose narrow and dubious proposals were considered, in all likelihood, only by us. A separate large array includes the causality criteria in biota ecology, as well as in human ecoepidemiology and ecotoxicology, which have been developed since 1979 by introducing and modifying the guidelines from epidemiology.

Despite his wide popularity and his authority in medical and social epidemiology, M. Susser in fact only introduced, as mandatory, three of the most self-evident criteria: “Association” (or “Probability” of causality), “Temporal order,” and “Direction of effect.” Two more of Susser’s criteria were associated with attempts to bring Popper’s constructions into epidemiology: “Survivability of a hypothesis” when it is tested by different methods (it was included as a refinement of Hill’s earlier criterion “Consistency of association”) and “Predictive performance,” the ability of a hypothesis derived from the analyzed association to predict an unknown fact. In our opinion, these additional principles are of a theoretical nature and are hardly applicable in practice, since the first does not provide a clear framework or limits (although it is useful), and the second is directed somewhere into the future. Meanwhile, the determination of epidemiological causation usually serves the purposes of public health and safety and should provide an exhaustive answer within a given time period, not be “confirmed” sometime in the future. Nevertheless, some of Susser’s points were later included in the criteria for the causality of effects in ecology.

The universal postulates of Evans for infectious and noninfectious pathologies, published by this author in various versions, mainly comprise ten items and can be called exhaustive; however, they have not taken root in epidemiology or in any other discipline, except perhaps the sphere of infectious pathologies, probably owing to their complexity and difficulty of use [133].

A well-known specialist at the intersection of biomedical disciplines, law, commerce, and politics, D.L. Weed, who was fond in the 1980s of “Popperian Epidemiology,” proposed at that time two relevant criteria: “Predictability” and “Testability” of the causal hypothesis derived from the association of interest. Time has shown that these criteria are not viable, and the same counterarguments can be attributed to them as to the similar guiding principle of Susser.

In our opinion, the little-known criteria of P. Cole (1997) [142], developed by him for judicial practice, are the most important for the practice of epidemiology and radiation epidemiology and for expert councils establishing the causality of occupational pathologies. The three-level set of approaches based on various Hill criteria is important in that it proceeds from a single epidemiological study, through a body of such studies (together with the integration of data from other biomedical disciplines), to the most important thing: methods, again based on Hill’s criteria, for assessing the individual causality of an effect. These constructions complement the earlier guidance of R.E. Gots, 1986 [169], on establishing probabilistic causality for an individual personally in medical and forensic practice.

Finally, a very voluminous array of material presented causal criteria and summaries of causality guidelines for the environmental disciplines. A brief analysis of a summary of sources (1979–2020) that we consider exhaustive in its completeness revealed the total dominance of inductive causal criteria in environmental disciplines, both in their initial form and in modifications with additions reflecting the specifics of those disciplines. Adaptations of almost all known causal schemes were found, ranging from the Henle–Koch postulates to the criteria of A.B. Hill and M. Susser in various modifications, including in international programs and in USEPA practice. Moreover, it turned out that Hill’s criteria are used within the framework of the chemical safety program of the WHO and other organizations to assess causality in animal experiments for subsequent extrapolation of the identified patterns to humans.

Data on the assessment of the causality of effects in ecology, ecoepidemiology, and ecotoxicology, coupled with the use of Hill’s criteria for animal experiments, are of significant relevance not only for radiation ecology, but also for radiobiology.

In the next two parts of this report, we plan to consider the remaining issues: ways to classify and determine the weight of specific criteria, their criticism, breadth of distribution, and other non-criteria-based methods/models for determining the causality of effects in biomedical disciplines.

NOTES

(1) The first author of a key 1960 textbook [53], Brian MacMahon (1923–2007), is very well known in Western epidemiology, and ample information about him is available [48]. Although he is not mentioned in the five Russian textbooks on epidemiology that we have, a 1965 Russian translation of the textbook by MacMahon et al., 1960 [53], exists according to a reference in [54] (it cannot, however, be found online). For the third author of this publication, Johannes Ipsen (1911–1994), at least an obituary could be found [55]. J. Ipsen, a native of Denmark, was the first professor of epidemiology in that country, and he worked in the United States in 1938–1970 [55]. No information can be found on the Internet about the second author, Thomas F. Pugh, who turned out to be the only co-author of B. MacMahon in the subsequent revised manual of 1970 [56]. According to this researcher’s publications retrieved from PubMed (not shown), T.F. Pugh was Chief of Medical Statistics at the Massachusetts Department of Mental Health and Professor of Epidemiology at the Harvard School of Public Health in 1967 and 1969. These details are given because the very widely cited manuals [53, 56] (and other similar ones by B. MacMahon; see below) are regarded as achievements of B. MacMahon specifically, which is conspicuous. Perhaps it was so, but T.F. Pugh, now unknown to anyone, is still listed as the second co-author of two manuals. The latest work by T.F. Pugh in PubMed is dated 1969 (later publications are those of a full namesake, a radiologist).

(2) For some reason, the number of errors in citing such significant works by B. MacMahon (with various co-authors) is large, especially since, in addition to the manuals of 1960 [53] and 1970 [56], there are also similar publications on the topic: MacMahon and Trichopoulos, 1996 [57], and MacMahon, 1997 [58]. We do not have the original of any of the 1960–1997 editions (the necessary data from them have been reconstructed), but errors in references, in quotations, and even in meaning are readily revealed. Such authorities as O.S. Miettinen [52] and D.L. Weed [36] were mistaken about the years of publication, and errors also occur in the title [59] and in the number of pages [60]. Finally, as will be discussed below, the materials on the “geographical” criteria and on the causality criteria presented in different publications are confused quantitatively and qualitatively [61, 62]. Almost all of the references cited come from highly reputable authors.

(3) According to [67], J. Parkin’s “Epidemiology: or, the remote cause of epidemic…,” 1873, and V.C. Vaughan’s Epidemiology and Public Health…, 1922, were the very first manuals. The Principles of Epidemiology and the Process of Infection, 1931, by C.O. Stallybrass [71] is an important manual, because this researcher also considered noninfectious pathologies, pointing to a “very marked increase in the incidence of cancer,” associated with the growth of industry [68]. It should be noted that the manual by C.O. Stallybrass, 1931 [71], was translated in the Soviet Union in 1936 [72]. The first Russian manual on epidemiology was published in the late 1920s: D.K. Zabolotny, 1927 [73]. Then G.F. Vogralik’s “Teaching about Epidemic Diseases,” 1935, and V.A. Bashenin’s Course of General Epidemiology, 1936, were published (see [54, 70]). As for radiation epidemiology, we consider S. Wing, 1994 [74], to be the earliest known online manual. Further, according to our data, there are corresponding chapters in the voluminous, periodically reissued manual Cancer Epidemiology and Prevention (Boice, J.D., Jr., 2006 [75], and Berrington de Gonzalez, A., et al., 2018 [76]), as well as a chapter in a manual on general epidemiology (Zeeb, H., et al., 2014 [77]). Other materials of this kind are not educational; most of them are documents of international and internationally respected organizations (UNSCEAR, ICRP, IARC, BEIR, and COMARE). Fragments of radiation epidemiology in Russian can be found in the first and third volumes of the four-volume Radiation Medicine (2004 and 2002, respectively), in manuals on radiobiology (1977–2012), and in those on radiation hygiene (2010–2017). References to these well-known sources are not given.

(4) The bibliographic site of the University of North Carolina provides a summary of data on manuals (textbooks) on epidemiology for 1970–2000 [78], including editions of MacMahon et al. that are not always well known. Given the errors in references made by a number of authors, as noted above (see note 2), the data for sources [53, 56–58] were specially verified.

(5) According to [80], MacMahon et al., 1960 [53] proposed criteria for improving the ecological (correlation) design of epidemiological studies, which in fact is suitable only for hypothesis generation [65, 81–83] (and others; see also [5, 8, 9]). The following reasoning of 1960–1967 [53, 80] is probably of timeless value and apparently doomed to constant repetition, since time and again, century after century, simple territorial correlations are taken as evidence of causation, such as the correlation between COVID-19 severity and vitamin D levels across countries (see the media). One may also recall studies of lung cancer frequency as a function of regional radon levels (see [75–77]). Thus: “Striking geographic differences in the prevalence of disease could be produced, however, because of the geographic distribution of individuals who have the characteristics that are the true causal factors… Age, religion, race, occupation and social class are other important characteristics of the individual which also vary by geography. To deal with these complicating factors, MacMahon, Pugh and Ipsen, 1960 recommend five criteria which strengthen the conclusions that place is related to disease: (1) High frequency rates must be observed in all ethnic groups inhabiting the affected area. (2) High frequency rates will not be observed in persons of similar ethnic groups inhabiting other areas. Analysis by age, sex, social class, and personal and occupational characteristics should also show that the affected area has a higher rate for each of these sub-classes. (3) Healthy persons entering the area will become ill with a frequency similar to that of the indigenous inhabitants. (4) Inhabitants who have left the area will not show high rates, and if they were affected at the time of emigration, they will show signs of improvement or recovery. (5) Species other than man inhabiting the same area will show similar manifestations” [80]. It is doubtful that these criteria of 1960 were observed in assessing the consequences of the accident at the Chernobyl nuclear power plant. (The cited material from [80] was difficult to find, and it therefore seemed appropriate to quote it in full.)
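To make the point about confounded territorial comparisons concrete, here is a minimal simulation sketch in Python (with purely hypothetical numbers, not taken from [53, 80]): individual disease risk depends only on a region-level confounder (say, degree of industrialization), exposure prevalence tracks the same confounder, and the region-level (“ecological”) correlation between exposure and disease then looks strongly “causal,” whereas the within-region individual-level risk ratio remains close to 1.

import numpy as np

rng = np.random.default_rng(0)
n_regions, n_per_region = 20, 5000
regions = []
for r in range(n_regions):
    confounder = r / (n_regions - 1)                      # region-level confounder, 0..1
    exposure = rng.random(n_per_region) < confounder      # exposure prevalence tracks the confounder
    disease = rng.random(n_per_region) < 0.01 + 0.05 * confounder   # risk depends ONLY on the confounder
    regions.append((exposure, disease))

exposure_rate = np.array([e.mean() for e, d in regions])
disease_rate = np.array([d.mean() for e, d in regions])
print("region-level correlation:", np.corrcoef(exposure_rate, disease_rate)[0, 1])   # close to 1

risk_ratios = [d[e].mean() / max(d[~e].mean(), 1e-9)
               for e, d in regions if e.any() and (~e).any()]
print("median within-region risk ratio:", np.median(risk_ratios))                    # close to 1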

(6) Paper versions of MacMahon et al., 1960–1996 [53, 56, 57], can be found on the Internet (the 1997 edition [58] was not found). An electronic version is available only as a Google Books entry for the 1970 edition, which does not offer free page-by-page access and returns only three hits per search out of all possible ones. However, the monograph by Michaelson and Lin, 1987 [84] (also a Google Book) contained an almost complete citation of three (and, as indicated there, precisely three) causality criteria from MacMahon and Pugh, 1970 [56], albeit in a truncated form. The material from [84] made it possible to outwit the Google Books search robot for MacMahon and Pugh, 1970 [56], which eventually yielded all the necessary fragments.

(7) A native of South Africa, Susser fought against Germany for five years in the South African army (infantry, artillery, and aviation [50]); after the Second World War he received a medical education and began his medical career in “black” areas, where he saw the links between health status and social injustice [96–98]. This laid the foundations of his “social epidemiology.” Speaking out against apartheid, he was an ally of the ANC leaders headed by N. Mandela [50, 96–98] (there is a 2001 letter from the 83-year-old N. Mandela to the 80-year-old Susser [98]). In 1955, because of the repressions, he emigrated to the United Kingdom together with his wife (also an epidemiologist and his co-author, Zena Stein), where he conducted research at the Faculty of Social and Preventive Medicine of the University of Manchester [97, 98]. Formally, Susser had no epidemiological education [50] (a “self-taught epidemiologist” [98]), but in 1965, having moved to the United States, he became head of the Department of Epidemiology at the School of Public Health of Columbia University in New York, where he remained until 1978 [50, 96–98]. In 1977, he founded the Gertrude H. Sergievsky Center at Columbia University for the study of the development of neurological disorders [96, 97] (psychiatric epidemiology is the field of his son, Ezra Susser [97]). M. Susser was the director of this Center until 1991 (see the website of this organization). Later, in the 1980s–1990s, M. Susser worked to combat the HIV/AIDS epidemic in the United States and South Africa. In 1992–1998, he was the editor-in-chief of the American Journal of Public Health [96–98]. The main publications of M. Susser on the problems of causality in epidemiology appeared in 1973–1991; the causality dictionary for healthcare was published in 2001 [85–88, 99–102].

(8) A.M. Lilienfeld (member of the US Academy of Sciences, head of the Department of Epidemiology at Johns Hopkins University from 1970) is one of the authors of methodologies for assessing causality in the epidemiology of chronic diseases [2]. Unlike Susser, Lilienfeld (like MacMahon) had a master’s degree in epidemiology [50]. In [2, 69], we presented data on this author’s consideration, in 1957–1959, of as many as six criteria, among them “Strength of association,” “Consistency of association,” “Specificity,” “Biological plausibility,” and “Experiment.” All of this was “before Hill” (see the footnote above). Like Susser, Lilienfeld was a participant in World War II (US Army) [50]. It can be noted that the creators of the “epidemiology of the first generation” [61], Major Greenwood (1880–1949; medical service, then the British Army Ammunition Corps) [103] and A.B. Hill (a UK pilot; see [2]), were also war veterans, but of the First World War. The Russian founder of the doctrine of causality in medicine, Ippolit Vasilyevich Davydovskii (1887–1968; his monograph on the subject dates to 1962 [104]), was a doctor in a rifle regiment during World War I and the chief pathologist of evacuation hospitals, with trips to the front, during World War II [105]. Another founder of the Causality Criteria Complex, Alfred Spring Evans (United States; see below), had ties to the army through his consulting work both at the end of World War II and during the Korean War; thus, in 1983 he was elected President of the US Armed Forces Consultants Society [106].

(9) “…the epidemiologist’s responsibility to convert epidemiologic data into action” [108].

(10) The name of Susser is barely mentioned in the Russian epidemiological literature. Among all the epidemiology manuals, this author is named only in [65], in connection with the subject of “environmental epidemiology.” It is stated there that the term “Eco-Epidemiology” (or “Ecoepidemiology”) was proposed by M. Susser together with his son, E. Susser (see note 8), in 1996 [65]. This is not so; the term was used much earlier: a PubMed search finds the first work with this term as early as 1984. In addition, the surname “Susser” is transliterated into Russian in [65] as “Schusser.” This is hardly correct: the Google Translate voice renders “Susser” approximately as “Sasse(r)” from English and Afrikaans and as “Susse(r)” from Dutch. There is also a single Russian-language mention of M. Susser in a quotation from the translation of the epidemiological dictionary by J. Last (see [1]). No information on Susser’s 1973 monograph on causality in medicine and epidemiology [85] was found in the Russian-language literature.

(11) “In addressing the evidence on smoking, the report listed and described (if not very adequately and without citing the literature) five criteria for judging causality in a given association…. This codification gave rise to two independent elaborations, one by Hill (ten) and the other by myself (85)” [87].

(12) “In ignorance of Hill’s paper, I developed my own discussion of causality in order to meet the burgeoning tasks set by the multivariate age of epidemiology then emerging” [87].

(13) Once again [1, 5, 6], it seems appropriate to cite the ingenious construction from the work of Greenland and Robins, 1986 [112] (the “bivariate counterfactual” example (set), as quoted in [107]), according to which an association is not required for causality. The authors of [112] assumed that half of the individuals in a population are sensitive to some exposure and can die from it, while the other half can die precisely because of the absence of such exposure (imagine a population in which, say, half are heavy drug addicts who will die without a dose of a potion that is lethal to an ordinary person). If exposure is randomly distributed over the population, then the expected mean causal effect will be zero: there will be no association between exposure and mortality in an infinitely large group. Yet the observed outcome for each individual will be causally determined by the fact of exposure or nonexposure [107, 112].
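A minimal numerical sketch of this construction (with hypothetical numbers, not the original figures of [112]): half the population dies if and only if it is exposed, the other half dies if and only if it is not exposed; with exposure assigned at random, the death risks in the exposed and unexposed groups are equal (risk ratio near 1, i.e., no association), although every single death is causally determined by exposure status.

import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000
type_a = np.zeros(n, dtype=bool)
type_a[: n // 2] = True                 # type A: dies only if exposed
                                        # type B (the rest): dies only if unexposed
exposed = rng.random(n) < 0.5           # exposure distributed at random
dies = np.where(type_a, exposed, ~exposed)

print("risk among exposed:  ", dies[exposed].mean())    # ~0.5
print("risk among unexposed:", dies[~exposed].mean())   # ~0.5 -> no association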

(14) “The probability that an association exists is the first criterion commonly deployed in causal inference in epidemiology” [101].

(15) “Susser’s set of causal criteria prioritizes three elements as sine qua non—association, time order, and direction—and then follows with five additional elements” [108].

(16) “Susser elevated three criteria to the status of absolute requirements: association, time order, and direction” [107].

(17) In [5], we gave an example from the textbook [116]. Suppose a hypothetical cross-sectional study shows that, on average, men have an increased incidence of high blood pressure compared to women. From this we can tentatively conclude that gender affects blood pressure, since the reverse assumption (that blood pressure determines gender) is implausible.

(18) “Predictive performance is defined deductively by the ability of a causal hypothesis drawn from an observed association to predict an unknown fact that is consequent on the initial association” [87].

(19) “Hill’s synthesis has remained the major reference for causal inference in epidemiology. His paper was not replaced by later attempts to enrich it (Susser, 1977) or supplanted by attempts to restrict causal inferences to observations obtained deductively rather than inductively (Buck, 1975; Rothman, 1988)” [119].

(20) Susser’s historical analysis argues against ossified causal criteria (“epidemiologists have modified their causal concepts as the nature of their tasks has changed”) [31].

(21) “In this area, Susser has worked essentially alone to lengthen the list of criteria for judging causality, to arrange the criteria into hierarchical categories, to distinguish their roles in affirming and refuting causality, to explore their interrelations, and to begin to quantify their contributions to causal judgments. As his system of causal criteria becomes more elaborate, however, it has raised questions pertaining to Kuhn’s distinction between the function of scientific criteria as values or as rules” [107]. Thomas Samuel Kuhn (T.S. Kuhn; 1922–1996), the well-known philosopher of science [123], wrote: “When scientists must choose between competing theories, two men fully committed to the same list of criteria for choice may nevertheless reach different conclusions” (cited from [107]).

(22) “Susser’s discussion of causal criteria occupies only a brief 22 pages in the original text, but it helped spur a vigorous discussion of the use of such criteria, which has persisted unabated to the present day, including substantial refinements by Susser himself. Susser’s elaboration and expansion of this list over the ensuing years [1977–1991] forms the most detailed and prolonged attempt to develop criteria for causality in the field of epidemiology” [107].

(23) The five strategies of evidence by Susser from his inaccessible monograph were reconstructed from the manual of 1978 [124] (at present, this Google book can no longer be found). Given the rarity of the material, it is appropriate to summarize these data briefly, although there is nothing special about them. The strategies according to Susser [85] are as follows: (1) concerns the design of the study and involves “simplification of conditions of observations”; (2) involves screening causal patterns to eliminate, or at least measure, extraneous variables; (3) consists of the “development of relationships between variables” and includes statistical analysis; (4) uses the principles of probability to identify stronger causal relationships; (5) is judgmental and based on causal criteria (“The final strategy is judgmental and is based on the sequence, strength, consistency, specificity, and coherence of the associations. Susser concludes: ‘The process of causal analysis, central to all science, is most crucial where the subjects of study are least biddable…. Where landmines are everywhere, one should not venture out without a mine detector’”) [85] (cited from [124]).

(24) Evans, according to a publication issued for his 70th birthday [106], was a versatile specialist: “…as infectious disease clinician, field epidemiologist and laboratory investigator, teacher and speaker, writer and editor, consultant and advisor, organizer and leader, medical historian and philosopher.” In 1939, Evans came to the University at Buffalo. After an internship in Pittsburgh and a residency in internal medicine at a New York clinic, he joined the army in 1944; in Japan he was engaged in public health and epidemiology. He then studied at Yale University, and from that time, for about 40 years, he worked on infectious mononucleosis. He “revived” The Yale Journal of Biology and Medicine and became its editor-in-chief. He was an assistant professor of medicine at Yale. During the Korean War, he served in the army as head of a hepatitis laboratory in Munich, where he conducted clinical and laboratory research and also studied the transmission of infections in animal experiments. In 1952, he returned to the United States, becoming professor and chairman of the Faculty of Preventive Medicine and director of the hygiene laboratory. He was one of the first to use a computer program for a public health laboratory. Already an established specialist and leader, he continued his studies in the field of epidemiology and biostatistics. In 1966, Evans became director of the WHO Regional Reference Serum Bank. In 1982, he was appointed professor of epidemiology at Yale University (at the time of his death he was an honorary professor). In 1983, he was elected President of the US Armed Forces Advisory Society (a consultant to the Army, the Navy (nuclear submarines), and NASA). He had been a consultant to the US Surgeon General (the chief public health officer) and President of the American College of Epidemiology and of the American Epidemiological Society. He dealt with the viral etiology of cancer (Epstein–Barr virus, etc.) and paid much attention to the history of the causality of infectious diseases (in particular, the history and modification of the Henle–Koch postulates; see [2]) [106, 126]. “He has been an effective messenger between infectious disease and chronic disease epidemiologists and biostatisticians” [106], and it was on this basis that he developed his universal postulates of causality.

(25) We verified that Evans’s postulates of 1976 [127] are present in the original English-language dictionary by J. Last (the 4th edition of 2001 was available to us).

(26) “Criteria for causation: a unified concept: (1) Prevalence of the disease should be significantly higher in those exposed to the putative cause than in controls not so exposed. (2) Exposure to the putative cause should be present more commonly in those with the disease than in the controls without the disease when all risk factors are held constant. (3) The incidence of the disease should be significantly higher in those exposed to the putative cause than in those not so exposed as shown in prospective studies. (4) Temporally, the disease should follow exposure to the putative agent with a distribution of incubation periods on a bell-shaped curve. (5) A spectrum of host responses should follow exposure to the putative agent along a logical biological gradient from mild to severe. (6) A measurable host response following exposure to the putative cause should regularly appear in those lacking this before exposure (i.e., antibody, cancer cells) or should increase in magnitude if present before exposure; this pattern should not occur in persons not so exposed. (7) Experimental reproduction of the disease should occur in higher incidence in animals or man appropriately exposed to the putative cause than in those not so exposed; this exposure may be deliberate in volunteers, experimentally induced in the laboratory, or demonstrated in a controlled regulation of natural exposure. (8) Elimination or modification of the putative cause or of the vector carrying it should decrease the incidence of the disease (control of polluted water or smoke or removal of the specific agent). (9) Prevention or modification of the host’s response on exposure to the putative cause should decrease or eliminate the disease (immunization, drug to lower cholesterol, specific lymphocyte transfer factor in cancer). (10) The whole thing should make biological and epidemiological sense” [127].

(27) “Bert Black, a lawyer, and David Lilienfeld, an epidemiologist, have proposed another set of guidelines that need to be fulfilled for epidemiological proof in toxic tort litigation (Black and Lilienfeld, 1984). They term the guidelines the “Henle–Koch–Evans” postulates. They represent what I termed, with tongue-in-cheek, a “Unified Concept of Causation” (Evans, 1976)” [129].

(28) “Postulates of causation for occupational disease: (1) The prevalence of the disease should be higher in those exposed to the putative causes in an occupational setting than in those not so exposed either in the same setting or in other similar settings; if possible, this should be shown in matched controls. (2) Exposure to the putative cause should be clearly demonstrated by historical and/or laboratory data to have occurred more often in those with the disease than in those without the disease when all other factors are held constant and be shown more likely than not to have caused the disease. (3) The risk of developing the disease should increase with the duration and intensity of exposure to the putative cause. (4) The incidence of the disease should be higher in those exposed to the putative cause than in those not so exposed as shown in prospective studies. (5) Temporally the disease should follow exposure to the putative cause in that workplace and both exposure and disease should be absent prior to starting work in that workplace. (6) Other causes of the same disease outside the workplace should be excluded or, if present, the attributable risk of each exposure assessed. (7) A biological gradient of response to the putative cause should regularly appear or should increase following exposure to the putative causes as shown by objective evidence. (8) Elimination or modification of the putative cause, or the vehicle carrying it, or protection of the worker against it, should decrease the incidence of the disease. (9) Experimental reproduction of the disease should be demonstrated, if possible, in susceptible animals or humans exposed accidentally or deliberately to the putative cause. (10) The relationship between cause and effect should be shown in several studies, make biological and epidemiological sense, and be consistent with the natural history of the disease” [129].

(29) “Guidelines for relating a putative virus to a human cancer. Epidemiological: (1) The geographic distribution of infection with the virus should be similar to that of the tumor with which it is associated when adjusted for the age of infection and the presence of cofactors known to be important in tumor development. (2) The presence of the viral marker (high antibody titers or antigenemia) should be higher in cases than in matched controls in the same geographic setting, as shown in case-control studies. (3) The viral marker should precede the tumor, and a significantly higher incidence of the tumor should follow in people with the marker than in those without it. (4) Prevention of infection with the virus (vaccination) or control of the host’s response to it (such as delaying the time of infection) should decrease the incidence of the tumor. Virologic: (1) The virus should be able to transform human cells in vitro into malignant ones. (2) The viral genome or DNA should be demonstrated in tumor cells and not in normal cells. (3) The virus should be able to induce the tumor in a susceptible experimental animal and neutralization of the virus prior to injection should prevent development of the tumor” [139].

(30) This probably applies only to chronic diseases, and even then it is too categorical: the “plague factor,” for example, is harmful both to the individual and to the population. How could it be “balanced”?

(31) “Epidemiological data are, therefore, difficult (possibly impossible) to apply in legal cases about individuals. To quote Evans discussing the issue in the United States of America: “Legal requirements are concerned with the risk in the individual, the plaintiff, and whether the preponderance of evidence supports the conclusion that that exposure ‘more likely than not’ resulted in that illness or injury in that person” (1978, p. 194). Evans contests that a higher order of proof and specificity is required in legal proof than in epidemiological proof, concluding that epidemiological evidence is often inapplicable in this context. Epidemiology is a science based on studies of groups and cannot be directly applicable to individuals, and this is an inherent limitation. Equally, a factor demonstrated to cause a disease in an individual, by a science of individuals, say toxicology or pathology, may not be demonstrable as harmful in the population, possibly because harmful effects are balanced by beneficial ones. This is an inherent limitation of the science of individuals. The problem lies not with epidemiology itself, but with those who apply epidemiology in these circumstances. The law also extrapolates from population data to the individual. The standard of proof in epidemiology is not of a lower order than in law, but it is of a different order and for a different purpose. The problem is that so often the best we can offer the individual is average risk derived from the study of groups similar to that individual. That is a limitation of medical sciences collectively. We now consider how epidemiological guidelines for causality help to analyze the causal basis of associations observed at the population level” [131].

(32) In a video from 1996 [153], Philip Cole talks about his prediction of a decline in cancer mortality in the United States due to the decrease in the frequency of smoking in the population. These data, co-authored with B. Rodu, were published in the same year, 1996 [154], and developed (by B. Rodu and P. Cole) in 2001 [155]. There has been a steady decline in cancer deaths in the United States since the 1991 peak, before which there had been an increase. The authors attributed the trend primarily to smoking cessation and then to improved medical care [154]. Another study [155] noted an increase in cancer mortality from 1950 to 1990 with a concomitant decrease in overall mortality; mortality from all forms of cancer excluding lung cancer, however, declined continuously from 1950 to 1998, falling by 25% over that period. The predictive extrapolation of P. Cole and B. Rodu from 1996 [154] was confirmed in 2016, when B. Rodu wrote an indignant letter to the American Cancer Society asking why this organization would “embrace key elements of Dr. Cole’s landmark work, but defy the norm of formally citing the source?” [156]. The importance of this note, although not directly related to the topic of our report, is that there has been a steady decrease in the incidence of cancer (a decrease for men and stabilization for women) and in mortality (both sexes) from cancer, etc., in the United States since 1991 [157], which, apparently, is not widely reported in the Russian media. According to the Kommersant newspaper, the chief oncologist of Russia stated the following in a 2018 interview: “… in 2017 it became even more clear that the incidence of cancer around the world is growing” [158] (data on the decrease in cancer mortality in the United States did, however, appear later in the same newspaper [159]). According to Rosstat, cancer mortality per 100 000 population in Russia is not increasing; rather, there is a downward trend: by 0.8% in 2016 compared to 2015, by 2.4% in 2017 compared to 2016, by 0.1% in 2018 compared to 2017, and by 4% in January–March 2020 compared to 2019; an increase of 1.2% occurred only in 2019 compared to 2018 (https://rosstat.gov.ru/free_doc/2020/demo/edn02-20.htm). Our analysis of official data from the United States [157] (Fig. 7; and other sources) and Russia (Rosstat) showed that the cancer mortality rate in Russia, whose “extraordinary increase” is announced year after year and has prompted a campaign “to fight” it, has only now reached the US value of about 200 per 100 000 population per year, a level that, according to US statistics, is itself the result of a steady decline since 1991. Of course, diagnostics and other factors play a role, but still.

(33) For example, the effect of radon on miners [168].

(34) “It takes very little in the way of ill-founded testimony by a physician to support a jury’s belief, despite the lack of any scientific validity for that belief” [169].

(35) “Can: Can the agent in question produce the disease at issue? (1) Is there substantial and properly relevant animal data? (2) Is there human evidence, particularly epidemiological support? Did: Did it cause it in this case? (1) Have other causes been properly considered and ruled out? (2) Has the exposure been confirmed? (3) Was the exposure sufficient in duration and concentration? (4) Was the clinical pattern appropriate? (5) Is the morphological pattern appropriate? (6) Is the temporal relationship appropriate? (7) Is the latency appropriate?” [169].

(36) Weed graduated from the University of Ohio with a degree in engineering in 1974 and in medicine in 1977. He received his Master of Public Health (M.P.H.) and PhD in epidemiology from the University of North Carolina in 1980 and 1982, respectively. Research: causality in epidemiology and carcinogenesis, methods for evaluating and weighing evidence for effects, ethics of epidemiology and public health. In 1982–2007, he worked at the National Cancer Institute (head of the department of preventive oncology, etc.). He was a co-chairman of the US Academy of Sciences committee on the “Daubert rule” (the Daubert rule in courts; discussed by us in [3, 8]), as well as an expert at the Federal Judicial Center in Washington. He chaired the Committee on Ethics and Standards of Practice of the American College of Epidemiology. He is an adjunct professor at the universities in Utah and New Mexico [171, 172]. He is currently an independent scientific consultant, founder and CEO of DLW Consulting Services, LLC, providing expert advice on the causes of pathology, methods of causation, methods of evidence, epidemiological and clinical research, and ethical standards in epidemiology and public health. The company specializes in providing expert advice and recommendations on issues at the intersection of healthcare, law, commerce, and public policy [172].

(37) We possess, by all indications, an almost complete selection of the original publications on “Popperian Epidemiology.”

(38) The constructions of the inductivists, in particular W.C. Salmon [178], are presented in [3–7]. In their view, no theory can be established with certainty on the basis of a series of supporting examples, but the probability that a theory is true increases, as noted, as the evidence in its favor accumulates. According to this position, researchers “begin with probable hypotheses and find further support for them through positive confirmation, thereby making them increasingly more probable.” Popper, however, pointed out the difficulties of assigning a probability to a scientific hypothesis, which would correspond to a “degree of belief in the hypothesis,” and argued that such “subjective probability is internally inconsistent” (see [3–7]). In our opinion, such objections, if taken seriously, would apply to 99% of the situations of everyday life (that is, it would be almost impossible to do anything, because the reality of one’s actions has not been confirmed “truly scientifically”), and therefore they can hardly be of importance for such a practical science as epidemiology. Even Popper’s own procedure of putting forward hypotheses must nevertheless rest on likelihood and probability (“belief”); otherwise, hypotheses based on, say, astronomy and on astrology would have to be taken as a priori equal and refuted on an equal footing, which clearly makes the practice of scientific research impossible owing to the infinite number of possible hypotheses.
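A minimal sketch (with hypothetical numbers) of the inductivist “increasing probability” idea described above, in the form of repeated Bayesian updating: each confirming observation raises the probability assigned to the hypothesis, although certainty is never reached, and the result still depends on the assumed prior “degree of belief” to which Popper objected.

def update(prior, p_obs_if_true=0.9, p_obs_if_false=0.5):
    """One step of Bayes' rule after a confirming observation."""
    numerator = p_obs_if_true * prior
    return numerator / (numerator + p_obs_if_false * (1.0 - prior))

p = 0.10                          # initial (prior) degree of belief in the hypothesis
for i in range(1, 11):            # ten successive confirming observations
    p = update(p)
    print(f"after observation {i}: P(hypothesis) = {p:.3f}")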

(39) In epidemiology, causality is mostly discussed through the use of certain criteria of causality, originally developed by Hill [10]. Some of these criteria are linked to the design of the studies and can be judged based on quantified information (temporality, the strength of the association, dose-response, and specificity of exposures or outcomes). Other criteria are externally related (consistency with other studies, prediction, relation to health statistics, lack of alternative explanations, and analogy) [200]. “The major causal criteria of temporality, biological plausibility, consistency, and lack of alternative explanations are well supported” [201].

(40) Cox, 2018 [265], like Rothman and Greenland [16–22] (see also [6]), analyzes the criteria one by one and finds, for each, exceptions that are sometimes surprising in their ingenuity. Moreover, L. Cox, relying on J.P. Ioannidis, the extraordinary theorist of epidemiology and evidence-based medicine [266], went even further. As we discussed earlier [3, 8], for epidemiologists in general the “Strength of association” criterion, in the form of a large RR value, indicates a high probability that an association is true, since confounders and biases, as a rule, do not make a decisive contribution in such cases (this was also pointed out by J. Cornfield [3, 8, 191]). But for L. Cox [265] and J.P. Ioannidis [266], a high risk immediately arouses suspicion: “A strong exposure–response association may—and perhaps usually does (Ioannidis 2016)—simply reflect strong sampling or selection biases, strong modeling errors and biases, strong coincidental historical trends, strong confounding, strong model specification errors, or other strong threats to internal validity” [265]. The reasoning is reminiscent of an old joke: “Hello, dear, I have heard so much about you!—You never know what people say—but try to prove it!” Admittedly, R. Doll pointed out back in 1996 that all the strong associations in epidemiology had already been discovered in past decades and that ours is a time of weak associations [267].
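A minimal arithmetic sketch (with hypothetical numbers) of the possibility that Cox and Ioannidis emphasize: if a binary confounder is much more common among the exposed and itself carries most of the disease risk, the crude RR can be large even though the exposure has no effect within confounder strata.

# Hypothetical numbers: disease risk depends only on the confounder, not on exposure.
p_conf_exposed, p_conf_unexposed = 0.8, 0.1    # confounder prevalence by exposure group
risk_with_conf, risk_without_conf = 0.20, 0.02

risk_exposed = p_conf_exposed * risk_with_conf + (1 - p_conf_exposed) * risk_without_conf
risk_unexposed = p_conf_unexposed * risk_with_conf + (1 - p_conf_unexposed) * risk_without_conf

print("crude RR:", round(risk_exposed / risk_unexposed, 2))   # about 4.3
print("RR within each confounder stratum: 1.0")               # no effect once the confounder is fixed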