Introduction

This chapter reviews recent epidemiological data on the relative contributions of genetic and environmental risk factors to the development of rheumatoid arthritis (RA). It considers the direct and indirect evidence for the contribution of various risk factors to disease susceptibility. The quality of the evidence varies and, where appropriate, this is highlighted.

Genetic factors

Descriptive epidemiology of RA

The descriptive epidemiology of RA is suggestive of a genetic effect. The occurrence of RA is relatively constant, with a prevalence of between 0.5 and 1.0% reported from several European [1–8] and North American populations [9, 10]. However, there are some interesting exceptions (Fig. 1).

Figure 1

Prevalence of rheumatoid arthritis in various populations. Data from [1–4, 6–9, 11, 12, 16, 17].

Specifically, native American-Indian populations have the highest recorded occurrence of RA, with a prevalence of 5.3% noted for the Pima Indians [11] and of 6.8% for the Chippewa Indians [12]. By contrast, there are a number of groups with a very low occurrence. Studies in rural African populations, both in South Africa [13] and in Nigeria [14], failed to find any RA cases in studies of 500 and 2000 adults, respectively. Studies in populations from Southeast Asia [15], including China and Japan [16, 17], have similarly shown very low occurrences (0.2–0.3%).
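As a rough guide to how much weight a finding of zero cases can bear, the following sketch computes the exact one-sided upper 95% confidence bound on prevalence when no cases are observed among n subjects. It assumes simple random sampling and complete case ascertainment, neither of which is stated here for the studies cited, and the function name is purely illustrative.

```python
def upper_bound_zero_cases(n, confidence=0.95):
    """Exact one-sided upper confidence bound on prevalence when 0 of n subjects are cases.

    With true prevalence p, the chance of seeing no cases in n independent
    subjects is (1 - p)**n. The bound is the p at which that chance falls
    to 1 - confidence.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


# Sample sizes taken from the text; the sampling assumptions are ours.
for n in (500, 2000):
    print(f"n = {n}: upper 95% bound = {upper_bound_zero_cases(n):.2%}")
```

Under these assumptions, zero cases among 500 adults remains compatible with a prevalence of up to about 0.6%, and zero cases among 2000 adults with a prevalence of up to about 0.15%.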

'Migrant' studies

It is clearly difficult from a review of the descriptive epidemiological data to know whether environmental or genetic effects explain the differences between countries. One approach is to consider the occurrence in populations presumed to be of the same genetic origin but living in different environments. Such a situation arises in populations that have moved from one environment to another.

There are a few studies addressing this with respect to RA. A low occurrence of disease was found in one study of a Caribbean population of African origin living in Manchester, UK, suggesting that the protection in this group was indeed genetically determined [18]. Similarly, the investigation of a Chinese population living in an urban environment in Hong Kong showed the same consistently low frequency [19]. Recent data have shown, however, that Pakistanis living in England had a higher prevalence than those in Pakistan, although not as high as the prevalence in ethnic English populations [20]. In general, the data on the geographical occurrence of RA support the view that genetic factors are important in explaining differences in disease risk.

Familial clustering

The next stage in accruing evidence for genetic risk is to document an increased occurrence of disease in relatives of probands compared with the background population prevalence, the so-called familial recurrence risk. Studies of hospital attendees are subject to bias, as individuals may be more likely to be referred to hospital if they have an affected family member. Furthermore, several studies rely on family history as elicited from the proband, which again is subject to bias.

Few studies have compared the familial recurrence risk in relatives of cases derived from population samples with that in relatives of controls. Those that have been performed show only a modestly increased risk [21, 22]. For example, a study from the Norfolk Arthritis Register in England showed only a twofold increased risk [23]. Such an observation does not negate the role of genetic factors, but underscores that their contribution to disease susceptibility may be modest. This is important as the familial recurrence risk is a key factor in determining the power of genetic linkage studies within affected family pairs. Indeed, in contrast to other autoimmune diseases such as insulin-dependent diabetes and multiple sclerosis, the familial recurrence risk in RA is certainly smaller, thereby making it harder for studies to identify new genetic factors.
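To illustrate how a familial recurrence risk of this kind might be estimated, the sketch below computes a risk ratio for relatives of cases versus relatives of controls, with an approximate 95% confidence interval based on the standard error of the log risk ratio. The counts are hypothetical and chosen only to give a twofold risk; they are not taken from the studies cited.

```python
import math


def risk_ratio_with_ci(affected_case_rel, total_case_rel,
                       affected_ctrl_rel, total_ctrl_rel, z=1.96):
    """Risk ratio for relatives of cases vs relatives of controls.

    Returns (ratio, lower, upper), where the interval is the usual
    large-sample approximation on the log scale.
    """
    risk_cases = affected_case_rel / total_case_rel
    risk_ctrls = affected_ctrl_rel / total_ctrl_rel
    ratio = risk_cases / risk_ctrls
    se_log = math.sqrt(1 / affected_case_rel - 1 / total_case_rel
                       + 1 / affected_ctrl_rel - 1 / total_ctrl_rel)
    lower = math.exp(math.log(ratio) - z * se_log)
    upper = math.exp(math.log(ratio) + z * se_log)
    return ratio, lower, upper


# Hypothetical counts: 20 of 1000 relatives of cases affected,
# 10 of 1000 relatives of controls affected.
ratio, lower, upper = risk_ratio_with_ci(20, 1000, 10, 1000)
print(f"risk ratio = {ratio:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

With counts of this size the interval spans 1, which shows why a modest familial risk is hard to establish without large population-based samples.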

Twin studies

A variant of studies of familial recurrence risk is the comparison of disease risk in the initially unaffected co-twin of monozygotic probands with that in the co-twin of dizygotic probands. The assumption is that the environmental sharing between these different twin pair types is the same and thus any increased disease concordance in the monozygotic twins confirms the genetic effect. However, there may be a greater environmental concordance in monozygotic twins due, for example, to psychological and other factors. It is also important in such studies to ensure that only like-sexed dizygotic twin pairs are compared with the monozygotic twin pairs.

Twin studies have consistently shown a fourfold increased concordance in monozygotic twins compared with dizygotic twins [24]. This increased risk, however, is of little value in attempting to quantify the genetic contribution to disease risk. The concordance between twins is dependent on the prevalence of disease. As the population prevalence approaches 100%, the concordance will increase accordingly, independent of the true genetic effect. The appropriate way of quantifying genetic risk is to assess the heritability, based on a series of assumptions about environmental sharing and genetic sharing between twin types. Such a study has recently been attempted using data from both Finnish and English twins [24]. The results suggest that approximately 50–60% of the occurrence of disease in the twins is explained by shared genetic effects.
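The dependence of concordance on prevalence can be made concrete with a small sketch. It computes pairwise and probandwise concordance from counts of twin pairs, and the pairwise concordance that would be expected if the two twins' disease status were entirely independent, that is, with no shared genetic or environmental effect. The pair counts are hypothetical, the probandwise formula assumes complete ascertainment of affected twins, and none of this reproduces the heritability analysis cited.

```python
def pairwise_concordance(concordant_pairs, discordant_pairs):
    """Proportion of pairs with at least one affected twin in which both are affected."""
    return concordant_pairs / (concordant_pairs + discordant_pairs)


def probandwise_concordance(concordant_pairs, discordant_pairs):
    """Proportion of affected twins whose co-twin is also affected (complete ascertainment)."""
    return 2 * concordant_pairs / (2 * concordant_pairs + discordant_pairs)


def expected_pairwise_if_independent(prevalence):
    """Pairwise concordance expected by chance alone at a given prevalence (0 < prevalence <= 1)."""
    both_affected = prevalence ** 2
    one_affected = 2 * prevalence * (1 - prevalence)
    return both_affected / (both_affected + one_affected)


# Hypothetical twin sample: 10 concordant and 80 discordant pairs.
print(f"pairwise    = {pairwise_concordance(10, 80):.3f}")
print(f"probandwise = {probandwise_concordance(10, 80):.3f}")

# Chance concordance rises with prevalence even with no genetic effect at all.
for prevalence in (0.01, 0.10, 0.50, 0.90):
    chance = expected_pairwise_if_independent(prevalence)
    print(f"prevalence {prevalence:.2f}: chance pairwise concordance = {chance:.3f}")
```

Because chance concordance alone climbs towards 1 as prevalence rises, a raw concordance figure, or a ratio of two such figures, cannot be read directly as a measure of genetic contribution.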

Genetic susceptibility factors: human leukocyte antigen

The role of HLA DRB1 alleles as a risk factor for RA has been known for 25 years. Associations between different HLA DRB1 alleles and RA have been demonstrated in several populations across the world [25–31]. Indeed, there have been few populations where associations have not been demonstrated.

Interestingly, there do appear to be differences in the strength of association between different alleles. For example, HLA DRB1*0404 is a much stronger susceptibility factor than HLA DRB1*0101 [32] (Table 1). Although it has been suggested that the susceptibility alleles all share a single epitope [33], it is difficult to explain the variable risk under this model. Furthermore, the risk of disease is related not only to the presence of one single allele, but also to the full HLA DRB1 genotype [34]. Individuals who carry the so-called 'compound heterozygote' genotype HLA DRB1*0401/*0404 thus have a substantially greater risk than, for example, individuals who carry a single HLA DRB1*0101 allele.

Table 1 Phenotype frequencies of HLA DRB1

There is some suggestion that the relationship between human leukocyte antigen (HLA) and RA may be more related to the severity of disease, and that the development of arthritis per se is only weakly related. Support for this comes from the Norfolk Arthritis Register, a population-based study of inflammatory joint disease in England [32]. Data from this study show only a weak relationship between susceptibility to inflammatory joint disease as a whole and the HLA DRB1 genotype (Table 1). By contrast, the association is fairly strong in those individuals who satisfy the criteria for RA. The data also show an influence of genotype, with some genotypes having a stronger association than others.

The HLA region on the short arm of chromosome 6 is a gene-rich area including several candidate genes that have an influence on the immune process. One of the most highly investigated is tumour necrosis factor (TNF). Studies have shown associations between TNF alleles and RA [35, 36], although one explanation may be linkage disequilibrium with HLA DRB1. Studies have also suggested, however, that the associations between HLA and TNF-c1 and TNF-b3 are independent of associations between HLA and the shared epitope [37]. Other studies have shown an extended haplotype stretching from HLA through to TNF that has been implicated in disease [38].

Genetic susceptibility factors: non-MHC genes

Data from twin studies and from HLA association and sharing studies have been used to estimate that only 50% of the genetic contribution to RA can be explained by HLA [24]. This has sparked a search for non-MHC genes.

The largest effort has been expended on whole genome screens of affected sibling pair families. Four such screens have now been undertaken, in Europe [39], the United States [33], Japan [40] and the United Kingdom [41]. A number of markers suggestive of linkage with RA emerge from these studies, although the linkage with HLA is by far the strongest. One problem is that such studies often have only weak power to detect effects. Conversely, because such studies may be simultaneously testing the possibility that any one of 200 regions may be linked with disease, the likelihood of a false-positive result is also very high. It is therefore not surprising that studies often fail to replicate results, both between themselves and on further samples within the same population. Further validation studies and more in-depth investigations, using more closely spaced markers, are therefore necessary.
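The scale of the false-positive problem can be sketched with a simple calculation. Assuming, for illustration only, that the 200 regions are tested independently at a conventional nominal significance level of 0.05 (real markers are correlated through linkage, so this is an approximation), the probability of at least one false-positive result and the corresponding Bonferroni-corrected threshold are:

```python
def prob_any_false_positive(alpha, n_tests):
    """Chance of at least one false positive among n independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test level that keeps the overall false-positive rate at or below alpha."""
    return alpha / n_tests


n_regions = 200        # figure quoted in the text
nominal_alpha = 0.05   # assumed conventional threshold, not stated in the text

print(f"P(at least one false positive) = "
      f"{prob_any_false_positive(nominal_alpha, n_regions):.5f}")
print(f"Bonferroni per-region threshold = "
      f"{bonferroni_threshold(nominal_alpha, n_regions):.5f}")
```

Under these assumptions a false-positive signal somewhere in the genome is almost certain at the nominal level, which is consistent with the poor replication between screens.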

An alternative approach is to use a candidate gene screen where there is a prior reason for looking at a particular region. Such an approach has been productive, and evidence has shown that corticotrophin releasing hormone [42], CYP19 (oestrogen synthase) [43], IFN-γ [44–46] and other cytokines [47–49] are linked to RA. Other approaches have addressed the possibility that genetic regions linked to other autoimmune diseases, such as insulin-dependent diabetes [50], may also be linked to RA. Indeed, linkage to a locus on chromosome X was shown in one study [51]. A further strategy has been to use the results of genome screens in animal models of arthritis to see whether syntenic regions are also linked to RA in humans. Such studies have suggested linkage to 17q22 [52].

Whether any of the positive findings discussed will result in the identification of a true disease susceptibility mutation remains to be seen. However, one clear problem is that RA itself is probably heterogeneous, and studies that fail to take account of this heterogeneity may find it difficult to obtain a positive result.

Environmental factors

The term 'environment' is frequently used to describe all those susceptibility factors leading to disease that are not explicable on the basis of an identifiable genetic marker. In a strict sense, however, environment could be taken to refer to those factors external to the individual; for example, factors associated with diet, water or air-borne exposures. It is also important to consider factors implicated in disease that are internal to the subject but without an obvious genetic basis. An appropriate term for this group of factors is 'nongenetic host factors'.

Nongenetic host factors: hormonal and pregnancy factors

The increased risk of RA in females has led to considerable effort in examining the role of hormonal and pregnancy factors in disease occurrence. In general, levels of male sex hormones, particularly testosterone, are lower in men who have RA [25]. By contrast, levels of female sex hormones are not different between RA cases and controls [53].

Interestingly, exogenous hormonal influences are implicated in disease risk. The most widely studied of these is exposure to the oral contraceptive pill, based on an observation made over 20 years ago [54] (Fig. 2). Several studies [55] have confirmed that women who take the oral contraceptive pill are at reduced risk of developing RA [25]. There is no clear explanation for this, and the association exists even though the formulation of the oral contraceptive pill has varied enormously both between populations and over time. A follow-up was undertaken of the original study that had suggested the oral contraceptive pill was protective; this showed that the initial protection was lost on follow-up [56]. One conclusion might therefore be that oral contraceptive use postpones, rather than totally protects against, the development of RA.

Figure 2

The incidence of rheumatoid arthritis in relation to use of the oral contraceptive pill (OC). Data from the Royal College of General Practitioners' oral contraception study [54].

Pregnancy itself has been investigated as a risk factor in RA development. Studies on the influence of pregnancy on RA have produced conflicting results. A number of studies [25] have suggested that women who are nulliparous are at increased risk of developing the disease, although there is no increased risk in women who are single [57]. It would thus appear that subfertility highlights a group at higher risk.

Recent studies have suggested that pregnancy might also be important, with the interesting observation that the postpartum period, particularly after the first pregnancy, represents a period of high risk for disease development [58, 59] (Fig. 3). Subsequent investigations showed that much of this increased risk could be explained by exposure to breastfeeding, and it is women who breastfeed after their first pregnancy who are at the greatest risk [60]. The suggestion then arose linking breastfeeding with subfertility, in so far as RA may be related to either increased prolactin or an abnormal response to prolactin, this hormone being proinflammatory [61].

Figure 3

Increased risk of rheumatoid arthritis onset in the postpartum period. * Relative to nonpregnant periods. Data from [59].

Nongenetic host factors: other

There have been a number of studies looking at other comorbidities that have an increased frequency both in subjects with RA and in their families. The most widely investigated has been the occurrence of other autoimmune diseases, particularly type 1 or insulin-dependent diabetes and autoimmune thyroid disease [62]. Other diseases, for example schizophrenia, have been shown to be negatively associated with RA development [63–65]. The significance of these findings is unclear.

There have been relatively few studies on anthropometric factors associated with RA, although one recent case–control study suggested that people who were obese were at higher risk [66]. The reason for this is unclear, and it is not certain whether this represents confounding by another exposure or whether people who are obese have, for example, increased production of oestrogens, which might pose a risk. A more recent case–control study found, however, that after adjusting for age, smoking and marital status the link with obesity was nonsignificant [67].

Environmental factors: infection

Indirect evidence

There is much indirect evidence suggesting that exposure to infectious agents may be the trigger for RA. First, there is the observation of a decline in the incidence of RA in several populations [9, 16]. Many studies have indeed shown a halving in incidence over the past 30 years [68]. Given that these populations are genetically stable, the most probable explanation is a decline in an infectious trigger. This effect of time on occurrence might also be related to the period of birth as well as to the current year of observation. The Pima Indians, for example, showed a decline in occurrence of disease, and an in-depth study based on analysis of birth cohorts has shown a decline in the population occurrence of rheumatoid factor with increasingly recent birth cohorts [69].

There have been few studies looking at clustering of RA in time and space, although there have been reports of nonrandom clusters occurring within the Norfolk Arthritis Register population [70]. Other indirect evidence regarding the role of an infectious agent has arisen from case–control studies suggesting that people who have had a blood transfusion, even some years prior to disease onset, may be at an increased risk of disease [66]. Recent practice has been to screen blood for a number of agents such as hepatitis viruses, but the increased reporting of blood transfusion in older cohorts may indeed be explained by the increased likelihood of infection.

Possible infectious agents

A large number of infectious agents have been implicated in RA, including Epstein–Barr virus and parvovirus, as well as bacteria such as Proteus and Mycoplasma. The epidemiological studies supporting or refuting these possible links are reviewed elsewhere [25] but, in general, such studies have been disappointing. One problem for the epidemiologist is that if RA represents the final common pathway of exposure to one of several different potential susceptibility organisms, many of which are also frequently observed in the general (i.e. nonarthritic) population, it is more difficult to confirm a relationship with epidemiological studies.

Noninfectious environmental factors

There have been remarkably few studies on factors such as diet, although there is a theoretical basis for investigating the role of omega-3 fatty acids [71, 72]. Randomised trials suggest that diets high in eicosapentaenoic acid have a favourable effect on the outcome of RA [73–75]. This might be because such fatty acids compete with arachidonic acid, which is involved in inflammation. Whether such dietary factors have a role in RA onset is much less clear.

It is perhaps surprising, given how much this exposure has been investigated in other chronic diseases, that very little attention has been given to cigarette smoking until recently. However, findings from a number of recent studies showed that cigarette smoking is associated with an increased risk of RA [66, 67, 76–79] (Table 2). Studies have also suggested that smoking is related to development of rheumatoid factor independent of RA. Indeed, in many of the epidemiological studies showing a relationship between smoking and RA, the positive findings have been restricted to those with rheumatoid factor.

Table 2 Summary of recent epidemiological studies showing the association between rheumatoid arthritis and cigarette smoking

Future prospects

There has been considerable recent interest in understanding the epidemiology of RA. There have been several population studies in many different countries around the world, and observations of differential occurrence (with time, between populations and between the genders) have stimulated a number of analytical studies looking for both genetic and environmental risk factors. These studies, as already described, have revealed some tantalising clues that will require further follow-up in years to come. Future studies will benefit from advances in molecular biology techniques to aid with the identification and characterisation of potential new genes for RA susceptibility.

Concluding remarks

RA presents an epidemiological challenge. Further studies are likely to elucidate both genetic and environmental factors, together with the interactions between them.

Organisations supplying funds for research

Funds can be obtained from the Arthritis Research Campaign http://www.arc.org.uk and the Arthritis Foundation http://www.arthritis.org.