Rheumatoid arthritis (RA) affects up to 1% of the world population. It is a disease with a clear gender bias: women are affected 2.5-fold as often as men [1]. RA encompasses a number of disease subtypes such as Felty's syndrome, seronegative RA, 'classical' RA, progressive and/or relapsing RA, and RA with vasculitis. These subtypes have a pronounced variation in clinical symptoms, such as age of onset, disease course and outcome. Owing to this large variability in disease, diagnosis is made by the fulfilment of four criteria out of seven, as defined by the American College of Rheumatology (ACR) [2].

Most of these criteria are clinically defined signs and symptoms that could be the outcome of many different pathogenic pathways [3]; RA can therefore even be considered to be a syndrome composed of several distinct diseases. This makes frequency estimates and disease characteristics rather variable.

One of the first known adequate clinical descriptions of RA appears in an educational textbook written in 1782 by the Icelandic physician Jon Petursson [4]. However, there is evidence from skeletons from a human population living in the Mississippi valley about 5000 years ago that RA might have occurred before modern times [5]. It was not until 1978 that the first genetic association was reported [6], when a linkage to B cell allotypes [later known to be encoded by the MHC human leucocyte antigen (HLA)-D locus] was observed. It is now clear that the genetic predisposition to RA is polygenic and complex, and new techniques have to be employed to identify the underlying genetic factors.

In the 1990s, techniques became widely available that gave us the possibility of locating unknown genetic factors without the constraints of presumptions. In general, to use linkage analysis to study the genetics of a disorder, one ultimately investigates a skewing in the frequencies of alleles between healthy and affected subjects, and correlates that with a disease phenotype. At present a large effort is being made by the scientific community to determine the genetic influence on diseases; many techniques are being employed today.

Genes and environment

An individual develops RA as the result of a combination of genetic factors and factors in the environment. In addition, variation in the environment might affect not only the overall frequency of disease but also its phenotypic appearance. The difficulties in obtaining fully informative pedigrees in such a common disease as RA might reflect a very complex genetic influence, with many contributing genes of low penetrance. A poor definition of disease subtypes, and the influence of environment on the disease, also complicate finding the most important genes associated with the pathogenic events leading to RA [7]. However, the concordance of monozygotic twins (the frequency of monozygotic twin pairs in which both twins are affected) of 12–15% in comparison with the concordance of dizygotic twins of 2–4% [7,8,9] provides evidence of a genetic contribution. In a similar way, there is well established genetic influence on the development of other autoimmune diseases [10]. Presumably, none of the genes involved are either necessary or sufficient for the expression of the diseases, but contribute to the disease liability. The features of RA described (summarized in Table 1) are shared with several of the multifactorial disorders common in the human population, such as cancers, cardiovascular, psychiatric and autoimmune disorders.

Table 1 Characteristics of multifactorial disorders

Relative risk

The possibility of identifying susceptibility genes for a disease is greatly dependent on the degree of genetic contribution to the disease over other influences. A commonly used method for sampling the strength of the genetic factors involved is to estimate the relative risk of a sibling to a proband (λs). Put simply, this is calculated as the risk for a person with an affected sibling divided by the risk in the general population [11]. However, it has proved to be problematic to perform these calculations for RA, and also for most multifactorial diseases, because this estimate depends on the accuracy of the sibling risk and the population prevalence. These frequencies are dependent on the clinical definition of the disease.
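The dependence of λs on its two inputs can be made concrete with a small calculation. The following Python sketch is illustrative only: the function name and the numerical values are assumptions chosen for the example, not estimates taken from the studies cited here.

```python
def sibling_relative_risk(sibling_risk: float, population_prevalence: float) -> float:
    """Return lambda_s: risk to a sibling of a proband divided by population prevalence."""
    if not 0.0 <= sibling_risk <= 1.0:
        raise ValueError("sibling_risk must be a probability between 0 and 1")
    if not 0.0 < population_prevalence <= 1.0:
        raise ValueError("population_prevalence must be in the interval (0, 1]")
    return sibling_risk / population_prevalence


# Hypothetical inputs (assumptions for illustration, not data from this review):
# the same sibling risk combined with two different prevalence estimates.
assumed_sibling_risk = 0.04
for assumed_prevalence in (0.005, 0.01):
    lambda_s = sibling_relative_risk(assumed_sibling_risk, assumed_prevalence)
    print(f"prevalence {assumed_prevalence:.3f} -> lambda_s = {lambda_s:.1f}")
```

With the sibling risk held fixed, halving the assumed prevalence doubles λs, which is one reason why values from studies using different disease definitions are hard to compare.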

As mentioned previously, RA is most probably not one distinct disease but a clinical syndrome including several diseases with heterogeneous etiologies. It is important to take into account the fact that the clinical description of RA has differed historically and geographically. Assessing the true frequency in families might also be difficult because family members are at different phases in their disease, with some in a very active state and some in complete remission. In addition, the differing ages of onset of the disease tend to complicate estimations of sibling risk; assessing the population prevalence has also been difficult because of inconsistencies in the time allowed for satisfying the ACR criteria. A study in the UK, in which patients satisfying only some of the four required ACR criteria were allowed 5 years to fulfil the remainder, showed that a much shorter period increased the risk of underestimating the population prevalence [9]. The λs value therefore varies markedly between different studies. These aspects are not always taken into consideration when λs values are reported and discussed; such values might therefore be of limited use when estimating the level of genetic contribution to the disease or the contribution of separate genetic risk factors.


An alternative measurement of the relative genetic contribution to development of the disease is its heritability. Heritability (in the narrow sense) is the proportion of the variance in the disease liability that is explained by the inherited genetic variance [12]. This estimate is not as dependent as λs on disease prevalence. In a recent study, the heritability of RA was estimated to be about 60%, indicating that genetic factors account for a large proportion of the population's liability to the disease [13]. However, the use of twins in this type of study, as done by MacGregor et al, tends to overestimate heritability owing to the common environment shared between both monozygotic and dizygotic twin pairs [12]; this can be overcome by investigating twins reared apart.
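In the usual notation of quantitative genetics, which is assumed here and is not taken from the cited studies, the narrow-sense heritability described above can be written as a ratio of variances in liability:

```latex
h^2 = \frac{V_A}{V_P}, \qquad V_P = V_A + V_{\mathrm{other}}
```

Here V_A denotes the additive (inherited) genetic variance, V_P the total variance in liability, and V_other is a placeholder symbol that lumps together non-additive genetic and environmental variance. On this scale, an estimate of about 60% corresponds to h² ≈ 0.6.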

Study design

Analytical methods

In efforts to identify genes causing monogenic Mendelian diseases and, more recently, the multifactorial common diseases such as RA, two main analytical strategies are used by investigators: association study with unrelated cases and controls, and linkage analysis using families with multiple affected members. The association study approach has been used extensively for candidate genes (genes suggested to be involved in disease development on the basis of the disease mechanism). However, for most diseases or syndromes like RA it is not an easy task to pinpoint candidate genes because of the complexity of the disease mechanism, and the candidate gene studies that have been performed largely report weak and inconsistent results that are below significance from a genome-wide perspective [8,14,15,16].

An alternative approach has been to perform genome-wide linkage screens, searching for genes involved in the disease development without any a priori assumptions about their chromosomal location or function in the pathogenesis of the disease. In a traditional linkage study, the segregation of the disease phenotype and polymorphic genetic markers are studied in families with several generations of affected individuals, to identify markers that segregate with the disease by using a parametric, or model-based, linkage analysis. Model-based methods require the estimation of the mode of inheritance for the disease, defined by disease allele frequency and penetrance for each genotype [17]. However, because most multifactorial diseases do not segregate in families as typical Mendelian diseases, the use of non-parametric, or model-free, methods [18,19,20] is being preferred in many studies. Most model-free methods estimate the degree of sharing of marker alleles that are identical by descent between affected sib-pairs. Although the model-free methods do not explicitly specify any disease inheritance model, the performance of the analysis is dependent on the underlying assumptions of the test [21,22]. It has been shown that the use of model-free methods is in most cases associated with loss of power compared with model-based methods, in spite of the lack of correct inheritance models [23,24].
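The idea of testing for excess allele sharing between affected sib-pairs can be sketched in a few lines of Python. This is a deliberately simplified version of a sib-pair 'mean test', not the specific methods of the cited references: it assumes that the number of alleles shared identical by descent (0, 1 or 2) is known without ambiguity for every pair, it uses a normal approximation, and the function name and example counts are hypothetical.

```python
import math
from typing import Sequence, Tuple


def sib_pair_mean_test(ibd_counts: Sequence[int]) -> Tuple[float, float, float]:
    """Test for excess IBD sharing at one marker in affected sib-pairs.

    ibd_counts holds, for each pair, the number of alleles shared identical
    by descent (0, 1 or 2). Without linkage these occur with probabilities
    1/4, 1/2 and 1/4, so the shared proportion has mean 1/2 and variance 1/8.
    Returns (mean shared proportion, z statistic, one-sided p-value).
    """
    n_pairs = len(ibd_counts)
    if n_pairs == 0:
        raise ValueError("at least one sib-pair is required")
    if any(count not in (0, 1, 2) for count in ibd_counts):
        raise ValueError("IBD counts must be 0, 1 or 2")
    mean_sharing = sum(ibd_counts) / (2 * n_pairs)
    null_standard_error = math.sqrt(0.125 / n_pairs)
    z = (mean_sharing - 0.5) / null_standard_error
    # One-sided upper-tail probability of the standard normal distribution.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return mean_sharing, z, p_value


# Hypothetical data: 100 affected sib-pairs sharing 0, 1 or 2 alleles IBD.
example_counts = [0] * 18 + [1] * 48 + [2] * 34
sharing, z_stat, p = sib_pair_mean_test(example_counts)
print(f"mean sharing = {sharing:.2f}, z = {z_stat:.2f}, one-sided p = {p:.4f}")
```

No inheritance model is specified, but the test is not assumption-free: it presumes fully informative markers and independent pairs, which illustrates the point that model-free methods still rest on underlying assumptions.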

The use of association studies has been proposed for genome-wide gene mapping of multifactorial diseases [25]. New technology permits the identification and large-scale analysis of the next generation of genetic markers, the single-nucleotide polymorphism (SNP) markers. SNPs have lower heterozygosity than microsatellites and are therefore less informative, but the abundance of SNPs in the genome allows much denser maps [26] to be constructed. How dense the map needs to be for mapping disease genes depends on the extent of linkage disequilibrium surrounding the genes, which in turn depends on the age of the disease alleles, the age of the SNP markers and the rate of expansion of the population. The distribution of linkage disequilibrium most probably has great stochastic variation across the genome. The number of SNPs proposed for scanning the genome has varied from as few as 30,000 [27] through 500,000 [28] to as many as 1,000,000 [29], which might still yield one or only a few SNPs per gene. The debate continues [30,31].
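To give a feel for what the proposed marker numbers mean in terms of map density, the following Python sketch converts each figure into an average distance between neighbouring markers. It rests on two assumptions that are not part of the cited proposals: a round figure of 3.2 × 10^9 base pairs for the haploid human genome, and perfectly even marker spacing. The constant and function names are hypothetical.

```python
# Assumption: round figure for the size of the haploid human genome, in base pairs.
ASSUMED_GENOME_SIZE_BP = 3_200_000_000


def mean_marker_spacing_bp(n_markers: int,
                           genome_size_bp: int = ASSUMED_GENOME_SIZE_BP) -> float:
    """Average distance between neighbouring markers if they were evenly spaced."""
    if n_markers <= 0:
        raise ValueError("n_markers must be positive")
    return genome_size_bp / n_markers


# Marker numbers quoted in the debate on genome-wide association mapping.
for n_markers in (30_000, 500_000, 1_000_000):
    spacing_kb = mean_marker_spacing_bp(n_markers) / 1000
    print(f"{n_markers:>9,} SNPs -> about {spacing_kb:.1f} kb between markers")
```

Under these assumptions the proposals span average spacings from roughly 100 kb down to roughly 3 kb; whether any of these is dense enough depends on how far linkage disequilibrium extends around the disease alleles.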

It should be noted, when discussing the different strategies of association and linkage studies, that association mapping is most powerful when the affected individuals have inherited the same disease allele, identical by descent, from a common ancestor; this will be true if they are distantly related. Consequently, the association analysis will be a linkage analysis of a giant pedigree of unknown structure [22]. In a family-based linkage analysis, the meioses available in the families will be investigated, whereas in an association analysis the number of meioses separating two 'unrelated' individuals will depend on the number of generations since they shared a common ancestor.

One of the great obstacles in the genetic analysis of multifactorial diseases is extended genetic heterogeneity. The locus heterogeneity will reduce the power of both linkage studies and association studies. However, linkage strategies will not be affected by allelic heterogeneity, whereas this is a major determinant of success for the association approach. Recently, investigations of the extent of linkage disequilibrium in the lipoprotein lipase gene [32] and the apolipoprotein E gene [33] showed that in either of these cases the currently known risk factors for cardiovascular disease and Alzheimer's disease, respectively, would have been identified in an association approach with the marker density proposed by the advocates of this approach [25,28,34].


Crucial to the outcome of both association studies and linkage studies, regardless of the statistical methods used, is the clinical definition of the disease. The power of any study design will be severely affected if the diseased individuals are ascertained on the basis of ambiguous phenotypes. Our ability to map disease genes is largely a function of the ability of the phenotype under study to predict the underlying risk genotype [35]. The importance of study design, including a careful ascertainment of the study material and thorough clinical evaluations, is therefore likely to be the key to success when mapping susceptibility genes for multifactorial diseases [22]. The use of ethnically isolated or recently founded human populations, which could be more homogeneous in disease associations, has been suggested [36]. This strategy has been applied for systemic lupus erythematosus in the Icelandic population [37] and in the Finnish population for complex diseases such as hypertension [38] and multiple sclerosis [39].

Because disease development is not due solely to genetic factors, ascertainment of the study material is also important from an environmental point of view. Great heterogeneity in the environment among analysed patients might also complicate the finding of genetic components. The larger the environmental variance is, the more it will hide the genetic effects of the disease, decreasing the power to detect the genetic risk factors [22]. Therefore, controlling the environmental conditions, for example by ascertaining families or patients from the same geographical area or with the same type of life style, might be one way of increasing the possibilities of finding genetic risk factors. In humans, disease-susceptibility genes for certain multifactorial diseases might be so numerous, and their interaction so varied, that there might be a unique profile of disease alleles for each population. As a result, the identification of disease genes in one population might be difficult to replicate in another population, which is often required for the linkage results to be accepted as true.

Animal models

A powerful approach to localizing susceptibility genes for RA is to use inbred animal models. Most of the obstacles discussed above (Table 1) can be overcome to a greater or lesser degree, and the biological role of the genetic control can be addressed experimentally. This approach requires models in which the arthritis is caused by similar pathogenic pathways to those in human RA.

To achieve the goal of defining causative genes, it is likely that the joint use of model studies and patient materials will provide the most effective solution, as proposed in Fig. 1.

Figure 1

Strategy to find genes of importance for rheumatoid arthritis.

Advantages and disadvantages

The advantage in all of the animal models is that the development of disease can be monitored carefully, the genetic content can be controlled and manipulated, and environmental influences can be kept to a minimum. At present there is a growing interest in well-defined animal models, because this branch of RA research generates significant information on linked genomic regions and also provides a tool for the further mapping and eventual identification and study of the underlying genes. However, there are two possible drawbacks to animal models: first, the genetic distance to humans, and second, the possibility that humans might use pathogenic pathways that do not exist in the experimental animals.

Animal models for RA

The first relevant antigen-specific animal model for RA to be established was the collagen-induced arthritis (CIA) model, in which collagen type II (CII) was injected into rats and induced an RA-like disease [40]. Since then, several other proteins have been shown to be able to induce arthritis (in both rats and mice) such as collagen type XI (CXI-induced arthritis, CXIIA) [41,42], proteoglycan (proteoglycan-induced arthritis) [43] and cartilage oligomeric matrix protein (COMP-induced arthritis) [44]. Other studies have shown that it is not necessary in all systems to administer a protein antigen; instead, a non-immunogenic adjuvant is enough to trigger the immune system to allow an inflammatory attack on the joints [45,46,47,48]. These models have been extensively studied to identify the genetic causes (see below), and it is probable that several models will have to be genetically dissected to allow us to understand the pathogenic pathways leading to RA.

The factors investigated

Several genes have been implicated in RA but it has not been easy to pinpoint the direct involvement of any particular candidate. Among the genes investigated are those encoding immunoglobulins, T cell receptor (TCR), cytokines and the MHC. A large fraction of the significant information has been derived from animal models.

B cells

B cells are likely to be important in the pathogenesis of RA. The occurrence of rheumatoid factors is one of the ACR criteria, and antibodies against CII are produced in the joints by a subset of RA-affected individuals [49,50,51]. In the CIA model, anti-CII antibodies are clearly pathogenic and antibody-mediated pathways are important in the process leading to arthritis [52,53,54]. Mice with impaired B cell functions, or lacking B cells, do not develop arthritis [55,56,57]. It is to be expected that many gene regions associated with CIA, and possibly RA, will control B-cell function and pathogenicity. In fact, the identified locus on chromosome 2 in the mouse is most probably caused by a deletion of the complement C5 gene [58].

A crucial role for the Fcγ receptors in triggering autoimmune arthritis has been suggested on the basis of the observation that mice lacking the FcRγ chain are protected from CIA, in contrast to wild-type mice, although both groups produced similar levels of IgG antibodies against collagen [59]. In addition, mice lacking FcγRII developed an augmented IgG anti-collagen response and arthritis [59]. In a TCR-transgenic mouse strain, which spontaneously develops a joint disorder greatly resembling human RA, Matsumoto et al showed that the pathology was driven almost entirely by immunoglobulins [60]. In this particular strain, the target of both the initiating T cells and the pathogenic immunoglobulins was the glycolytic enzyme glucose-6-phosphate isomerase [60].

T cells

A role for T cells in arthritis was suggested by the discovery of activated T cells in the joints; however, their role has since been debated. Circumstantial evidence strongly favours a central role for T cells in RA on the basis of the MHC class II association, the T-cell-dependent somatic mutations found in rheumatoid factors and the T-cell-dependent antibody isotypes found in CII autoantibodies. T cells react to peptides presented by MHC molecules on antigen-presenting cells, and if proper signals are given they become activated and pass through the lymphatic vessels into the bloodstream. Here they perform their effector function after recognition of their antigen. Because the B cell needs help from a CD4-carrying αβ T cell to be able to respond to an antigen, the T cells have long been investigated for their regulatory ability in RA (reviewed in [61]). Clearly, the activation of αβ T cells is crucial in all animal models for RA studied so far.

T cells might have several roles in arthritis, such as activating B cells to produce pathogenic antibodies, activating macrophages and fibroblasts to destroy the target tissue and helping cytotoxic cells to kill tissue cells. It is uncertain which mechanism is the most important in the various animal models or whether there is a combined effect. It is also likely that many genes that are yet to be found will control the activation of T cells because this is a well controlled step in the activation of the immune system and also in the avoidance of reacting to self tissue.

It has been suggested that polymorphism of the TCR genes could be of importance, because there are studies showing a biased TCR gene usage, although this has been debated. In 1995, suggestive evidence for linkage to the TCR Vb12.2 marker was published from a study of 28 RA families with 79 affected individuals [62]. Another study weakly linked TCR Va8 (odds 1.3), but not TCR Vb, to RA in 766 RA patients [63]. In contrast, an investigation in the UK, in which 184 RA families with 404 affected siblings were analysed, significantly ruled out the TCRA and TCRB loci as germline-encoded RA susceptibility loci [64]. In support of this negative finding, linkage analyses in animals have so far not provided evidence of an influence of a TCR gene (Fig. 2). The current data suggest that any influence of certain TCR genes is small compared with that of the MHC class II region. However, irrespective of a possible direct association with TCR loci, it is likely that many of the genetic regions identified will contain genes controlling T cell function, because T cells are important in the pathogenesis of both CIA and RA.

Figure 2

Chromosomal maps showing the location of the identified loci in various rat (top) and mouse (bottom) models for RA. Chromosome positions have been taken from the web sites of either the Mouse Genome Database [119] for mouse markers or and for rat markers. QTL, quantitative trait locus.

The MHC region

An association between the MHC region and RA was shown in 1978 [6]; since then there have been many publications confirming this. It was observed in 1987 that a common amino acid sequence motif in the third hypervariable region of the HLA-DRβ1 protein was shared between the RA-associated HLA-DRB1 alleles 0401, 0404, 0408, 0405 and 0101, and that all of the variant amino acids were located in the peptide-binding pocket of the molecule. This motif encoding the peptide-binding pocket was termed the shared epitope (SE) [65]. It is believed that the SE is responsible for the observed MHC association, although more recent studies have shown associations with DRB1 alleles that do not carry this motif.

Several studies have indicated that classical RA and a more severe and chronic disease course are more strongly associated with SE than cases including more broadly diagnosed RA [66,67,68]. An alternative possibility is that the HLA-DQ locus could be the actual RA susceptibility factor, whereas HLA-DRB1 confers resistance [69]. It is difficult to discriminate between HLA-DR and HLA-DQ because they are in strong linkage disequilibrium, but so far the evidence argues for HLA-DR rather than HLA-DQ. The proposed associated serotypes DQ7 and DQ8 are strongly linked with the RA-associated DR4 allele DRB1*0401, but specifically designed studies do not support an independent role for DQ7 or DQ8 [70,71]. In mice transgenic for human class II molecules it has been possible to show that the development of CIA is associated not only with the SE-containing DRB1*0401/DRA [72,73] and the DRB1*0101/DRA [74] molecules but also with DQ8 [75].

It should be emphasized that a role for other MHC region molecules in the genetic association to RA cannot be excluded; for example, several studies show support for the involvement of the MHC class III region [76,77]. Candidates in this region include peptide transporters, chaperones such as DM, and tumour necrosis factor-α (TNF-α). A polymorphism at position -238 in the promoter region of TNF-α has been implicated in RA; however, this has been shown not to cause deviations in transcription in B cells, T cells or monocytes but instead to be in linkage disequilibrium with an unknown genetic factor [78]. Nevertheless, other polymorphisms might influence the predisposition to certain sub-phenotypes of RA, as exemplified by an overrepresentation of certain polymorphisms in TNF-α in a study of systemic juvenile arthritis in Japan [79]. Excluding candidates conclusively in this region is difficult in humans because genes are in extensive linkage disequilibrium and alleles of different loci are often inherited together in haplotypes. These problems might be overcome by stratification in large cohorts of human families.

It has been possible in the mouse to identify the major gene in the MHC region that is responsible for the association with CIA. This was shown by modifying the class II Aβ gene of the CIA-resistant H-2p haplotype into the CIA-susceptible Aβ allele from the H-2q haplotype, an exchange of only four amino acids, and expressing this in an H-2p mouse. This transgenic mouse was as susceptible to CIA as the normal H-2q-carrying strain [80]. Interestingly, the peptide-binding pocket of the H-2q A molecule mimics that of a human class II molecule containing the SE [72,81,82]. Accordingly, the DR4*0401 and the DR1*0101 molecules bind a collagen peptide that is shifted only slightly from the peptide that the H-2q A molecule binds, and the TCR contact amino acids seem to be shared in mouse and human. Thus, the CIA model not only might use some of the pathways of putative importance in RA but also might mimic some of the molecular interactions.

Although more than 20 years have passed since the first association of the MHC with RA, the genes responsible for this effect, and their role in the disease pathogenesis, are not known. The readiness with which the association of the MHC with disease has been found indicates that this represents a substantial part of the genetic contribution and, perhaps more importantly, that it is less heterogeneous than other contributing genes.


Cytokines

Cytokines are the messenger molecules of the immune system. It is easy to envisage that an abnormal cytokine gene would be able to affect tolerance to self, but although this field has received great attention and many data have been accumulated, the results are largely inconclusive. A vast effort has been put into studying cytokine function in the context of knockout animals, but the results have not always been easy to interpret. It seems that our limited knowledge and our current techniques make it difficult to follow the skewing in the cytokine pattern when one cytokine is eliminated, especially because this often takes place in the microenvironment surrounding a few interacting leucocytes. In studying RA, some investigators have attacked the problem by using a candidate gene approach, and some cytokine genes have been suggested, although the results are below significance [83]. The most conclusive data come from a study in which an interleukin-10 allele was over-represented in RA patients [84].

Sex-linked factors

Gender is a well-known risk factor for RA: females in general have a 2.5-fold higher risk than males [1]. However, a gender influence might rely on a bewildering number of variables: genetic factors on the sex chromosomes, factors related to pregnancy and hormones, and environmental factors that differ between the two sexes. Gender effects have also been observed in animal models [85,86,87,88]. This gender effect is clearly influenced by sex hormones and environmental factors such as behaviour, but there are also direct and indirect genetic influences. Genes on the Y chromosome can severely affect the development of arthritis, such as the Yaa (Y-chromosome-linked autoimmune accelerator) gene that protects from arthritis but markedly enhances the development of lupus [55]. The X chromosome has also been shown to harbour genetic factors that influence the development of CIA. This was shown in a study of male mice from reciprocal F1 crosses that had been castrated to exclude hormone effects [56]. However, no gene has been identified apart from the Xid mutation in the Btk gene, which impairs B cell function.

An important factor is that the phenotypic importance of certain autosomal genes seems to vary depending on whether they are expressed in a female or a male environment, which could be related to epistatic effects with sex chromosome genes, hormonal effects, imprinting or other environmental factors.

Several groups have investigated the effects of oestrogen and other sex-related factors in RA (reviewed in [87,89]), and both the oestrogen synthase locus and the oestrogen receptor gene have been weakly associated [90,91]. In a study of tandem CAG repeats in the androgen receptor exon 1, it was suggested that male patients with an early onset of RA had significantly fewer repeats than did age-matched controls or late-onset male RA patients [92].

The unknown factors

Linkage studies in RA patients

To date, three genome-wide screens for susceptibility loci for RA have been performed in cohorts of human families with RA [93,94,95]. Hardwick et al [94] briefly reported linkage to the HLA region. Shiozawa et al [93] identified, in their limited material of 41 affected sib-pair families, three potential susceptibility loci denoted RA1, RA2 and RA3, of which the strongest locus was RA1 on chromosome 1p36; this was identified by a significant LOD score for two adjacent microsatellite markers. No evidence for linkage to the HLA region was detected in this study. In the larger study by Cornélis et al, the only significant linkage was detected for markers in the HLA region. However, 14 other chromosomal regions showed evidence of linkage, four of which overlapped with loci implicated in insulin-dependent diabetes mellitus (IDDM6, IDDM9, IDDM13 and DXS998) [95].

The Cornélis study was based on RA families collected through The European Consortium on Rheumatoid Arthritis Families (ECRAF). The families included in this collection originate from several countries in central and southern Europe (France, Belgium, Spain, Greece, Italy, The Netherlands and Portugal). Through such a consortium, large numbers of families are available for genetic studies, which is required to obtain statistical power in the analysis. However, the drawback is the introduction of a potentially increased genetic heterogeneity.

Linkage analysis of murine models for RA

Several genome-wide linkage studies identifying genetic regions associated with the development of arthritis have been conducted in both mouse and rat models of CIA [58,96,97,98,99,100,101] as well as in the proteoglycan-induced arthritis model induced with heterologous aggrecan in mice [102]. Furthermore, several adjuvant-induced models have also been used, such as Mycobacterium adjuvant-induced arthritis (AIA) [103], oil-induced arthritis (OIA) [104] and pristane-induced arthritis (PIA) [105]. In addition, arthritis induced with live pathogens, such as Staphylococcus or Borrelia, has been analysed genetically [106].

Today around 40 quantitative trait loci, significantly associated with disease, have been identified (Fig. 2). It is of particular interest that some regions have been detected in several of the investigations, which emphasizes the chances of sharing genetic factors between strains and even species.

Identification of susceptibility genes

The task of identifying the underlying genes is both difficult and cumbersome and it will take some years until the picture becomes clear. In many of the linked regions there are genes that can be postulated from our current knowledge to have a role that could be worthwhile to investigate more closely, although most of the regions are so large that it would be exhausting to investigate all the candidates. Working with animal models provides a more efficient approach to minimizing the regions by producing congenic animals, where a linked region from one strain is bred onto a background strain, permitting a phenotype analysis of each linked region separately. Combinations of different congenic strains permit the elucidation of interactions between susceptibility factors.

This approach can be rewarding, as exemplified by the investigations of murine lupus models. Here, sub-phenotypes of lupus have been found in congenic strains [107,108,109]; by crossing two different congenic strains the lupus disease was partly reconstructed, giving evidence for gene interaction between the different susceptibility loci [110]. The construction of congenic strains gives important information on the function of the susceptibility genes and new methods of selecting and further narrowing the regions so that the genes can ultimately be cloned (reviewed in [14]). Several approaches to the making of congenic animals and other strain combinations to identify quantitative trait loci have been compared and discussed by Darvasi [111].

Future prospects

Altogether a tremendous effort is being undertaken today to find the genes that cause RA; it is to be expected that several of the genes associated with RA will also have importance in other autoimmune diseases. The hope is to find genes involved in critical pathways leading to pathology in these diseases. It would certainly be exciting if several disorders could be explained, at least in part, by proteins involved in common pathways, while other genes controlled the tissue specificity. It has been hypothesized that this is true [112] and also it has been shown that several autoimmune disorders cluster in families [113,114,115,116] and even occasionally in the same patient [117].

Although the expectations of finding genes controlling RA have possibly been exaggerated, the rapid uncovering of genomic sequences and markers will in time be very helpful in larger-scale projects. New techniques are emerging to assist us. Microchip arrays will help us to identify more efficiently the genes that control pathogenic pathways, as defined in mouse strains or tissue cells. They will also provide more efficient ways of genotyping and sequencing, which will be important in direct analyses of the genetics of complex diseases. Finally, we shall see an explosion of sequence information from the efforts of the human and mouse genome projects, which will make it possible to make advanced predictions of gene locations and protein structures.

The most efficient and fruitful way in which to understand the genetic control of RA will be through the joint use of animal models and human materials. This will be true for most common diseases.

Owing to the complex inheritance of the common diseases, there are so far few, if any, examples of positional cloning of a complex disease gene. However, a recent report describes the efforts of Horikawa et al [118] to positionally clone a gene in humans affecting the susceptibility to the complex disease type 2 diabetes. Nevertheless, the report illustrates the great challenge of deducing the causative effect of a gene for a polygenic complex disease. Although the authors present the genetic variation in the calpain-10 gene that is associated with type 2 diabetes, there is still the challenge to prove whether this is genuinely causal. This problem, which has so far been underestimated, will soon become obvious to many investigators. The development of genetic manipulative techniques in animals, such as transgenic techniques based on embryonic stem cells, will therefore be of crucial importance in understanding the biological role of the identified genes and in proving their involvement in disease pathology. By defining susceptibility genes in animal models, the relevance to human disease can be tested directly in human materials, and genetic targets can be defined for therapeutic purposes, as suggested in Fig. 1. The future therefore looks as intriguing as ever.