Introduction

Myocardial infarction (MI) is a complication of coronary artery disease (CAD) and the leading cause of death in the Western world. Gene-disease association studies have shown that genetic factors play a role in MI (Ozaki et al. 2002; Yamada et al. 2002; Lusis et al. 2004). However, the results of those studies require independent replication and identification of the functional basis of the association. In addition, lifestyle risk factors play a key role in the pathogenesis of MI and must be considered in studies that seek genes for MI (Harrap et al. 2000). Candidate gene studies on MI have so far produced inconsistent results, with the exception of a single finding on chromosome 13 (Helgadottir et al. 2004).

A genome-wide linkage scan (search) is an alternative, comprehensive and unbiased approach for identifying chromosomal loci that may be linked to complex diseases such as MI (Risch 1990; Wang et al. 2001). Genome-wide linkage scans in MI have shown that genetic factors play a role in the disease, but the exact inheritance pattern is still unknown. These genome scans have produced inconclusive results, because the linkage signals tend to be weak, the numbers of families and affected sibpairs are relatively small, and the individual genome scans identify linkage in different chromosomal regions (Dempfle and Loesgen 2004).

It has been shown that the pathogeneses of CAD and MI may have different genetic backgrounds and that MI can be considered a distinct and more restrictive phenotype. Moreover, the genetic component of the disease is expected to be stronger in premature (early onset) MI (Wang et al. 2004).

Thus, in order to provide more conclusive evidence on regions linked to the disease, a genome search meta-analysis (GSMA) was performed. GSMA is one of the best-established methods for meta-analysis of such data (Zintzaras and Ioannidis 2005a; Levinson et al. 2003; Wise et al. 1999). GSMA has already been applied to genome scans of several diseases (Zintzaras and Ioannidis 2005a, b; Ioannidis et al. 2006; Trikalinos et al. 2006; van Heel et al. 2004). Zintzaras and Ioannidis (2005a, b) developed an extension of the GSMA by evaluating heterogeneity between different genome scans in the context of GSMA (called HEGESMA: heterogeneity based genome scan meta-analysis). In the present meta-analysis, the methodology of heterogeneity based genome scan meta-analysis was applied to available genome scan data on premature MI.

Materials and methods

Eligible genome scans

All whole-genome scans for premature MI published before April 2006 were considered in the GSMA. The studies were identified by an extended computer-based search of the PubMed database. As a search criterion we used the combination of the following terms: “genome search”, “genome scan”, “coronary artery disease”, “myocardial infarction”, “CAD” and “MI”. A study was classified as a premature MI study when its cases had an average age of less than 56 years with a standard deviation of less than 10 years, or when the study stated that the cases had early onset or premature MI. Scans restricted to specific individual chromosomes or chromosomal regions were excluded. Studies with overlapping subjects were subjected to a sensitivity analysis, i.e. an analysis excluding the smaller study.

Data extraction

For each eligible study the following information was extracted: first author, journal, year of publication, country of recruitment, racial descent of the study population, age of participants, number of families, number of affected individuals, number of microsatellite markers, linkage statistic, type of statistical analysis and software used for linkage analysis.

Genome scan data across each chromosome were derived from the figures provided in the published papers after digitisation, as used in previous studies (Trikalinos et al. 2006; Zintzaras et al. 2006), and from information provided by results presented in each study.

Genome scan meta-analysis and heterogeneity testing

The GSMA starts by splitting the chromosomes into bins of approximately equal length; usually each bin has a width of 30 centimorgans (cM), giving 120 bins in total for the whole genome (Zintzaras and Ioannidis 2005a; Levinson et al. 2003; Wise et al. 1999). For each genome scan, the most significant result of the test statistic obtained within each bin is recorded. Then, for each scan, the bins are ranked according to the significance of their results, and the ranks for each bin are summed across scans. The significance of the average rank of each bin is assessed empirically against the distribution of average ranks. Under the null hypothesis of no linkage in any chromosomal bin, the ranks are randomly assigned within each study; then, the probability that the ranks \( X_{i} \) (i = 1, …, m, where m is the number of studies) from a specific bin sum to R, with n the number of bins, is:

$$ P{\left( {{\sum\limits_{i = 1}^m {X_{i} = R} }} \right)}= 0 \;{\text{for}}\;R < m$$
$$ P{\left( {{\sum\limits_{i = 1}^m {X_{i} = R} }} \right)} = \frac{1} {{n^{m} }}{\sum\limits_{k = 0}^{\text{int} [(R - m)/n]} {( - 1)^{k} {\left( {\begin{array}{*{20}c} {{R - kn - 1}} \\ {{m - 1}} \\ \end{array} } \right)}} }{\left( {\begin{array}{*{20}c} {m} \\ {k} \\ \end{array} } \right)}\quad {\text{for\ }}m \le {\text{ }}R \le nm, $$
$$ P{\left( {{\sum\limits_{i = 1}^m {X_{i} = R} }} \right)}=0\;{\text{for}}\;R > nm$$

A high summed rank for a bin is considered evidence for linkage. Bins with equal test statistics within a study were assigned tied ranks.
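The closed-form probability above can be evaluated directly. The following is a minimal Python sketch, not part of the HEGESMA software; the function names `prob_rank_sum` and `brute_force` are illustrative, and the formula is taken to assume independent ranks that are uniform on 1..n with no ties. The brute-force enumeration is included only to check the formula on a small case.

```python
from itertools import product
from math import comb


def prob_rank_sum(R: int, m: int, n: int) -> float:
    """P(sum of m independent ranks, each uniform on 1..n, equals R)."""
    if R < m or R > n * m:
        return 0.0
    total = 0
    for k in range((R - m) // n + 1):
        total += (-1) ** k * comb(R - k * n - 1, m - 1) * comb(m, k)
    return total / n ** m


def brute_force(R: int, m: int, n: int) -> float:
    """Same probability by enumerating all n**m rank combinations (small cases only)."""
    hits = sum(1 for ranks in product(range(1, n + 1), repeat=m) if sum(ranks) == R)
    return hits / n ** m


if __name__ == "__main__":
    # Check the closed form against enumeration for 3 studies and 6 bins.
    m, n = 3, 6
    for R in range(m, n * m + 1):
        assert abs(prob_rank_sum(R, m, n) - brute_force(R, m, n)) < 1e-12

    # Upper-tail probability of a summed rank of at least 550
    # for 5 scans and 120 bins (the threshold 550 is arbitrary).
    tail = sum(prob_rank_sum(R, 5, 120) for R in range(550, 5 * 120 + 1))
    print(tail)
```

Summing the probabilities over an upper range of R, as in the last lines, gives an exact tail probability for a summed rank under these assumptions.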

Heterogeneity between studies for each bin was assessed using the Q statistic. The Q statistic is defined as the sum of the squared deviations of each study’s bin rank from the mean of the ranks. In GSMA, low between-study heterogeneity indicates consistency of study results for the same bin. Thus, low heterogeneity for a specific bin with high ranks in all studies can be interpreted as further supportive evidence for the importance of this bin.
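The ranking step and the Q statistic can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the HEGESMA implementation: tied statistics are given the average of the ranks they span (the text does not specify the tie convention), "the mean of the ranks" is read as the mean rank of the bin across studies, and the names `rank_bins` and `q_statistic` are hypothetical. For a scan that reports P values instead of LOD scores, the values would first need to be transformed so that larger means more significant.

```python
def rank_bins(stats: list[float]) -> list[float]:
    """Rank the bins of one scan; the largest statistic gets the highest rank.

    Tied statistics share the average of the ranks they span (assumed convention).
    """
    order = sorted(range(len(stats)), key=lambda j: stats[j])
    ranks = [0.0] * len(stats)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and stats[order[end + 1]] == stats[order[pos]]:
            end += 1
        tied_rank = (pos + 1 + end + 1) / 2  # mean of 1-based positions pos+1..end+1
        for idx in order[pos:end + 1]:
            ranks[idx] = tied_rank
        pos = end + 1
    return ranks


def q_statistic(bin_ranks: list[float]) -> float:
    """Sum of squared deviations of each study's rank for a bin from the bin's mean rank."""
    avg = sum(bin_ranks) / len(bin_ranks)
    return sum((r - avg) ** 2 for r in bin_ranks)


if __name__ == "__main__":
    print(rank_bins([0.2, 1.5, 1.5, 3.0]))      # -> [1.0, 2.5, 2.5, 4.0]
    print(q_statistic([118, 120, 119, 117, 120]))  # consistently high ranks: small Q
    print(q_statistic([120, 15, 119, 40, 118]))    # discordant ranks: large Q
```

A bin ranked highly by every scan yields a small Q, whereas a bin ranked highly by only some scans yields a large Q even if its average rank is still fairly high.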

The statistical significance of the average rank and of the Q metric was assessed using a Monte Carlo method (Zintzaras and Ioannidis 2005a, b). In each run of the Monte Carlo method, the ranks of each study are randomly permuted and the simulated average rank and Q metric are calculated; the procedure is repeated for 50,000 runs, and null distributions for the average rank and for the Q metric are constructed. The significance level (P rank) of the average rank of a bin against the null distribution of average ranks is the percentage of simulated average ranks greater than or equal to the observed one. The statistical significance level (P Q) for low heterogeneity is the percentage of simulated Q metrics less than the observed one (Zintzaras and Ioannidis 2005a). In addition, the probability of observing a given average rank for a bin by chance, in bins with the same “place” in the ascending order of average ranks in the runs (ordered ranks) (P order), is calculated (Levinson et al. 2003). P rank assesses the significance of each bin independently, whereas P order is based on the distribution of average ranks across all bins simultaneously (Zintzaras and Ioannidis 2005b). Moreover, a Monte Carlo test was performed that generates null distributions separately for each bin, considering only the simulated Q metrics (Q adjusted) of bins whose simulated average rank is within ±2 of that of the bin being considered (Zintzaras and Ioannidis 2005a). In the present study, GSMA (P rank and P order) and heterogeneity testing were performed both unweighted and weighted. In the weighted analysis, the ranks of the bins in each study were weighted by \( \sqrt{(\text{pedigrees}) \times (\text{markers})} \), and the weights were then scaled to sum to 1.
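The permutation procedure for P rank and P Q can be sketched as follows. This is a simplified, unweighted illustration and not the HEGESMA code: the function names are hypothetical, the simulated values of all bins in all runs are pooled into one null distribution (an assumption), the default number of runs is far below the 50,000 used in the study, and P order and the adjusted Q test are not implemented. For a weighted analysis, each study's ranks would be multiplied by its weight before being passed in.

```python
import random
from bisect import bisect_left


def bin_summaries(ranks):
    """ranks[i][j] = rank of bin j in study i -> (average rank, Q) lists per bin."""
    avgs, qs = [], []
    for column in zip(*ranks):
        avg = sum(column) / len(column)
        avgs.append(avg)
        qs.append(sum((r - avg) ** 2 for r in column))
    return avgs, qs


def monte_carlo_pvalues(ranks, n_runs=1000, seed=0):
    """Permutation estimates of P_rank and P_Q for every bin."""
    rng = random.Random(seed)
    null_avg, null_q = [], []
    for _ in range(n_runs):
        # Permute the ranks within each study independently.
        shuffled = [rng.sample(study, len(study)) for study in ranks]
        avgs, qs = bin_summaries(shuffled)
        null_avg.extend(avgs)
        null_q.extend(qs)
    null_avg.sort()
    null_q.sort()
    total = len(null_avg)
    obs_avg, obs_q = bin_summaries(ranks)
    # P_rank: share of simulated average ranks >= the observed average rank.
    p_rank = [(total - bisect_left(null_avg, a)) / total for a in obs_avg]
    # P_Q: share of simulated Q values < the observed Q (low heterogeneity).
    p_q = [bisect_left(null_q, q) / total for q in obs_q]
    return p_rank, p_q


if __name__ == "__main__":
    # Toy data: 5 studies, 10 bins; bin 0 has the top rank in every study.
    toy = [[10] + [((j + i) % 9) + 1 for j in range(9)] for i in range(5)]
    p_rank, p_q = monte_carlo_pvalues(toy, n_runs=500, seed=1)
    for j, (pr, pq) in enumerate(zip(p_rank, p_q)):
        print(j, round(pr, 4), round(pq, 4))
```

Sorting the pooled null values once and using binary search keeps the P value lookup cheap even when the number of runs is large.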

The evaluation of the significance of average ranks and the significance of heterogeneity was performed using HEGESMA software (http://biomath.med.uth.gr) (Zintzaras and Ioannidis 2005b).

Results

Five whole-genome scans for premature MI were identified in PubMed (Helgadottir et al. 2004; Wang et al. 2004; Broeckel et al. 2002; Hauser et al. 2004; Samani et al. 2005). In these scans, overlapping of cases was not reported. Details on the analysed studies are shown in Table 1. The study by Broeckel et al. (2002) consisted of 513 pedigrees with 944 affected individuals (Caucasians); 394 microsatellite markers and multipoint analysis were used. In the study by Hauser et al. (2004) the genome scan was performed in 228 pedigrees mainly from a Caucasian population (93%); the number of affected individuals was not reported; the scan used 395 markers and multipoint analysis. In Wang et al. (2004), 428 pedigrees containing 712 affected persons were evaluated using 408 markers and multipoint analysis. The study included 92% Caucasians, and the remaining proportion was a mixed population. However, in the meta-analysis, only the Caucasian data were used. In Helgadottir et al. (2004) the genome scan was performed with 194 affected individuals from 93 pedigrees, and the scan was performed with 1,068 markers and multipoint analysis; the study does not provide demographic data, but it is stated that the authors performed analysis on individuals with early onset MI. Finally, the Samani et al. (2005) study consisted of 1,176 affected persons from 847 pedigrees; 398 markers and multipoint analysis were used.

Table 1 Characteristics and major results of premature MI genome scans

The linkage statistic was the LOD score in all studies except that of Wang et al. (2004), in which P values were used. In weighting, the studies by Hauser et al. (2004) and Helgadottir et al. (2004) received the least weight (w=0.15 each), and the study by Samani et al. (2005) received the most (w=0.28), as follows from the pedigree and marker counts given above. The chromosome regions with suggestive linkage identified from each individual genome scan are shown in Table 1.
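As a rough check, the study weights can be recomputed from the pedigree and marker counts quoted in the text, using the weighting rule described in the Methods (square root of pedigrees times markers, scaled to sum to 1). The sketch below is illustrative only: the dictionary keys and the function name `scaled_weights` are hypothetical, and the counts actually used by the authors may differ slightly (for example, only the Caucasian subset of Wang et al. 2004 entered the meta-analysis).

```python
from math import sqrt

# (pedigrees, markers) as quoted in the text for each genome scan
studies = {
    "Broeckel 2002": (513, 394),
    "Hauser 2004": (228, 395),
    "Wang 2004": (428, 408),
    "Helgadottir 2004": (93, 1068),
    "Samani 2005": (847, 398),
}


def scaled_weights(counts):
    """sqrt(pedigrees * markers) per study, rescaled so the weights sum to 1."""
    raw = {name: sqrt(p * k) for name, (p, k) in counts.items()}
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


weights = scaled_weights(studies)
for name, w in weights.items():
    print(f"{name}: {w:.2f}")
```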

Figure 1 shows the average ranks for each bin from the five genome scans using 120 bins. The bins with significant P rank in unweighted or weighted analysis are located above the line at P≤0.05. The significant bins (P≤0.05), the observed ranks, the GSMA and heterogeneity statistics for each study are shown in Table 2.

Fig. 1

Unweighted (open circles) and weighted (filled circles) average ranks from five premature myocardial infarction genome scans with 120 bins. Bins with significant P rank in unweighted or weighted analysis are above the line at P≤0.05

Table 2 Premature myocardial infarction genome-scan meta-analysis results of five scans, showing chromosomal bins significant in average rank either for unweighted or weighted analyses (in brackets are the weighted results)

Eight bins were found to have P rank≤0.05 by either unweighted or weighted analysis (bins 6.2: 6p22.3–6p21.1, 14.1: 14p13–14q13.1, 13.4: 13q33.1–13q34, 5.1: 5p15.33–5p15.1, 8.4: 8q13.2–8q22.2, 1.2: 1p36.21–1p35.2, 12.6: 12q24.31–12q24.33, 8.6: 8q24.21–8q24.3) and five of them (bins 6.2, 14.1, 13.4, 8.4, 1.2) had P rank≤0.05 with both methods (Table 2). None of the bins was significant in order statistics for the unweighted or weighted analysis (P order>0.05). Bin 8.6 showed marginally significant low heterogeneity in both unweighted and weighted analysis (P Q=0.053 and P Q=0.053, respectively). In addition, the Q adjusted was significant in both unweighted and weighted analysis (P Qadj=0.033 and P Qadj=0.036, respectively). Thus, bin 8.6 provided evidence of linkage in terms of average rank, with low heterogeneity between genome scans. The remaining seven bins showed linkage but with large heterogeneity. Four of the significant bins (6.2, 14.1, 8.4, 8.6) were novel, i.e. they were not identified by the individual studies.

Discussion

The present meta-analysis has identified a total of eight chromosomal regions with evidence for linkage from P rank. However, only five regions (bins 6.2, 14.1, 13.4, 8.4, 1.2) were significant in both unweighted and weighted analysis. The chromosome regions identified by the meta-analysis that have not been observed in individual genome scans are bins 6.2, 14.1, 8.4 and 8.6. None of the chromosomal regions indicated evidence for linkage from P order statistic.

Heterogeneity testing revealed low unadjusted and adjusted heterogeneity only at bin 8.6, in both weighted and unweighted analysis. Bin 8.6 includes one gene investigated for conferral of susceptibility to MI: CYP11B2 (Hengstenberg et al. 2000). Regions with significant P rank and large heterogeneity imply that only a subset of studies show linkage and, therefore, that linkage is not consistent across scans. Extended studies are needed to investigate these regions further. However, these results do not exclude the possibility that other chromosomal regions conferring susceptibility to MI exist that were not detected in the present study, or that linkage to some regions exists in only one or a few populations.

Bin 8.4 is a novel region identified by the GSMA. In this bin there are no known genes that confer susceptibility to MI. Thus, it is suggested that as yet unidentified genes in this region may confer susceptibility to the disease, and the region poses a challenge for the identification of candidate genes for MI. Bin 6.2, the most significant region for MI, contains various known genes (MOG, HSPA1A, LTA, TNF, AGER, HFE, HLA-DR and C4) with a possible implication in MI (Ozaki et al. 2002; McCarthy et al. 2004; Bolla et al. 1998; Koch et al. 2001; Hofmann et al. 2005; Tuomainen et al. 1999; Sperti et al. 2000; Kramer et al. 1994). Bin 14.1 is the region where genes CAQ14 and PSMA6 are located. Gene CAQ14 is currently being investigated in association with obesity and the metabolic syndrome (Comuzzie et al. 2001). In a recent large-scale association study in a Japanese population, a functional single nucleotide polymorphism in the PSMA6 gene was found to confer risk of MI (Ozaki et al. 2006). This association has not yet been studied in other ethnic groups. Our analysis has highlighted bin 14.1 as the second most significant region, without including populations of Japanese origin. These findings indicate that PSMA6 is a potentially important candidate gene for association studies with MI in Caucasian and other populations. Bin 13.4 is the region where various known genes are located, such as F7, F10 and IRS2. These genes were identified in a large-scale association analysis for premature coronary heart disease (McCarthy et al. 2004). Bin 5.1 is the region of the MTRR gene, which might be implicated in MI (Chen et al. 2001). Bin 1.2 is the location of the MTHFR and ECE1 genes, which are currently being investigated in relation to MI (McCarthy et al. 2004; Zee et al. 2006). Bin 12.6 contains the SCARB1 gene, which is possibly implicated in the development of MI (McCarthy et al. 2004).

The populations involved in the meta-analysis were Caucasians, and, therefore, any heterogeneity could be attributed more to genuine inconsistency of genetic effects and to differences in studies’ design and conduct, and less to differences across populations.

Although HEGESMA is conventionally based on bins with a width of 30 cM, an analysis with bin widths of less than 30 cM could provide more information on possible regions with linkage. An analysis using more than 120 bins was omitted because no region showed very strong evidence of linkage. HEGESMA does not currently consider X and Y chromosome data; thus, conclusions concerning possible linkage on those chromosomes cannot be derived. In this analysis, no adjustment for multiple testing was made, since heterogeneity based genome scan meta-analysis is an exploratory non-parametric procedure concerned with the relative significance of the regions (Rothman 1990). Other limitations of the meta-analysis are the variable map density within and between studies and the lack of a power analysis. A power analysis methodology is not yet available or established (Levinson et al. 2003; Lewis and Levinson 2006); however, it will be incorporated in the new version of HEGESMA. An alternative approach for meta-analysis of linkage data is to construct a combined map of the markers (Kong et al. 2002) from the original genotypes for each study and perform new linkage analyses. Another approach is to combine P values after correcting for the size of the linkage area (Badner and Gershon 2002). The merit of using HEGESMA over the alternative approaches (Levinson et al. 2003) is that (1) it requires only the placing of markers within bins and not the determination of precise positional relationships; (2) it does not require raw data; (3) results from several genetic analyses performed in a particular study can be maximised to produce a single set of ranks; (4) it requires no assumptions about models of inheritance; (5) it provides a test of genetic heterogeneity.

In conclusion, the genome scan meta-analysis and heterogeneity testing in MI provide some evidence of linkage for four new candidate regions (bins 6.2, 14.1, 8.4 and 8.6). Thus, genotyping these regions with additional markers and families may identify candidate genes that confer susceptibility to MI.