Introduction

Stroke is among the most common causes of death and disability worldwide1. Major advances have been made in the understanding of the pathophysiology of stroke, and in vitro and animal experiments have suggested numerous substances as promising candidates for treatment of the disease2,3. However, although hundreds of these substances have been tested in clinical trials, thrombolysis is still the only specific pharmacological treatment proven efficacious in acute ischemic stroke2. The apparent difficulty of transferring results from experimental studies to the clinical situation (“from bench to bedside”) has been referred to as a “translational roadblock”2,4, and the possible reasons behind it, particularly lack of methodological quality, have been discussed intensively in recent years5,6. Low statistical power as a result of high outcome variability and mortality in combination with small group sizes has been suggested to be an important issue5, and although this can theoretically be overcome by sufficiently increasing the group sizes, such a solution has several problematic implications. From an ethical point of view, it is recommended to use as few animals as possible according to the “three Rs principle”7, and working with large numbers of animals is both practically inconvenient (time and space consuming) and costly. Therefore, as a complement, it would be attractive to optimize the animal model by minimizing unnecessary outcome variability and mortality, or at least to be able to power studies more exactly by predicting variability and mortality given a certain experimental setup.

Simplified, the standard approach in the majority of preclinical stroke studies consists of three steps: 1) focal cerebral ischemia is induced in rodents, 2) some kind of treatment is administered and 3) outcome is assessed, most often by measuring infarct sizes. These basic steps are employed in hundreds of publications each year but unfortunately no consensus exists regarding the ideal setup, and since the variations in methodological factors are innumerable, it is very complicated to experimentally evaluate all possible combinations. In an attempt to address this question, we performed a hypothesis-driven meta-analysis in 2013 studying method parameters’ impact on mortality and variability in rat stroke experiments8. However, since the previous study only used data from rat studies, and since mice are becoming increasingly popular in the preclinical stroke field, we decided to perform a similar analysis on mice. Thus, the objective of the current study was to investigate the effect of methodological variables on infarct size variability and mortality in mouse stroke experiments. Specifically, eight a priori hypotheses concerning factor-outcome relations were formulated:

  1. Middle cerebral artery occlusion duration affects (A) infarct size variability and (B) mortality.

  2. Type of focal cerebral ischemia procedure affects (A) infarct size variability and (B) mortality.

  3. Mouse strain affects (A) infarct size variability and (B) mortality.

  4. In studies using the intraluminal filament method, the type of occluding filament affects (A) infarct size variability and (B) mortality.

Results

Regression models

The regression model addressing hypotheses 1A, 2A and 3A included 500 control groups while the analysis for hypothesis 4A included 430 (Fig. 1). The r² values were 0.22 and 0.26, respectively, meaning that 22% and 26% of the variation in the outcome measure Infarct size coefficient of variation was explained by the models. The two models analyzing impact on Mortality rate, one for hypotheses 1B, 2B and 3B and one for hypothesis 4B, included 80 and 73 control groups, respectively. The resulting r² values were 0.72 and 0.78.

Figure 1: Article inclusion.

A total of 2118 articles were assessed for inclusion. After exclusion according to criteria (A–G), 334 articles describing 500 control groups remained. Not all control groups could be used for all hypotheses due to lack of essential information; the number of control groups included in each analysis is specified in the thick-bordered boxes.

Impact of occlusion duration on infarct size variability and mortality (hypotheses 1A and 1B)

Regarding the effect of Occlusion duration on the outcome Infarct size coefficient of variation, only the category Permanent significantly decreased the variability compared to the reference category Short transient (−8.6%, CI: −15.3 to −1.9%; p = 0.012; Fig. 2a). No impact of Occlusion duration on Mortality rate was found (those categories were removed in the backward exclusion step of the statistical analysis and are therefore not presented in Fig. 3).

Figure 2: Method parameters’ impact on infarct size variability.

Bars represent change in Infarct size coefficient of variation, measured in absolute percent units. Significant p-values are black, non-significant p-values are grey. N = 500 for (a–c) and N = 430 for (d). Error bars represent 95% confidence intervals. CV = Coefficient of variation [calculated as standard deviation/mean]; MCAo = Middle cerebral artery occlusion.

Figure 3: Method parameters’ impact on mortality rate.

Swiss strain was found to significantly increase mortality rate compared to the reference C57BL6. The variables Occlusion duration, Type of middle cerebral artery occlusion procedure and Occluding filament type were removed in the backward exclusion step of the regression model due to small explanatory value, and therefore results for hypotheses 1B, 2B and 4B could not be presented. Bars represent change in Mortality rate, measured in absolute percent units. Significant p-values are black, non-significant p-values are grey. N = 80. Error bars represent 95% confidence intervals.

Impact of type of focal cerebral ischemia procedure on infarct size variability and mortality (hypotheses 2A and 2B)

In the analysis of cerebral ischemia procedures, the Emboli/clot method strongly increased the Infarct size coefficient of variation (+25.9%, CI: +8.2 to +43.6%; p = 0.004; Fig. 2b) in comparison to the reference category Intraluminal Filament. Mortality rate was not significantly affected by cerebral ischemia procedure (variables removed during the backward exclusion procedure).

Impact of mouse strain on infarct size variability and mortality (hypotheses 3A and 3B)

Strain affected both Infarct size coefficient of variation and Mortality rate significantly. Overall, the majority of the strains seemed to increase the variability compared to the reference category C57BL6, with the strongest positive regression coefficients being found for Mixed C57BL6/129 (+22.8%, CI: +12.5 to 33.1%; p < 0.0001; Fig. 2c) and 129 (+15.9%, CI: +8.3 to 33.1%; p < 0.0001; Fig. 2c). The only strain category that significantly reduced the variability compared to the reference was Swiss (−5.7%, CI: −11.2 to −0.3%; p = 0.038; Fig. 2c). Apart from the reference, two strain categories were included in the mortality analysis and only Swiss had a significant impact, increasing the Mortality rate (+24.2%, CI: +16.2 to +32.2%; p < 0.0001; Fig. 3).

Impact of filament coating type on infarct size variability and mortality (hypotheses 4A and 4B)

In the filament subanalyses, including only articles where the intraluminal filament method had been used, none of the coating type categories (Occluding filament type) seemed to affect the infarct size variability. Although the categories remained in the final enter model, the regression coefficients were small (Fig. 2d). Regarding Mortality rate, the coating categories were removed in the backward exclusion step (hence, they were not significant).

Background data

The Infarct size coefficient of variation (in the total 500 control groups9–342) was on average 29.5 ± 19.2% (range 0.9–135.5%) while Mortality rate (calculated from the 80 control groups reporting this) was 14 ± 12% (range 0–83%). The number of animals per group was on average 8.4 ± 3.1 (range 3–26). The reported body weight group means were on average 25.6 ± 4.0 g (range 18–45 g). The average time from induction of cerebral ischemia until sacrifice and damage evaluation was 65.0 ± 104.5 h (range 1.5–1008 h), with a median of 24 h. Frequencies of the different categories of the categorical variables are presented in Fig. 4.

Figure 4: Frequencies of registered categories in the 500 control groups.

The figure also includes variables that were omitted from statistical analysis due to too few articles providing these data. Some variable names are abbreviated, see Table 1 for extended descriptions. EEG = Electroencephalography; B = Blood; MCAo = Middle cerebral artery occlusion; TTC = Triphenyl tetrazolium chloride; ECA = External carotid artery; CCA = Common carotid artery.

Discussion

The current study shows that the use of Swiss and C57BL6 mice as well as Permanent occlusion of the middle cerebral artery renders the lowest infarct size variability. Emboli/clot methods, although represented by few control groups, increased variability. Of the methodological factors investigated, only the Swiss strain was found to have a significant impact on Mortality rate, increasing it compared to the reference strain. Effect sizes were large, with many parameters changing the outcomes by more than 10% in absolute terms. In addition to the findings pertaining to the hypotheses, several other interesting observations were made, such as the beneficial effect of Laser Doppler surveillance on Infarct size coefficient of variation and the higher Mortality rate with Elderly mice. However, since this study was designed as a hypothesis-driven meta-analysis, results not related to the factor-outcome relations stated a priori should be interpreted with caution and considered merely hypothesis-generating (nevertheless, all findings are presented in Tables S1, S2, S3 and S4 in the Supplementary for readers with special interest in certain methodological parameters).

As mentioned above, comparing all possible combinations of methodological factors experimentally would be a tedious endeavor. However, there are examples of studies that investigated one or a few parameters in order to optimize the ischemia model. The majority of these focused on different mouse strains and they did not specifically present or statistically compare effects on outcome variability. However, the coefficients of variation can be calculated from mean infarct size and standard deviation, similarly to what was done for the regression model in the current meta-analysis. In line with our findings, 129 mice tended to have smaller infarcts with larger infarct size variation compared to C57BL6 mice343,344,345,346, although the extent of difference varied. Not corroborated by the current meta-analysis, two of these studies also included BALB/c in the comparison and found that this strain produced infarcts even bigger than those of C57BL6 but with a smaller coefficient of variation344,345. One of the studies presented mortality and concluded that BALB/c had the highest rate, C57BL6 the lowest and 129 was in between the other two strains344. We found an increased mortality with the Swiss strain, but only two other categories were represented in that analysis, C57BL6 and other strains.
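The coefficient of variation used for such comparisons is simply the standard deviation divided by the mean. The sketch below is a minimal illustration of that calculation; the group labels and values are hypothetical and are not taken from any of the cited studies.

```python
def coefficient_of_variation(mean, sd):
    """Return the coefficient of variation in percent (SD / mean * 100)."""
    if mean <= 0:
        raise ValueError("mean infarct size must be positive")
    return sd / mean * 100.0


# Hypothetical control groups: (label, mean infarct volume, standard deviation),
# both in the same unit (e.g. mm^3). These numbers are illustrative only.
groups = [
    ("strain A", 80.0, 20.0),
    ("strain B", 50.0, 20.0),
]

cv_by_group = {label: coefficient_of_variation(m, s) for label, m, s in groups}

for label, cv in cv_by_group.items():
    print(f"{label}: CV = {cv:.1f}%")
```

In this made-up example the two groups share the same standard deviation, but the group with the smaller mean infarct has the larger coefficient of variation, which is the pattern described for 129 versus C57BL6 mice.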

A few previous articles describe the effects of different middle cerebral artery occlusion durations but the results are discordant. Similar to what we found, both Tsuchia et al.347 and Mao et al.348 reported lower coefficients of variation for permanent occlusion compared to transient while in another study, the results were the other way around343. In a study with occlusion durations corresponding to our categories Short transient (up to 60 min) and Long transient (>60 min), short transient occlusion was more favorable in terms of infarct size variability. Regarding mortality rate, similar inconsistency was found with one study presenting lower values for transient occlusion347, and one for permanent349.

Proper comparisons between methods for ischemia induction in mice are lacking in the literature. This lack is probably explained by the high cost of introducing a new MCAo method in a laboratory, emphasizing the importance of meta-analyses like the current one as an alternative. One study looked at the effect of Poly-L-lysine but, like us, found no effect350. Filament coating length351,352 and filament size347,353 have been investigated but these parameters were not included in our study due to poor reporting in the included articles.

When comparing the current study with the previous rat meta-analysis (described above), some aspects are worth commenting on. Similar to what was described herein, emboli methods were found to render a larger coefficient of variation of the infarct size than filament, direct and photothrombosis methods8. However, infarcts induced by endothelin (not represented in the current mouse analysis) were even more inconsistent. Further, although not included in the main hypotheses of the rat study, permanent ischemia had the lowest variability when comparing different occlusion durations both for rats and mice8. The rat and mouse studies also differ regarding some parameters. For example, no significant differences were found for mice between types of coatings in the filament subanalysis, whereas silicone decreased variability for rats8.

The main problem with high infarct size variability is the resulting lack of statistical power if the sample sizes are not adjusted accordingly, which has been discussed in several reviews5,354,355. Statistical power (1-β) is often discussed in relation to negative findings, e.g. to evaluate whether a study was adequately designed to detect a treatment effect of a substance and hence whether the negative results can be trusted. However, statistical power is also of importance for studies with positive findings (i.e. when a treatment effect is found)5,356. Low statistical power is associated with the publication bias phenomenon since negative findings are generally less likely to be published, which can distort interpretation of meta-analyses357. To support the claim that statistical power in experimental stroke studies is often low, the average power of the studies included in the current meta-analysis can be calculated based on the extracted data: the average group size was 8.4 and the average coefficient of variation for infarct sizes 29.5%, which at a significance level of 0.05 gives a power of 59% to detect a 30% difference between groups (calculation based on a parametric comparison between two groups; for three or more groups and for non-parametric methods, the number would be even lower). Ethical boards requiring researchers to minimize the number of animals (the three Rs principle7) might explain why too small group sizes are often used, but economic as well as practical aspects are also likely to contribute. Lack of adequate statistical training, or no available statistician to consult regarding these issues, should also be mentioned as possible explanations. So in addition to optimizing the model to produce consistent lesions and minimize mortality, it is important to perform a priori power calculations in order to avoid the abovementioned problems.
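An a priori power calculation of the kind recommended here can be sketched in a few lines. The example below assumes a two-sided comparison of two equally sized groups and uses a normal approximation; the function names are illustrative. Because the exact procedure behind the 59% figure is not specified above, the value this sketch returns need not match it, and a t-test-based calculation would give somewhat lower power for small groups.

```python
from math import sqrt
from statistics import NormalDist


def two_group_power(cv, rel_diff, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means.

    cv          : coefficient of variation (SD / mean), e.g. 0.295
    rel_diff    : difference to detect, as a fraction of the control mean
    n_per_group : animals per group (equal group sizes assumed)

    Uses the normal approximation, which is slightly optimistic
    compared with an exact t-test calculation for small groups.
    """
    std_normal = NormalDist()
    effect_size = rel_diff / cv                  # standardized difference
    ncp = effect_size * sqrt(n_per_group / 2.0)  # noncentrality parameter
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    return std_normal.cdf(ncp - z_crit) + std_normal.cdf(-ncp - z_crit)


def animals_needed(cv, rel_diff, target_power=0.8, alpha=0.05):
    """Smallest whole number of animals per group reaching the target power."""
    n = 2
    while two_group_power(cv, rel_diff, n, alpha) < target_power:
        n += 1
    return n


# Averages reported in the text: group size 8.4, CV 29.5%, 30% difference.
power_at_average = two_group_power(cv=0.295, rel_diff=0.30, n_per_group=8.4)
n_for_80 = animals_needed(cv=0.295, rel_diff=0.30)

print(f"Approximate power at the average group size: {power_at_average:.2f}")
print(f"Animals per group for 80% power (approximation): {n_for_80}")
```

The same two functions can be reused with a predicted coefficient of variation for a planned experimental setup, and the group size should additionally be inflated for the expected mortality.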

The issue of mortality is somewhat related to outcome variability and power calculations in that higher mortality requires larger group sizes to attain sufficient power. However, there is another side to the problem as well. Regarding the statistical analysis, it is not straightforward to incorporate mortality in the standard parametric methods, which might explain why this information in most cases is not even mentioned. A non-parametric approach, with mortality included as the worst possible outcome, is an option that has been utilized in our laboratory358,359, but either way, the absolute minimum should be to report these data. The risk of omitting mortality rate data can be illustrated by the possible scenario of a toxic substance that seems to decrease infarct sizes compared to a placebo group, only because all mice with large infarcts in the treatment group died. In the current meta-analysis, it might seem surprising that the effects on mortality were generally moderate (e.g. no significant effect of occlusion time). However, mortality data were only available for 80/500 control groups. A low number of observations weakens a regression model with many predictor variables, and this should be considered when conclusions are drawn.
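The toxic-substance scenario can be made concrete with a small rank-based sketch in which dead animals are assigned the worst possible outcome. This is only an illustration of the general idea, not the exact procedure used in the cited work; the data and the helper name `prob_worse` are hypothetical. The statistic computed is the Mann-Whitney U divided by the number of pairs.

```python
from math import inf


def prob_worse(treated, control):
    """Share of (treated, control) pairs in which the treated animal fares worse.

    Larger values mean a worse outcome; ties count as half a pair.
    This equals the Mann-Whitney U statistic divided by n1 * n2,
    so 0.5 means no difference between the groups.
    """
    worse = 0.0
    for t in treated:
        for c in control:
            if t > c:
                worse += 1.0
            elif t == c:
                worse += 0.5
    return worse / (len(treated) * len(control))


# Hypothetical infarct sizes (arbitrary units); illustrative values only.
placebo_infarcts = [30.0, 35.0, 40.0, 45.0, 50.0, 55.0]  # no deaths
treated_survivors = [20.0, 25.0, 30.0]                   # three survivors
treated_deaths = 3                                       # three animals died

# Dead animals are coded as the worst possible outcome (infinity).
treated_all = treated_survivors + [inf] * treated_deaths

survivors_only = prob_worse(treated_survivors, placebo_infarcts)
with_mortality = prob_worse(treated_all, placebo_infarcts)

print(f"Survivors only:     {survivors_only:.2f}")
print(f"Mortality included: {with_mortality:.2f}")
```

When only survivors are analyzed, the treated group appears clearly better than placebo; once the deaths are ranked as the worst outcome, the apparent benefit disappears.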

The main strength of the present meta-analysis is the large number of articles included, and that the effects of many methodological factors are investigated together in one single statistical model. However, this approach is relatively novel, warranting a discussion about some aspects of the design:

- The impact of each control group was weighted according to the number of animals, which might be problematic when analyzing coefficient of variation, since researchers who know that they have large variability in their model probably compensate by including more animals.

- The effect of publication bias has to be considered, as studies with large coefficients of variation might produce negative results that are more likely to remain unpublished.

- Although many possible confounders were recorded and controlled for, accounting for all details of the included experiments is beyond the reach of even a meta-analytical approach. Impact of different vendors and skill of the surgeon are just a couple of factors that could not be assessed. For mathematical reasons, categories have also, as described in the Methods section, been reduced to larger categories, meaning that differences within categories may be lost.

- 500 control groups are included but only 334 articles, meaning that several articles contributed more than one control group. It is not strictly statistically appropriate to analyze these independently, but creating categories for all unique studies would have made the statistical analysis impossible.

In conclusion, methodological choices are of major importance for consistent results and advantageous animal models. However, although it may be relevant to adjust the experimental setup to minimize infarct size variability and mortality rate, other important components such as similarity to the clinical situation have to be taken into consideration. For this reason, it might be justified in some studies to use the emboli method or elderly animals even though this might increase the outcome variability and mortality, respectively. In either case, the current study enables a more precise estimation of variability and mortality a priori given a certain experimental setup, thereby facilitating proper power calculations.

Methods

Overview

The basic outline for the study was pre-defined and consisted of the following steps:

  1. Variables to be studied were chosen.

  2. Data about chosen variables were extracted from relevant articles.

  3. Variable categories were refined based on extraction results.

  4. Statistical analyses were performed on variables left after refinement.

Article inclusion

Relevant articles were identified in the Medline database via PubMed using the search string (mcao or “middle cerebral artery occlusion” or “MCA occlusion” or “stroke” or “cerebral ischemia” or “brain ischemia”) and (mouse or mice), resulting in over 6,000 hits. The articles were consecutively assessed for inclusion, in order of PubMed identifier, starting with the most recent article as of January 9th 2012. The inclusion criteria were:

  I. Article written in English.

  II. Original research article.

  III. Experiments performed using living mice.

  IV. Mice subjected to a single focal cerebral ischemic lesion.

  V. Infarct sizes measured and results presented.

  VI. Inclusion of a control group, untreated except for vehicle/placebo treatment.

  VII. Experiment adequately described.

Data extraction

Control group data were extracted from all included articles. If an article described more than one control group, differing in any methodological aspect, these were included separately and analyzed independently. The principle “if it was not described, it was not performed” was adhered to throughout the process. Methodological factors to be extracted were chosen based on our previous rat meta-analysis8 and personal experience. See Table 1 for a complete list of all variables that we intended to extract. The goal was to gather as much relevant data as possible in order to build a good statistical model.

Table 1: Extracted factors and outcome measures.

To perform a proper power calculation for such a large multiple regression model is a very complex task. Instead, the sample size estimation was based on our previous meta-analysis with a similar design. Furthermore, we performed interim saturation analyses after 400 and 450 included control groups to check when the results had stabilized, i.e. no changes in overall trends occurred. In total, 500 control groups from 334 articles (see Supplementary methods for a complete list of references) were included and 1784 articles were excluded (Fig. 1).

Processing of data

Category refinement

To avoid small categories being attributed statistically unsubstantiated explanatory value, categories represented by fewer than 5 control groups were pooled in an Other category for that specific variable. The overall effects on the two outcome variables (Infarct size variation and Mortality rate; hypotheses 1A, 1B, 2A, 2B, 3A and 3B) were tested in two independent models and in addition, the filament method subanalysis (hypotheses 4A and 4B) had to be performed separately. Each of the resulting four regression models comprised a different number of control groups, since not all articles reported on mortality and only studies using the intraluminal filament model could be included in the filament subanalysis. Hence, in some cases a category represented by 5 or more control groups in one regression model was reduced to fewer than 5 groups in another and thus incorporated in the Other category, in line with the general category size principle described above. See Supplementary methods for a detailed description of processing of data. Also, in Tables S1, S2, S3 and S4 (Supplementary) the final categories for each regression model are presented.
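The category size principle can be expressed compactly in code. The following is a minimal sketch, assuming one categorical value per control group; the function name, the threshold constant and the example data are illustrative and do not reproduce the actual extraction data.

```python
from collections import Counter

MIN_GROUPS = 5  # categories with fewer control groups than this are pooled


def pool_small_categories(values, min_groups=MIN_GROUPS, other_label="Other"):
    """Replace categories represented by fewer than min_groups entries."""
    counts = Counter(values)
    return [v if counts[v] >= min_groups else other_label for v in values]


# Hypothetical strain labels, one per control group (illustrative counts only).
strains = (
    ["C57BL6"] * 12
    + ["Swiss"] * 6
    + ["129"] * 5
    + ["BALB/c"] * 3
    + ["FVB"] * 2
)

pooled = pool_small_categories(strains)
pooled_counts = Counter(pooled)

for category, n in pooled_counts.most_common():
    print(f"{category}: {n} control groups")
```

Because each regression model contains a different subset of control groups, the pooling has to be rerun on the subset used in that model, which is why a category can be retained in one model and merged into Other in another.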

Excluded variables

The following variables were originally intended to be incorporated into the model, but since none or very few articles reported these data they had to be omitted: Diseases, Intubation, EEG supervision, Postoperative antibiotics, Filament tip diameter, Filament coating length and Exclusion rate.

Statistics

As described above, eight main hypotheses were stated a priori:

  1. Middle cerebral artery occlusion duration affects (A) infarct size variability and (B) mortality.

  2. Type of focal cerebral ischemia procedure affects (A) infarct size variability and (B) mortality.

  3. Mouse strain affects (A) infarct size variability and (B) mortality.

  4. In studies using the intraluminal filament method, the type of filament affects (A) infarct size variability and (B) mortality.

Since large multiple regression models may suggest a wide range of unexpected associations between variables, a limited set of predefined hypotheses was established to lower the risk of finding falsely significant results due to multiple comparisons (type I errors). Findings not related to these were interpreted with caution and considered merely hypothesis-generating. Due to the risk of type II errors, corrections for multiple comparisons were not performed.

All categories were dummy-converted before analysis (Table 1). For binary variables, lack of a specific methodological factor, i.e. [No], was considered the reference category, whereas the most common category was chosen as baseline for variables with more than two categories. The data were analyzed using weighted multiple linear regression in two steps. First, a backward exclusion procedure identified factors that contributed significantly to the model and removed the rest. Subsequently, an enter model was performed, in which the significant factors identified were manually complemented with the remaining dummy variables of the same factors that had been excluded in the previous step (presented in Tables S1, S2, S3 and S4). Weighting of cases was performed according to the number of animals in each control group; hence, groups with more animals had a larger impact on the statistical model than groups with few animals. Based on the hypotheses, four regression models (one for hypotheses 1A, 2A and 3A; one for hypotheses 1B, 2B and 3B; one for hypothesis 4A and one for hypothesis 4B) were built to test the combined effects of all factors on the two separate outcome measures, Infarct size coefficient of variation and Mortality rate. In this way, when investigating one of the specific hypotheses, the model controlled for the other predictor variables. The models passed residual checks and multicollinearity tests. All statistical calculations were performed in SPSS (Version 23, IBM Corporation, Armonk, NY, USA). P-values <0.05 were considered significant. For results from the meta-analysis, 95% confidence intervals were provided; otherwise data were presented as mean ± standard deviation.
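The dummy conversion and case weighting described above can be illustrated with a small self-contained sketch. It is not the SPSS procedure used in the study and omits the backward exclusion step; it only shows dummy coding with the most common category as reference, followed by a weighted least-squares fit in which each control group is weighted by its number of animals. All data and function names are hypothetical.

```python
from collections import Counter


def dummy_code(categories):
    """Dummy-code one categorical variable.

    The most common category becomes the reference (all dummies zero).
    Returns the reference, the dummy levels and the design rows, where
    each row starts with 1.0 for the intercept.
    """
    reference = Counter(categories).most_common(1)[0][0]
    levels = sorted(set(categories) - {reference})
    rows = [[1.0] + [1.0 if c == lvl else 0.0 for lvl in levels]
            for c in categories]
    return reference, levels, rows


def weighted_least_squares(rows, y, weights):
    """Solve the weighted normal equations (X'WX) b = X'Wy.

    Uses Gaussian elimination with partial pivoting on a small system.
    """
    p = len(rows[0])
    # Augmented matrix [X'WX | X'Wy].
    a = [[sum(w * r[i] * r[j] for r, w in zip(rows, weights)) for j in range(p)]
         + [sum(w * r[i] * yi for r, yi, w in zip(rows, y, weights))]
         for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda idx: abs(a[idx][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(col + 1, p):
            factor = a[row][col] / a[col][col]
            for k in range(col, p + 1):
                a[row][k] -= factor * a[col][k]
    coef = [0.0] * p
    for i in reversed(range(p)):
        tail = sum(a[i][k] * coef[k] for k in range(i + 1, p))
        coef[i] = (a[i][p] - tail) / a[i][i]
    return coef


# Hypothetical control groups (illustrative values only).
strain = ["C57BL6", "C57BL6", "C57BL6", "Swiss", "Swiss", "129"]
infarct_cv = [30.0, 26.0, 28.0, 22.0, 24.0, 45.0]  # CV in percent
n_animals = [8, 10, 6, 8, 8, 7]                    # case weights

reference, levels, design = dummy_code(strain)
coef = weighted_least_squares(design, infarct_cv, n_animals)

print(f"Reference category: {reference}, intercept = {coef[0]:.2f}")
for lvl, b in zip(levels, coef[1:]):
    print(f"{lvl}: change in CV vs reference = {b:+.2f} percent units")
```

With a single categorical predictor, each coefficient is the difference between that category's animal-weighted mean and the reference category's animal-weighted mean, which corresponds to how the bars in Figs 2 and 3 are read: a change in absolute percent units relative to the reference. In the full models, further dummy-coded factors would be appended as additional columns so that each coefficient is adjusted for the others.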

Additional Information

How to cite this article: Ingberg, E. et al. Method parameters' impact on mortality and variability in mouse stroke experiments: a meta-analysis. Sci. Rep. 6, 21086; doi: 10.1038/srep21086 (2016).