Survey of water proton longitudinal relaxation in liver in vivo

Objective To determine the variability, and preferred values, for normal liver longitudinal water proton relaxation rate R1 in the published literature.
Methods Values of mean R1 and between-subject variance were obtained from literature searching. Weighted means were fitted to a heuristic and to a model.
Results After exclusions, 116 publications (143 studies) remained, representing apparently normal liver in 3392 humans, 99 mice and 249 rats. Seventeen field strengths were included between 0.04 T and 9.4 T. Older studies tended to report higher between-subject coefficients of variation (CoV), but for studies published since 1992, the median between-subject CoV was 7.4%, and in half of those studies, measured R1 deviated from model by 8.0% or less.
Discussion The within-study between-subject CoV incorporates repeatability error and true between-subject variation. Between-study variation also incorporates between-population variation, together with bias from interactions between methodology and physiology. While quantitative relaxometry ultimately requires validation with phantoms and analysis of propagation of errors, this survey allows investigators to compare their own R1 and variability values with the range of existing literature.
Supplementary Information The online version contains supplementary material available at 10.1007/s10334-021-00928-x.


Introduction
The liver longitudinal water proton relaxation rate R1 is important for several reasons. Native R1 is a biomarker of liver pathology [1,2]. Also, other liver biomarkers are secondarily derived from R1 measurements: for example, the increase in R1 post-gadoxetate is a biomarker of hepatocyte function [3,4]; extracellular volume is derived by comparing R1 pre- and post-contrast [5]; and baseline R1 is required for rate constants in dynamic contrast-enhanced MR [6], for tissue oxygen tension in oxygen-enhanced MR [7], and for relaxivity measurements in contrast agent research [8].
Measurements of R1 in individual livers or liver regions suffer from both systematic errors and random errors [9]. Systematic errors (bias) arise because measurements are imperfectly performed. Other systematic deviations occur because different methods, even when perfectly performed, yield R1 values with different dependences on liver composition and physiology. Random (repeatability) errors arise from physiologic and instrument noise, and can be high, particularly when regions-of-interest are small. In addition, even in the absence of bias and noise, there are, in each study, genuine between-subject differences in R1 due to between-subject variation in physiology or subclinical pathology.
To mitigate the effects of random error in establishing a "normal" or "baseline" liver R1, investigators sometimes employ a "compromise" R1, averaged from all subjects in their study. This likely reduces the "noise" variance, but introduces other errors by ignoring true between-subject variation. Other investigators may obtain R1 from literature reports, although this will introduce additional bias if different measurement methods had been used, or different populations had been studied.

The aim of this study was to survey values, and variabilities, of normal liver R1 from the published literature. This would give investigators an indication of whether the liver R1 or T1 values and variabilities they measure are broadly consistent with, or discordant from, the prior literature.

Literature searching
Literature was searched manually using "Ovid Medline" (www.ovid.com) for "magnetic resonance imaging" AND "liver" AND "relaxation". Additional literature reports were retrieved from citations, supplemented by a more intensive search for data with B0 = 4.7 T, 7 T, 9.4 T, 11.7 T, 14.1 T or 21.1 T (see supplementary material 1 for further details). Liberal inclusion criteria were employed: any report, in any language, which claimed to measure liver R1 or T1 was included, irrespective of methodology or study design. Studies where B0 was unclear, or where liver R1 or T1 was measured but not reported, were necessarily excluded. Studies using Look-Locker methods were included if they reported T1 or R1, but excluded if they reported only an apparent T1*. Human and rodent subjects were included if they were normal controls of any age, if the study reported normal parts of livers with focal disease, or if they were patients in whom no liver abnormality had been found. Studies of definitely pathological liver, suspected duplicates, and ex vivo studies were excluded.

Analysis
The mean and variance of R1 across all subjects in each study were estimated from the publications, with the coefficient of variation given by $\mathrm{CoV} = \sqrt{\mathrm{variance}}/\mathrm{mean}$. Where measurements were made on the same subjects using the same method (repeatability), the weighted mean ± SD was used; however, where measurements were made on the same subjects using different methods (e.g., different field strengths), the measurements were treated as if from two different studies. Any R1 measurement method was allowed, as long as T1 (s) or R1 (s⁻¹) was reported. Where T1 ± SD was reported, a point estimate of R1 was taken as T1⁻¹ and the between-subject variance in R1 was estimated from the reported SD using Eq. 1 (see supplementary material 2). In a few cases, the between-subject variance in R1 was estimated from a bar or scatterplot depicted in the publication, or from the range rule [10].
To aggregate the data, individual studies were weighted by the inverse of their between-subject variance in R1. Studies with N = 1, or where a variance could not be extracted, were included in Figs. 1 and 2, but their R1 was assigned zero weight in the fits. In addition, a method to account for the well-known B0-dependence of liver R1 [11-15] was needed. Two methods of representing this B0 dependence were used: a heuristic log-log relationship, and a biophysical power-law model developed by Diakova et al. [12]. R1 was fitted to B0 using the weighted non-linear least squares function nls() in R [16] (see supplementary material 3). The fitted parameters in the heuristic (Eq. 2) were M and C. The fitted parameters in the model (Eq. 3) were A and B; in Eq. 3, $R_{1,\infty}$ is the high-frequency asymptote, i.e., the extreme narrowing condition, set here to 0.213 s⁻¹ at 310 K [17]; $\tau_D$ is the translational correlation time from Diakova et al. [12], adjusted for temperature to 1.43 × 10⁻¹¹ s; $k = -0.6$, also from Diakova et al. [12]; and $\omega = 2\pi \times 42.58 \times 10^{6} \times B_0$ s⁻¹.
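As a hedged illustration of the bookkeeping described above, the following sketch converts a reported T1 ± SD into R1 with an approximate between-subject variance, and then pools studies with inverse-variance weights. The variance conversion uses first-order error propagation, var(R1) ≈ var(T1)/T1⁴, which is an assumption made here; the expression actually used in the survey is given in its supplementary material 2. The function names and the example numbers are hypothetical.

```python
import math

def r1_from_t1(t1_mean, t1_sd):
    """Convert T1 +/- SD (s) to R1 (1/s) and an approximate variance of R1.

    Assumption: first-order (delta-method) propagation,
    var(R1) ~ var(T1) / T1**4. This is not taken from the survey itself.
    """
    r1 = 1.0 / t1_mean
    var_r1 = (t1_sd ** 2) / (t1_mean ** 4)
    return r1, var_r1

def cov(mean, variance):
    """Coefficient of variation: sqrt(variance) / mean."""
    return math.sqrt(variance) / mean

def weighted_mean(values, variances):
    """Inverse-variance weighted mean of study-level values."""
    weights = [1.0 / v for v in variances]
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)

# Hypothetical studies at a single field strength: (mean T1 in s, SD in s).
studies = [(0.60, 0.05), (0.55, 0.03), (0.62, 0.08)]
converted = [r1_from_t1(mean, sd) for mean, sd in studies]
r1_values = [r1 for r1, _ in converted]
r1_vars = [var for _, var in converted]

pooled_r1 = weighted_mean(r1_values, r1_vars)
print(f"pooled R1 = {pooled_r1:.3f} 1/s")
for r1, var in converted:
    print(f"R1 = {r1:.3f} 1/s, CoV = {100 * cov(r1, var):.1f}%")
```

Under this first-order approximation the CoV of R1 equals the CoV of T1, so a study reporting T1 with a 10% CoV is treated as having a 10% CoV in R1.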
In the summaries, lower (LQ) and upper (UQ) quartiles, and medians, are reported. For exploratory fits using other weightings, see Supplementary Material 4.
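The heuristic fit described in this section can be sketched as a weighted straight-line fit in log-log space. This is a simplification under stated assumptions: it assumes base-10 logarithms and the form log10(R1) = M·log10(B0) + C, and it uses closed-form weighted linear least squares rather than the non-linear nls() fit used in the survey. The data points are synthetic, and the weights and the function name are hypothetical.

```python
import math

def fit_loglog(b0, r1, weights):
    """Weighted least-squares fit of log10(R1) = M * log10(B0) + C.

    Assumption: base-10 logs and a linear fit in log space; the survey
    itself used weighted non-linear least squares (nls() in R).
    """
    x = [math.log10(b) for b in b0]
    y = [math.log10(r) for r in r1]
    sw = sum(weights)
    xbar = sum(w * xi for w, xi in zip(weights, x)) / sw
    ybar = sum(w * yi for w, yi in zip(weights, y)) / sw
    sxx = sum(w * (xi - xbar) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - xbar) * (yi - ybar)
              for w, xi, yi in zip(weights, x, y))
    m = sxy / sxx          # slope in log-log space
    c = ybar - m * xbar    # intercept, i.e. log10(R1) at B0 = 1 T
    return m, c

# Synthetic, noise-free data generated from M = -0.36 and C = 0.30.
b0 = [0.5, 1.5, 3.0, 7.0]
r1 = [10 ** (0.30 - 0.36 * math.log10(b)) for b in b0]
weights = [1.0, 4.0, 4.0, 1.0]  # illustrative inverse-variance weights

m_fit, c_fit = fit_loglog(b0, r1, weights)
print(f"M = {m_fit:.4f}, C = {c_fit:.4f}")
```

A study with N = 1 or with no extractable variance would simply be given a weight of zero, which removes it from the sums without removing it from a plot.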

Results
Some studies reported that they suppressed fat, and/or corrected for iron-induced T1-shortening; some reported motion suppression, registration, triggering, gating or breath-hold; some reported B1 correction or phantom-based validation. Some studies analysed quite small regions of interest, often avoiding blood vessels and bile ducts; others included most or all of the liver. Seventeen field strengths were included between 0.04 T and 9.4 T. No values were found in reports using B0 > 9.4 T: one report of T1* = 1.0 ± 0.1 s at 14.1 T was excluded [127]. Figures 1 and 2 show plots of R1 against B0, in which R1 shows the expected decrease with increasing field; Table 1 gives values for the most important field strengths. The fit to Eq. 2 gave M = −0.3611 ± 0.0115 and C = 0.2956 ± 0.0073. The fit to Eq. 3 gave A = (8.663 ± 0.681) × 10⁴ and B = (1.294 ± 0.082) × 10⁹. An exploratory attempt at a three-parameter fit to Eq. 3 (i.e., to A, B, and $R_{1,\infty}$) failed to provide evidence for $R_{1,\infty}$ > 0 (supplementary material 4). When data were subgrouped by species or by method, no evidence was found that the subgroup R1 values deviated systematically from Eq. 3 (supplementary material 6). Across all studies, the median between-subject CoV was 9.1% (LQ 5.9%, UQ 16.5%, rms 17.0%). There was, however, a tendency for early studies to report high between-subject CoV (Fig. 3 and supplementary material 7): no study published after 1992 had CoV ≥ 20%, and for post-1992 studies the median between-subject CoV was 7.4% (LQ 5.6%, UQ 11.0%, rms 9.6%). In half of those studies, the measured R1 deviated from Eq. 3 by 8.0% or less (LQ 2.8%, UQ 16.6%).
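For orientation, the heuristic fit can be evaluated at a field strength of interest. The sketch below assumes that Eq. 2 has the form log10(R1) = M·log10(B0) + C, with B0 in tesla and R1 in s⁻¹; the base of the logarithm is an assumption, because Eq. 2 is not reproduced in this text. M and C are the fitted values quoted above, and the function names are hypothetical.

```python
import math

# Fitted heuristic parameters quoted in the text (fit to Eq. 2).
M = -0.3611
C = 0.2956

def heuristic_r1(b0_tesla):
    """R1 (1/s) from the heuristic, ASSUMING log10(R1) = M*log10(B0) + C."""
    return 10 ** (M * math.log10(b0_tesla) + C)

def percent_deviation(r1_measured, b0_tesla):
    """Signed deviation of a measured R1 from the heuristic, in percent."""
    fit = heuristic_r1(b0_tesla)
    return 100.0 * (r1_measured - fit) / fit

for b0 in (1.5, 3.0):
    r1 = heuristic_r1(b0)
    print(f"B0 = {b0} T: R1 = {r1:.2f} 1/s, T1 = {1000 / r1:.0f} ms")
```

Under these assumptions the heuristic gives an R1 of about 1.7 s⁻¹ at 1.5 T and about 1.3 s⁻¹ at 3 T. The text compares measurements with the model (Eq. 3) rather than with the heuristic, so `percent_deviation` is only a stand-in for that comparison.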
At each field strength, there was considerable variation in R1 between studies: the between-study CoV was 16% for post-1992 studies. Six publications [2,37,98,119,128,129] also reported liver R1 repeatability (same subject, different scan, same measurement conditions): the rms CoV was 1.9%. These CoVs allowed a crude estimate (supplementary material 8) of the relative size of the three main variance components: repeatability variance contributed ~1%; within-study between-subject variance contributed ~25%; and between-study variance contributed ~74%.
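The quoted split can be reproduced approximately with simple arithmetic on squared CoVs. The sketch below is one reading, not necessarily the calculation in supplementary material 8. It assumes that the within-study between-subject CoV (9.6% rms for post-1992 studies) already contains the repeatability component, and that the total variance is the sum of the squared within-study and between-study CoVs. The function name is hypothetical.

```python
def variance_components(cov_repeat, cov_within, cov_between_study):
    """Split total relative variance into three shares.

    Assumptions (not confirmed by the source): the within-study CoV
    includes repeatability, and total = within**2 + between_study**2.
    """
    v_rep = cov_repeat ** 2
    v_within = cov_within ** 2
    v_study = cov_between_study ** 2
    total = v_within + v_study
    return {
        "repeatability": v_rep / total,
        "between_subject": (v_within - v_rep) / total,
        "between_study": v_study / total,
    }

# CoVs in percent, as quoted in the text for post-1992 studies.
shares = variance_components(1.9, 9.6, 16.0)
for name, share in shares.items():
    print(f"{name}: {100 * share:.0f}%")
```

With the quoted CoVs this gives roughly 1%, 25% and 74%, matching the estimates in the text.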

Discussion
In liver, as in pure water, both intramolecular and intermolecular water ¹H-¹H dipolar relaxation contribute to R1. Specific additional contributors to water ¹H R1 in liver arise from ¹H-¹H dipolar relaxation between water and other molecules, and ¹H-electron dipolar relaxation between water and various iron- or copper-containing substances or dioxygen. These ¹H-containing and unpaired-electron-containing substances differ in concentration between subjects. The liver ¹H resonance arises mostly from tissue water in hepatocytes. Other contributions come from water in other intracellular compartments (e.g., Kupffer cells, erythrocytes), and in extracellular compartments (e.g., bile, plasma, space of Disse). Signal from triglyceride and inflowing blood may contribute, depending on the sequence used. Macromolecules contribute to the signal, notably collagen and glycogen, which have different concentrations in different subjects. These factors likely account for some of the variation between subjects and between studies.
Fits from the heuristic and from the model were very similar. The main difference is that the heuristic forces R1 to zero at infinite field, while the model forces R1 to asymptote in the extreme narrowing condition. This difference might become important at fields above 7 T (Fig. 1). In this study, following Diakova et al. [12], the asymptote $R_{1,\infty}$ was fixed at 1/4.7 s⁻¹, equal to the R1 of pure deoxygenated water at 310 K at high field [17]: a slightly higher value would be more appropriate if R1 values from liver water and pure water do not converge as illustrated in Fig. 1.
The relative magnitude of the major variance components was estimated. This is very crude, and given the heterogeneity and variable quality of the raw data, should be considered a rough guide only. The within-study between-subject CoV reflects not only repeatability error (~1% of the variance), but also the expected between-subject variation (~25% of the variance). Between-study variation (~74% of the variance) also includes between-population variation, together with bias from interactions between each study's measurement method and its livers' variation in flow, motion, fat, oedema, collagen, glycogen and iron. R1 may also change after a meal [89], during the menstrual cycle [25] or with drug treatment [25].
The literature survey was not fully PRISMA-compliant [130] and is unlikely to be complete. Studies explicitly of liver R1 or T1 as a biomarker are readily retrieved, because appropriate keywords are generally used in the title and abstract. However, for studies where liver R1 or T1 measurement is incidental to another objective, for example extracellular volume, relaxivity, or dynamic contrast-enhanced studies, suitable keywords may not have been included.
Fig. 3 Within-study between-subject coefficient of variation as a function of year of publication
There is no single "correct" value for any liver's ¹H R1. R1 may vary spatially across the liver [60, 119]. Water ¹H R1 is multiexponential, particularly with sequences where macromolecule-associated fast-relaxing water contributes to the measurement. Other substances in the liver may also contribute to the ¹H signal, such as glycogen [87] or triglyceride [76,131]. Inflowing blood [110,132], physiologic motion [71], magnetization transfer, and iron affect the measured R1 in ways which depend both on the sequence and on the analysis employed. There may be systematic differences in R1 between fat-suppressed and non-fat-suppressed acquisitions; between 2D acquisitions, which are more vulnerable to inflow effects, and 3D acquisitions; between breath-hold or gated acquisitions and free-breathing acquisitions; and so on. Some investigators advocate the use of a "corrected" T1 to avoid bias caused by the relaxivity of iron-containing substances [65]. Because of these biases in the literature, studies which deviate from these survey data should not immediately be considered "incorrect", but if large deviations are observed, then an explanation on methodological or physiological grounds should be sought.
There are some other limitations. While some publications reported carefully designed and conducted biomarker validation studies, in other publications, the precise value of T1 was only of incidental interest and possibly acquired with less care. However, in this survey, the study design and objectives were not incorporated into the weightings. Most studies did not report validation of their liver R1 by means of a phantom, so accuracy is unknown. It was difficult to explore the effect of methodology on R1, because some studies used methodology which was poorly described or did not appear robust, and because of correlation between field strength and methodology (old studies used old methodology and lower fields). Likewise, there was correlation between field strength and species (humans at low-medium fields, rats at medium-high fields and mice at high fields), so it was difficult to compare between species.

Conclusion
Quantitative relaxometry requires validation with phantoms and analysis of propagation of errors. However, it is also good scientific practice to compare one's own findings with prior literature. An investigator who finds their average liver R1 in normal liver to be within 8% of the fit to Eq. 3, with between-subject CoV < 8%, can conclude that their measurements are in agreement with the majority of the literature: for measurements far outside these limits, a physiological or methodological explanation should be sought.
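The rule of thumb above can be written as a small check. This is a sketch under stated assumptions: the caller supplies the fitted R1 for their own field strength (for example from Eq. 3 or Table 1, neither of which is reproduced here), and the function name and the example values are hypothetical. The 8% thresholds are those quoted in the text.

```python
import math

def consistent_with_survey(r1_values, r1_fit, max_dev=0.08, max_cov=0.08):
    """Compare a cohort's R1 values (1/s) with a literature fit value.

    r1_fit is supplied by the caller; it is NOT computed here.
    Returns (agrees, fractional deviation of the mean, between-subject CoV).
    """
    n = len(r1_values)
    mean = sum(r1_values) / n
    # Sample variance (n - 1 denominator) across subjects.
    variance = sum((r - mean) ** 2 for r in r1_values) / (n - 1)
    cov = math.sqrt(variance) / mean
    deviation = abs(mean - r1_fit) / r1_fit
    agrees = deviation <= max_dev and cov < max_cov
    return agrees, deviation, cov

# Hypothetical cohort of five normal livers and a hypothetical fit value.
cohort = [1.65, 1.70, 1.75, 1.72, 1.68]
agrees, deviation, cov = consistent_with_survey(cohort, r1_fit=1.70)
print(f"agrees = {agrees}, deviation = {100 * deviation:.1f}%, "
      f"CoV = {100 * cov:.1f}%")
```

A result of `False` does not mean the measurement is wrong; as the text notes, it is a prompt to look for a physiological or methodological explanation.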

Acknowledgements
The research leading to these results received funding from the Innovative Medicines Initiative 2 Joint Undertaking under grant agreement No 116106 (IB4SD-TRISTAN). This Joint Undertaking receives support from the European Union's Horizon 2020 research and innovation programme and EFPIA.
Author contributions Waterton, JC. Study conception and design, acquisition of data, analysis and interpretation of data, drafting of manuscript and critical revision.

Declarations
Conflict of interest John Waterton holds stock in Quantitative Imaging Ltd and is a Director of, and has received compensation from, Bioxydyn Ltd, a for-profit company engaged in the discovery and development of MR biomarkers and the provision of imaging biomarker services.

Research involving human and animal participants Not applicable, as this is a survey of previously published research.
Informed consent Not applicable, as this is a survey of previously published research.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.