The heart rate method for estimating oxygen uptake: analyses of reproducibility using a range of heart rates from commuter walking

Background The heart rate method, based on the linear relation between heart rate and oxygen uptake, is potentially valuable to monitor intensity levels of physical activities. However, this depends not least on its reproducibility under standard conditions. This study aims, therefore, to evaluate the reproducibility of the heart rate method in the laboratory using a range of heart rates associated with walking commuting. Methods On two different days, heart rate and oxygen uptake measurements were made during three submaximal (model 1) and a maximal exercise intensity (model 2) on a cycle ergometer in the laboratory. 14 habitual walking commuters participated. The reproducibility, based on the regression equations from test and retest and using three levels of heart rate from the walking commuting, was analyzed. Differences between the two models were also analyzed. Results For both models, there were no significant differences between test and retest in the constituents of the regression equations (y intercept, slope and r value). Neither were there any systematic differences in estimated absolute levels of VO2 between test and retest for either model. However, some rather large individual differences were seen in both models. Furthermore, no significant differences were seen between the two models in slopes, intercepts and r values of the regression equations or in the estimated VO2. Conclusion The heart rate method shows good reproducibility on the group level in estimating oxygen consumption from heart rate–oxygen uptake relations in the laboratory, and based on three levels of heart rate which are representative for walking commuting.


Introduction
To monitor metabolic demands and physiological work intensities of physical activities in free-living field conditions is of great value in both physical and health education, promotion, surveillance and research. For that purpose a number of small, lightweight and portable instruments for indirect calorimetric measurements have been developed. However, they are costly, technically complicated, and can be sensitive to ambient conditions (MacFarlane 2017; Salier Eriksson et al. 2012; Schantz et al. 2018), which makes them difficult to use on a large scale in research as well as in educational contexts. Furthermore, relevant methodological evaluations of them in laboratory (cf. Rosdahl et al. 2010) or in field conditions (Salier Eriksson et al. 2012; Schantz et al. 2018) are rare.
Communicated by Peter Krustrup.
This motivates a renewed interest in the heart rate method (HR method). It is based on a linear relationship between heart rate (HR) and work rate/oxygen uptake (VO 2 ) during exercise, as described during the first half of the twentieth century (Boothby 1915; Krogh and Lindhard 1917; Hohwü Christensen 1931; Berggren and Hohwü Christensen 1950). HR recordings from various physical activities have since then been used in numerous studies as a basis for interpreting energy requirements and exercise intensities in humans (e.g., Bradfield 1971; Åstrand 1971; cf. Montoye et al. 1996; Achten and Jeukendrup 2003; Shephard and Aoyagi 2012) as well as in animals (cf. Green 2011). The value of such measurements is greater if individual HR-VO 2 relations are established (cf. Montoye et al. 1996, p. 103), which today is facilitated by portable heart rate recorders and automated stationary metabolic measurement devices. Furthermore, the relation between standardized work rates on ergometer cycles and VO 2 can, due to a small interindividual variability in mechanical efficiency (for 2/3 of both male and female subjects within 6%, according to Åstrand and Ryhming 1954), be used as a substitute for measuring VO 2 (cf. Åstrand and Ryhming 1954; Åstrand 1960), provided that the effect of body weight on the energy demands at standardized work rates is taken into account (Berry et al. 1993; Lafortuna et al. 2008; Björkman 2017). That enables the heart rate method to also be used for purposes such as health education and promotion in which the exact levels of VO 2 are not necessary to establish. In that way, different physical activities can be related on an individual basis in terms of, e.g., energy demands (kilocalories/-joules) and intensity levels (metabolic equivalents of task), and interpreted in relation to dose of physical activity and effects on, e.g., health outcomes (cf. Paffenbarger et al. 1986; Hu et al. 1999).
However, the mentioned practice of using a method is one thing, validity and reproducibility is another. Already Berggren and Hohwü Christensen (1950) stated that the HR method must be used "with great care" since the HR "can vary independent of metabolic rate." There are a number of issues related to validity of the HR method, e.g., the external validity of the HR-VO 2 relations from laboratory to field settings, and to various types of physical activities, as well as with different durations and ambient conditions that need to be studied in their own rights. Here we instead focus on the fundamental need of evaluating the reproducibility of the HR method under controlled laboratory conditions and to, in relation to previous studies, further the methodological approaches used.
Studies have indicated that the HR response to a repeated standardized cycle ergometer work rate may sometimes be stable and sometimes vary (cf. Montoye et al. 1996, p. 101).
One reason for non-stability is a habituation effect of varying magnitude, but leading to a lower pulse rate at a given submaximal work rate. To our knowledge, this is described for the first time by Per-Olof Åstrand in his doctoral thesis (Åstrand 1952, p. 20), and the direction of this effect was later clarified by him in an interview (Eriksson and Larsson 2001). As a consequence, Åstrand did not make use of his first test results for the thesis, and neither was the Åstrand-Ryhming ergometer cycle test (Åstrand and Ryhming 1954) based on the first tests, but on the result of the second test (Eriksson and Larsson 2001, p. 17). The habituation effect has later been stated by Åstrand (1976), and noted by Ekblom-Bak et al. (2014). Another reason for instability in the HR response is a non-systematic day to day variability (Berggren and Hohwü Christensen 1950;cf. Montoye et al. 1996;Achten and Jeukendrup 2003). Whereas habituation effects can be handled through pretest trials, a day to day variability is more difficult to circumvent, and can jeopardize the reproducibility of the HR-VO 2 relation under controlled laboratory conditions. In such case, the extent to which we can rely on measurements in the laboratory for interpretation in field condition will indeed also be hampered.
HR-VO 2 relations during physical activity in the laboratory can be established through multiple submaximal and maximal work rates, and the pairs of HR and VO 2 data used to calculate a linear regression equation. It is thereby relevant to evaluate the reproducibility of the HR-VO 2 method on the basis of the equations, as well as the outcomes of them using different levels of HR to estimate VO 2 . Surprisingly enough, such evaluations in humans have, to our knowledge, only been the focus of two studies (Christensen et al. 1983; McCrory et al. 1997). Both of them used a single HR level for their evaluations of the outcomes. One of the studies was dominated by patients with different clinical disorders (Christensen et al. 1983). In it, the HR-VO 2 relation was established from rest to low and intermediate work rates of walking and ergometer cycling. A great variability in the outcomes, based on rather low heart rate values from a 24-h registration, led the authors to conclude that the "applied procedure seems unsuitable for metabolic studies in individual patients who engage in ordinary daily activities with low energy expenditure" (Christensen et al. 1983). McCrory et al. (1997) studied the reproducibility in healthy subjects. Two different HR-VO 2 relations were established based on measurements from resting to walking. Their single-point HR evaluation was also based on HR recordings from a normal day (ca. 15 h). In the HR-VO 2 relation which was based solely on walking, a good reproducibility was noted on the group level, whereas a certain variability was noted in the individual levels.
The conflicting results, and evaluations based on only one, and rather low, level of HR, together with the paucity of critical evaluations of the HR-VO 2 method, prompted us to scrutinize these matters further. Methodological issues that need to be addressed relate to the degree of reproducibility possibly varying within one and the same study depending on the levels of HR being used for the evaluations (Fig. 1). If, for example, regression equation slopes from test and retest cross each other, an excellent reproducibility will be attained at that point. However, on both sides of it, the absolute differences in estimated VO 2 will increase, but in different directions. A great number of other potential interrelations between dual regression slopes and y intercepts can produce a substantial variation in the test-retest variability. The magnitude of those differences may, however, be of no importance if they occur outside of the range of HR of interest. The reproducibility of VO 2 estimations, based on HR-VO 2 relations, needs, therefore, to be studied at several levels of HR that are distributed along a relevant range of HR.
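The crossing-lines argument can be made concrete with a small numerical sketch. The intercepts and slopes below are hypothetical values chosen for illustration, not data from this study; the function names are likewise our own.

```python
def crossing_hr(a_test, b_test, a_retest, b_retest):
    """HR at which the test and retest lines give the same VO2.

    Each line is VO2 = a + b * HR. Returns None for parallel lines.
    """
    if b_test == b_retest:
        return None
    return (a_retest - a_test) / (b_test - b_retest)


def vo2_difference(hr, a_test, b_test, a_retest, b_retest):
    """Retest minus test estimate of VO2 (L/min) at a given HR."""
    return (a_retest + b_retest * hr) - (a_test + b_test * hr)


# Hypothetical regression lines (not taken from the study).
test_line = (-1.0, 0.0200)     # intercept, slope
retest_line = (-1.3, 0.0225)

hr_cross = crossing_hr(*test_line, *retest_line)
for hr in (100, 120, 140):
    diff = vo2_difference(hr, *test_line, *retest_line)
    print(f"HR {hr}: retest - test = {diff:+.3f} L/min")
```

With these assumed lines the difference is zero at the crossing point and changes sign on either side of it, which is why a single evaluation HR can give a misleading picture of reproducibility.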
Another factor that is likely to determine the degree of reproducibility is the number and span of work rates that are used to establish the HR-VO 2 relations. To enable systematic studies of these matters, it is, therefore, important to specify the levels of HR used in terms of both absolute levels and percentages of maximal heart rate (Londeree and Ames 1976) as well as the heart rate reserve (HRR) (Karvonen et al. 1957;Swain and Leutholtz 1997). The corresponding levels of VO 2 and their percentage of the maximal oxygen uptake are also of value to describe (Fig. 1).
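As a minimal sketch of how an HR level can be positioned in these terms, the two relative measures can be computed as below. The function names and the example values are illustrative assumptions and do not come from the study.

```python
def percent_hr_max(hr, hr_max):
    """HR expressed as a percentage of maximal heart rate."""
    return 100.0 * hr / hr_max


def percent_hr_reserve(hr, hr_rest, hr_max):
    """HR expressed as a percentage of the heart rate reserve (HRR).

    HRR is the span between resting and maximal heart rate.
    """
    return 100.0 * (hr - hr_rest) / (hr_max - hr_rest)


# Hypothetical individual: resting HR 60, maximal HR 180, walking HR 120.
print(round(percent_hr_max(120, 180), 1))          # share of maximal HR
print(round(percent_hr_reserve(120, 60, 180), 1))  # share of HRR
```

The same HR thus maps to different relative intensities depending on which reference is used, which is the reason for reporting both.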
Walking commuting has been stated to occur at lower heart rates and relative exercise intensities than cycling (Oja et al. 1991) which, given the background above, may affect the reproducibility of the heart rate method. The aim of this study was, therefore, to evaluate day-to-day reproducibility of HR-VO 2 regression equations and the estimated oxygen uptakes based on three levels of heart rates representative for everyday walking commuting. Two HR-VO 2 relations were established and compared, one with three levels of submaximal exercise (model 1), and another which also included a maximal exercise (model 2). Model 2 has previously been used in research studies (e.g., Åstrand 1971;Schantz 1980;Schantz et al. 1983). Model 1 has been extensively used in health education at our university college, and has the advantage of not including a maximal exercise test. The HR-VO 2 relations were attained on an ergometer cycle in the laboratory in healthy and physically active middle-aged male and female walking commuters.

Participants
Approval to conduct the study was obtained from the Ethics Committee North of the Karolinska Institute at the Karolinska Hospital (Dnr 03-637), Stockholm, Sweden.

Recruitment of participants
The process of selecting participants was divided into several steps. It started with recruitment through advertisements in two large morning newspapers in Stockholm in May and June of requesting participants. The inclusion criteria required being at least 20 years old; living in the county of Stockholm (excluding the municipality of Norrtälje); walking or cycling the whole way, any distance, between home and to one's work or place of study, and actively commuting in that fashion at least once a year. Answers could be sent in cost-free by post, fax, e-mail or by phone. These advertisements resulted in 2148 people volunteering to take part.
A questionnaire (The Physically Active Commuting in Greater Stockholm Questionnaire 1; PACS Q1) was sent home to these volunteers; 2010 were returned after three reminders. The questionnaire consisted of 35 questions, but only the questions relevant to selecting our population were used in this study. These included gender, age, how physically strenuous their professional jobs were, commuting frequencies per week for each month of the year and commuting duration. The commuting distance of each individual was also used for selecting the study group. These were measured on routes drawn in maps by each respondent. The mapped route distance measuring method is described in detail in Schantz and Stigell (2009). From the answers from PACS Q1, the respondents were divided into categories based on their reported mode of either cycling or walking, or combining both modes.
Fig. 1 An illustration of how a variability in HR-VO 2 relations at test and retest can affect measures of reproducibility. Linear relations from regression equations, based on values from three submaximal work rates at test and retest, are illustrated as unbroken lines. Based on different HR and the regression equations, the estimated levels of VO 2 can be higher, equal or lower at test compared to retest (see broken lines)
Our sample was selected from the pedestrian category, i.e., those subjects who only walked to work. Other criteria were that they had ages and route distances close to the median values of the male and female pedestrians, respectively (cf. Stigell and Schantz 2015). They also rated their daily professional jobs as light or very light physically. Letters describing the physiological studies and test procedures were sent to the male and female pedestrians who fulfilled the criteria.
The recipients of the letter were first asked if their previously drawn route was still valid, or of a comparable distance time wise (comparable being defined as plus/minus 5-10 min). They then answered a health declaration concerning (1) medication and for which kind of illness, (2) if they had any palpitations, chest pain or abnormally heavy breathing during exercise, (3) if they had high blood pressure, and (4) if they had recently avoided or discontinued exercise for reasons of injury or health. The letter emphasized the right to terminate the tests at any time and without having to stipulate a reason. A signed informed consent of participation was returned.
Based on this information, individuals with invalid route distances as well as with high blood pressure or on medication that could affect normal heart rate were excluded. Anyone on medication with risks for strong side effects was also excluded. We contacted the remaining pedestrians by telephone to answer any potential questions, and to book test times. Telephone contacts continued until we had seven men and seven women who fulfilled the criteria and were willing to participate (Table 1).

Stationary metabolic gas analysis system
A stationary metabolic gas analysis system, the Oxycon Pro ® (Carefusion GmbH, Hoechberg, Germany) was used in the mixing chamber mode for all metabolic measurements in the laboratory. The software used was JLAB 4.53. In this system, the concentration of oxygen is measured by a paramagnetic analyzer and the carbon dioxide concentration by an infra-red analyzer. The expired air is sampled continuously from the mixing chamber through a Nafion tubing on the outside of the equipment that connects to a Nafion tubing on the inside of the equipment and that terminates at the analyzer inlets. Ventilation is measured through a digital volume transducer which is attached to the outlet of the mixing chamber. The equipment was switched on 30 min before data collection and calibrated before and after each test using the built-in automated procedures and according to the manufacturer's recommendations. The ambient conditions were first recorded, followed by calibration of the volume sensor and the gas analyzers. A high-precision gas of 15.00% O 2 and 6.00% CO 2 (accuracy O 2 ± 0.04% rel. and CO 2 ± 0.1% rel. Air Liquide AB, Kungsängen, Sweden) was used for calibration.
A face mask with non-rebreathing air inlet valves (Combitox, Dräger Safety, Lübeck, Germany) was used. It was carefully fitted on the subject and checked for air leakage immediately prior to the measurements by the investigator and adjusted until no leakage occurred. For several subjects, a rubber insert was taped inside the top of the mask to prevent air leakage from the bridge of the nose. A tube (inner diameter of 35 mm) attached to the mask led the expired air into the mixing chamber. The measured variables were exported to Excel for further processing.

Ergometer cycle
A manually braked pendulum ergometer cycle (828E Monark Exercise AB, Vansbro, Sweden) was used. Before each experiment, the scale was zeroed while each subject sat on the saddle with his or her feet resting on the frame between the pedals, and hands resting on the handle bars. The saddle height was adjusted so that the participant's knees were slightly flexed when the feet were on the pedals in their lowest position. The handle bars were adjusted to allow the participants to sit in an upright position. A digital metronome (DM70 Seiko S-Yard Co. Ltd, Tokyo, Japan) helped the subjects maintain the correct cadence while cycling. The work rate was controlled every minute by checking the cadence of the participant and the braking force as indicated on the pendulum scale.

Heart rate
During the resting period and the exercise protocol, HR was measured using a Polar Electro S610i Heart Rate Monitor, with a Polar Wearlink 31 transmitter (Polar Electro Oy, Kempele, Finland).

Laboratory tests, standardization procedures and rest conditions
The walking commuters were tested in the laboratory at rest, and submaximal as well as maximal work rates on two different occasions, which were completed within an average of 8.0 ± 4.7 days. Two trained investigators carried out the laboratory tests, each participant having the same investigator for each test. The participants were not able to drink during any of the tests.
The pedestrians were asked to follow the same standard procedures before each test occasion. These were (1) not to engage in any vigorous exercise for 24 h beforehand, (2) not to cycle to the laboratory, (3) to refrain from eating, drinking, smoking and taking snuff (smokeless tobacco) for at least 1 h before arrival at the laboratory, (4) not to eat a large meal at least 3 h before the tests, (5) to avoid stress and (6) to cancel the test if they had fever, an infection or a cold. The time of the day that the tests were undertaken was not standardized since it does not affect the HR-VO 2 relation during physical activity (McCrory et al. 1997). The participants wore light clothes, such as T-shirts, shorts and training shoes, so as to diminish any effect of the energy liberation from the submaximal exercises on sweating and body temperature.
On arrival at the laboratory, the participants were weighed and measured, and a check list was ticked off to determine if they had followed the standard procedures named above. A Polar heart rate monitor and a Wearlink were then placed on a wrist and around the chest, respectively. The participants rested quietly for 10 min on a treatment table. Resting heart rate, calculated from the time period between every single heart rate, was determined from the average of the 5 min between the 6th and 10th min.

Cycle ergometer exercise protocol
The participants cycled at three different work rates: 50, 100 and 150 W for the women, and 100, 150 and 200 W for the men. A cadence of 50 revolutions per minute was chosen (Åstrand 1952, p. 19). At each work rate, the participant cycled until steady state (approximately 6 min), after which the resistance was increased. The third work rate was increased to only 125 W or 175 W for women and men, respectively, if, after the second work rate, the subject's HR was higher than 150 beats per minute and their rating of perceived exertion (RPE) exceeded 15 for both legs and breathing (Borg 1998, p. 30). The HR from the Polar heart rate monitor, averaged for every 15 s, and the RPE were noted in the protocol after every minute.
Between the second and third work rates, the test person continued cycling for 1 min at a self-chosen low cadence with a resistance of 5 N. The subject was then instructed to resume the cadence of 50 rpm while the investigator slowly increased the work rate until, after 1 min, the third work rate was reached. For that purpose, resistance was increased to 50 W during the first 15 s, to 100 W the second 15 s and successively to the required work rate during the last 30 s. After the third submaximal test, the subject continued cycling for 2 min at a self-chosen low cadence at 5 N.
During the maximal exercise phase, the subjects cycled at a cadence of 80 rpm (Foss and Hallén 2004). For the first three minutes, the work rates were 60, 100, and 120 or 140 W for 1 min each. The latter alternatives depended on which third work rate the subjects had during the submaximal work: 120 W if the third submaximal work rate had been 125 W or 175 W for women and men, respectively; 140 W if it had been 150 W or 200 W for women and men, respectively. The work rate increased thereafter by 20 W every 60 s. The test continued until exhaustion. HR was noted before each increase of the resistance and also at the moment when the participant terminated the test because of exhaustion.
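The stepwise schedule of the maximal phase can be written out as a short sketch. The function name and the 1-based minute indexing are our own assumptions; the work rates follow the description above.

```python
def maximal_protocol_work_rate(minute, third_stage_watts):
    """Work rate (W) during a given minute of the maximal phase.

    minute is 1-based. third_stage_watts is 120 or 140, depending on the
    third submaximal work rate the subject had reached.
    """
    if minute < 1:
        raise ValueError("minute must be 1 or greater")
    if third_stage_watts not in (120, 140):
        raise ValueError("third stage must be 120 or 140 W")
    opening = [60, 100, third_stage_watts]
    if minute <= 3:
        return opening[minute - 1]
    # After the third minute the work rate rises by 20 W every 60 s.
    return third_stage_watts + 20 * (minute - 3)


schedule = [maximal_protocol_work_rate(m, 140) for m in range(1, 7)]
print(schedule)
```

The test itself has no fixed length, since it continues until exhaustion; the sketch only gives the work rate for whichever minute the subject has reached.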
To assess the rating of perceived exertion (RPE), the Borg scale, as mentioned before, was used (Borg 1998, p. 30). The subjects were instructed on how to use the scale before commencing the tests. They were asked to point to a number on the scale that corresponded to their feeling of exertion for breathing and in their legs, respectively, before every increase of resistance during the submaximal test and directly after the maximal test. During the maximal phase they continued until exhaustion. To ensure that each subject achieved maximal exertion, at least two of the following three criteria were met by each subject: (1) a plateau in VO 2 despite increasing exercise intensity (defined as a VO 2 increment of less than 150 ml), (2) a respiratory exchange ratio of ≥ 1.1, and (3) an RPE of ≥ 17 on the Borg scale (Borg 1970; Howley et al. 1995; Midgley et al. 2007).
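The "at least two of three" rule can be sketched as a simple check. The function and parameter names are hypothetical, and the sketch assumes that the VO 2 increment is expressed in ml per minute between consecutive work rates, which the text does not state explicitly.

```python
def reached_maximal_exertion(vo2_increment_ml, rer, rpe,
                             plateau_limit_ml=150.0,
                             rer_limit=1.1,
                             rpe_limit=17):
    """True if at least two of the three criteria are fulfilled.

    vo2_increment_ml: rise in VO2 over the last work rate increase
                      (assumed unit: ml per minute).
    rer:              respiratory exchange ratio at the end of the test.
    rpe:              rating of perceived exertion on the Borg scale.
    """
    criteria = [
        vo2_increment_ml < plateau_limit_ml,  # plateau in VO2
        rer >= rer_limit,
        rpe >= rpe_limit,
    ]
    return sum(criteria) >= 2


print(reached_maximal_exertion(100.0, 1.05, 18))  # plateau + RPE
print(reached_maximal_exertion(200.0, 1.05, 18))  # only RPE
```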
The laboratory tests were completed within an average of 18.9 ± 17.7 days (23.7 ± 21.5 days for men; 14.1 ± 12.2 days for women).

Measurements of heart rate during commuter walking
The participants commuted either to or from their workplace choosing themselves which time was most convenient. 18 of the pedestrians (nine men; nine women) were tested in the morning (start times between 07:26 h and 10:15 h) and the remaining two women were tested after work (start times 16:47 h and 16:48 h). The field trips took place in the inner urban and suburban-rural areas of Stockholm, Sweden. A detailed description of these areas can be found in Wahlgren and Schantz (2011). The majority of the pedestrians walked at least partly within the inner urban area (cf. Table 1). They were met at the designated address by one of the investigators, who checked that the pretest standardization procedures, as described above, had been followed. The participants were instructed to walk at their normal pace. HR was measured using the Polar electro heart rate recorder and the HR was averaged for every 15 s. The starting time of the walking trip was synchronized with the second investigator waiting at the destination. The participants were not able to drink during their walk commute.
On arrival at the destination, the total trip time was noted and the walker was asked to rate his or her perceived exertion for both breathing and legs. The participants were asked how many stops were made at traffic lights as well as other stops, and marked them on maps with their routes. They were also asked to confirm whether that route had been taken the whole way, and if not, any deviation from the originally marked route was added to the map.

Analytical approach and statistical analyses
For determining the resting HR, the values are based on each single time period between the heart beats, which were transformed into heart rates per minute and averaged for a 5-min period.
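A minimal sketch of this calculation is given below. It assumes that the beat-to-beat periods are available in milliseconds, which is our assumption about the recorder output; the function name is hypothetical.

```python
def resting_hr_from_intervals(beat_intervals_ms):
    """Mean of the beat-by-beat heart rates (beats per minute).

    Each interval between two heart beats is first converted into a
    heart rate per minute, and those rates are then averaged.
    """
    if not beat_intervals_ms:
        raise ValueError("no intervals supplied")
    rates = [60000.0 / interval for interval in beat_intervals_ms]
    return sum(rates) / len(rates)


# Hypothetical intervals: two beats 1000 ms apart and one 500 ms apart.
print(resting_hr_from_intervals([1000, 1000, 500]))
```

Note that averaging the converted rates is not identical to dividing the number of beats by the elapsed time; the sketch follows the order of operations described in the text.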
For the submaximal tests in the laboratory, the mean of the four 15-s values for VO 2 and HR for the last minute of each load was used for analysis. The values for the maximal tests were calculated by averaging the highest four 15-s consecutive values for VO 2 and HR at maximal exercise, i.e., a collection period of 60 s (Howley et al. 1995). The same corresponding values were used for both VO 2 and HR.
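The selection of maximal values can be sketched as a sliding window over the 15-s values. The sketch assumes that the window is chosen where the sum of VO 2 is highest and that HR is then averaged over the same window; this is one reading of "the same corresponding values" and is not confirmed by the text. Names and example data are hypothetical.

```python
def peak_window_means(vo2_values, hr_values, width=4):
    """Mean VO2 and HR over the consecutive window with the highest VO2.

    vo2_values and hr_values are equally long lists of 15-s values.
    width=4 corresponds to a collection period of 60 s.
    """
    if len(vo2_values) != len(hr_values):
        raise ValueError("VO2 and HR series must have equal length")
    if len(vo2_values) < width:
        raise ValueError("series shorter than the window")
    starts = range(len(vo2_values) - width + 1)
    best = max(starts, key=lambda i: sum(vo2_values[i:i + width]))
    vo2_mean = sum(vo2_values[best:best + width]) / width
    hr_mean = sum(hr_values[best:best + width]) / width
    return vo2_mean, hr_mean


# Hypothetical end-of-test data (L/min and beats per minute).
vo2 = [2.0, 2.5, 3.0, 3.2, 3.3, 3.1]
hr = [150, 160, 170, 175, 178, 177]
vo2_max, hr_at_max = peak_window_means(vo2, hr)
```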
The reproducibility of the paired individual data for VO 2 and HR between test and retest in the laboratory was calculated as absolute and relative differences, and analyzed with Wilcoxon's signed rank test and Student's paired t test as well as coefficient of variation (CV). The CV was calculated by dividing the standard deviation of the difference between the test-retest values by √2. This value (typical error) was then divided with the average of the test-retest values and multiplied by 100 (Hopkins 2000). With regard to significance, the same results were obtained with the nonparametric Wilcoxon's signed rank test and the parametric Student's paired t test. In the further statements with regard to analyses undertaken, and in "Results", we do, however, only report the values obtained from the Student's paired t test.
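The CV calculation described above can be sketched as follows, using only the Python standard library. The function name and the example values are hypothetical; taking the difference as retest minus test is an assumption that does not affect the result.

```python
import math
import statistics


def typical_error_cv(test_values, retest_values):
    """Coefficient of variation (%) based on the typical error.

    Typical error = SD of the test-retest differences divided by sqrt(2).
    CV = typical error divided by the mean of all values, times 100.
    """
    diffs = [r - t for t, r in zip(test_values, retest_values)]
    typical_error = statistics.stdev(diffs) / math.sqrt(2)
    grand_mean = statistics.mean(list(test_values) + list(retest_values))
    return 100.0 * typical_error / grand_mean


# Hypothetical VO2 values (L/min) for two subjects.
cv = typical_error_cv([2.0, 2.0], [2.0, 2.4])
print(round(cv, 2))
```

In this small example the differences are 0 and 0.4 L/min, giving a typical error of 0.2 L/min against a mean of 2.1 L/min.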
The HR-VO 2 relations based on each individual's paired VO 2 and HR from three submaximal work rates (model 1) plus a maximal work rate (model 2) at test and retest were described by linear regression analyses and correlation coefficients (r values). The absolute differences in y intercepts, slopes and r values between test and retest were evaluated with paired Student's t test for each model. The absolute values for the y intercepts, slopes and r values at test and retest were also compared between models 1 and 2 with paired Student's t test, and the 95% confidence intervals for the mean values were also calculated.
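For one individual and one test occasion, the regression constituents can be obtained with an ordinary least-squares fit, sketched here without external libraries. The function name and data points are hypothetical and chosen so that the result is easy to verify by hand.

```python
import math


def fit_hr_vo2(hr_values, vo2_values):
    """Least-squares line VO2 = intercept + slope * HR, plus Pearson r."""
    n = len(hr_values)
    mean_hr = sum(hr_values) / n
    mean_vo2 = sum(vo2_values) / n
    sxx = sum((x - mean_hr) ** 2 for x in hr_values)
    syy = sum((y - mean_vo2) ** 2 for y in vo2_values)
    sxy = sum((x - mean_hr) * (y - mean_vo2)
              for x, y in zip(hr_values, vo2_values))
    slope = sxy / sxx
    intercept = mean_vo2 - slope * mean_hr
    r = sxy / math.sqrt(sxx * syy)
    return intercept, slope, r


# Hypothetical model 1 data: three submaximal work rates.
intercept, slope, r = fit_hr_vo2([100, 130, 160], [1.0, 1.6, 2.2])
print(intercept, slope, r)
```

Model 2 would use the same function with a fourth pair of values from the maximal work rate appended to each list.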
The reproducibility of the estimated VO 2 , based on the regression equations from test and retest, and calculated on the basis of three levels of HR from each individual's walking commuting, was calculated as absolute and relative differences. They were analyzed for all individuals with Student's paired t test, the 95% confidence intervals for the mean values and coefficient of variation (CV).
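The step from regression equations to absolute and relative differences can be sketched as below. The equations and HR levels are hypothetical, and expressing the relative difference as a percentage of the test value is an assumption, since the reference value is not specified in the text.

```python
def estimate_vo2(hr, intercept, slope):
    """Estimated VO2 (L/min) from a linear HR-VO2 equation."""
    return intercept + slope * hr


def compare_occasions(hr_levels, test_eq, retest_eq):
    """Absolute and relative retest-minus-test differences per HR level.

    test_eq and retest_eq are (intercept, slope) tuples.
    Relative differences are given in percent of the test estimate
    (an assumed convention).
    """
    rows = []
    for hr in hr_levels:
        vo2_test = estimate_vo2(hr, *test_eq)
        vo2_retest = estimate_vo2(hr, *retest_eq)
        abs_diff = vo2_retest - vo2_test
        rel_diff = 100.0 * abs_diff / vo2_test
        rows.append((hr, abs_diff, rel_diff))
    return rows


# Hypothetical low, middle and high HR from one person's walking commute.
rows = compare_occasions([100, 115, 125], (-1.0, 0.02), (-0.9, 0.02))
for hr, abs_diff, rel_diff in rows:
    print(hr, round(abs_diff, 3), round(rel_diff, 1))
```

With the parallel lines assumed here the absolute difference is constant, while the relative difference shrinks as the estimated VO 2 grows.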
Whether the levels of estimated VO 2 at test and retest, as well as the differences between test and retest, were altered between models 1 and 2 was also evaluated with Student's paired t test and 95% confidence intervals for the mean values. Bland-Altman plots with 95% limits of agreement in individual absolute values of estimated VO 2 were graphically displayed (Bland and Altman 1986).
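The 95% limits of agreement of Bland and Altman are the mean difference plus and minus 1.96 standard deviations of the differences. A minimal sketch, with hypothetical data and with the difference taken as retest minus test (an assumed direction):

```python
import statistics


def limits_of_agreement(test_values, retest_values):
    """Return (lower limit, mean difference, upper limit).

    Limits are mean difference -/+ 1.96 * SD of the paired differences.
    """
    diffs = [r - t for t, r in zip(test_values, retest_values)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias - 1.96 * sd, bias, bias + 1.96 * sd


# Hypothetical estimated VO2 values (L/min) for three subjects.
lower, bias, upper = limits_of_agreement([1.0, 1.0, 1.0], [1.1, 0.9, 1.0])
print(round(lower, 3), round(bias, 3), round(upper, 3))
```

In a Bland-Altman plot these three values are drawn as horizontal lines, with each individual difference plotted against the mean of that person's two estimates.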
Statistical analyses were performed using the Statistical Package for the Social Sciences (SPSS, 17.0, Chicago, IL, USA). The Bland-Altman plots were created with Graph-Pad Prism ® 4 software package (Graph-Pad Software Inc., San Diego, CA, USA). Values are presented as mean ± SD unless otherwise stated. The significance level was set at p < 0.05 when data were used only once, and when data were used twice, a Bonferroni correction of the level of significance was done resulting in p < 0.025.

Reproducibility of repeated single measurements
There were no systematic absolute or relative differences in VO 2 and HR between the first and second measurement occasions in the laboratory (Table 2).

Positioning work rates for the HR-VO 2 relations in the laboratory
The three submaximal work rates, used in both models of HR-VO 2 regression equations, induced mean levels of HR ranging from on average 117 ± 15 to 157 ± 11 beats per minute for the males, and from 98 ± 9 to 154 ± 7 for the females (Table 2). For maximal HR and other descriptive aspects of the work rates used, see Tables 2 and 3.

Positioning the HR from the commuter walking used for estimating levels of VO 2
The mean values of the lowest, middle and highest fifth of heart rates during each participant's walking commute were used to estimate the corresponding level of VO 2 based on the HR-VO 2 regression equations. These heart rate segments were determined through ordering all heart rates from the lowest to the highest, and then dividing them into segments of 1/5 of all heart rates. It is the segments with the 20% lowest, intermediate and highest heart rates that are described in Table 4. The range of mean HR levels was between 104 ± 17 and 124 ± 20 beats per minute for the males, and from 103 ± 8 to 128 ± 10 for the females. For further details on the responses to the work rates used, see Table 4.
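The segmentation into fifths can be sketched as follows. How recordings whose length is not divisible by five were split is not described, so the integer-division boundaries below are an assumption, as are the function name and example data.

```python
def hr_fifth_means(heart_rates):
    """Mean HR of the lowest, middle and highest fifth of a recording.

    The 15-s heart rates are ordered from lowest to highest and split
    into five consecutive segments of (nearly) equal size.
    """
    ordered = sorted(heart_rates)
    n = len(ordered)
    if n < 5:
        raise ValueError("need at least five heart rate values")
    bounds = [(i * n) // 5 for i in range(6)]
    segments = [ordered[bounds[i]:bounds[i + 1]] for i in range(5)]
    means = [sum(seg) / len(seg) for seg in segments]
    return means[0], means[2], means[4]


# Hypothetical unsorted 15-s heart rates from one walking commute.
recording = [105, 100, 109, 103, 101, 108, 102, 107, 104, 106]
low_hr, middle_hr, high_hr = hr_fifth_means(recording)
print(low_hr, middle_hr, high_hr)
```

Each of the three resulting HR levels is then entered into the individual's regression equation to give an estimated VO 2 .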

Reproducibility of HR-VO 2 regression equations and estimated levels of VO 2 (model 1)
The test and retest HR-VO 2 regression equations and estimated levels of oxygen uptake from three levels of HR are presented in Tables 5 and 6. There were no indications of differences in y intercept, slope or r value in the regression equations at the retest compared to the test (Table 5). Based on calculations of all subjects, there were no systematic differences in estimated absolute levels of VO 2 between test and retest (Table 6). The relative differences between test and retest were 1.91 ± 11.1, 1.71 ± 8.36 and 1.57 ± 7.34% (all n.s.) based on estimations from the lowest to the highest levels of HR. The 95% limits of agreement for the individual variations in the differences in estimated VO 2 between test and retest varied between − 0.2775 and 0.2518 (L · min −1 ) for the low HR, − 0.2436 and 0.2122 for the middle HR, and − 0.2352 and 0.2008 for the high HR (Fig. 2).

Reproducibility of HR-VO 2 regression equations and estimated levels of VO 2 (model 2)
The test and retest HR-VO 2 regression equations and estimated levels of VO 2 from three levels of HR are presented in Tables 7 and 8. There were no significant differences between test and retest in the constituents of the regression equations (y intercept, slope and r value) (Table 7). Based on calculations of all subjects, there were no systematic differences in estimated absolute levels of VO 2 between test and retest. The relative differences between test and retest, based on estimations from three different levels of HR, were 3.12 ± 8.25, 3.13 ± 7.60 and 2.81 ± 7.47% (all n.s.) (Table 8). The 95% limits of agreement for the individual variations in the differences in estimated VO 2 between test and retest varied between − 0.1940 and 0.1428 (L · min −1 ) for the low HR, − 0.2163 and 0.1478 for the middle HR, and − 0.2437 and 0.1737 for the high HR (Fig. 3).
Table 2 Test-retest of VO 2 and HR at rest, submaximal and maximal cycle ergometer tests in the laboratory (mean ± SD) and coefficient of variation (CV). The p values are based on the paired differences in absolute values and percent differences
Percent of maximal heart rate: 56.8 ± 9.0, 63.4 ± 10.2, 68.0 ± 9.6, 57.8 ± 3.5, 64.0 ± 4.3, 72.0 ± 6.5
Table 6 The estimated levels of VO 2 based on the regression equations from

Comparisons of regression equations and estimated VO 2 between the HR-VO 2 relations in models 1 and 2
The differences between the two HR-VO 2 models in the y intercept, slope, r value as well as in the three levels of estimated VO 2 at test and retest were compared for all subjects. All differences between the models were small and non-significant. The mean absolute and relative differences varied from 0.01 ± 0.08 to 0.02 ± 0.05 L/min (all n.s.) and 0.21 ± 6.89 to 1.42 ± 3.91% (all n.s.), respectively (Tables 9 and 10).

Discussion
An important feature of this study is that we have developed a transparent framework for analyses of the reproducibility of the HR method in laboratory conditions. It is characterized by positioning all HR values used in relation to both resting and maximal HR. This relates to both the HR-VO 2 relations that were established in the laboratory, and the evaluation of them with three relevant HR levels that were obtained from walking commuting in field conditions. In this way, the relative localization of the measurement points of HRs used is clarified in a way that can be reproduced, and compared with future studies of these matters.
The main finding of this study is that there were no significant differences between test and retest in the constituents of the regression equations (y intercept, slope and r value) in model 1 and model 2. In line with that, there were no systematic differences in estimated absolute mean levels of VO2 between test and retest for either model. The relative differences between test and retest, based on estimations from three different levels of HR, were 1.91 ± 11.1, 1.71 ± 8.36 and 1.57 ± 7.34% (all n.s.) in model 1, and 3.12 ± 8.25, 3.13 ± 7.60 and 2.81 ± 7.47% (all n.s.) in model 2. However, some large individual differences were seen in both models, as indicated by the range of standard deviations for the relative differences (7.34-11.1%). Consequently, the 95% confidence intervals for the mean values of all subjects indicate variations of between about 8 and 14% for the three different estimations of relative differences in VO2 between test and retest. This spread is further illustrated by the individual differences between test and retest and the associated 95% limits of agreement, which are shown in Figs. 2 and 3.
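As an illustration of how 95% limits of agreement of the kind referred to above can be obtained, the following minimal Python sketch computes the mean difference (bias) and the limits from paired test and retest estimates. It assumes the conventional definition, mean difference ± 1.96 SD of the paired differences. The function name, the sign convention (retest minus test) and the example values are assumptions for illustration and are not taken from the study data.

```python
import numpy as np

def limits_of_agreement(test, retest):
    """Return (bias, lower, upper) for paired test-retest estimates.

    bias  : mean of the paired differences (retest - test; assumed convention)
    lower : bias - 1.96 * SD of the differences
    upper : bias + 1.96 * SD of the differences
    """
    diff = np.asarray(retest, dtype=float) - np.asarray(test, dtype=float)
    bias = diff.mean()
    sd = diff.std(ddof=1)  # sample standard deviation
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical estimated VO2 (L/min) for four participants (NOT study data)
test_vo2 = [1.20, 1.35, 1.50, 1.10]
retest_vo2 = [1.25, 1.30, 1.55, 1.15]

bias, lower, upper = limits_of_agreement(test_vo2, retest_vo2)
print(f"bias = {bias:.3f}, limits of agreement = {lower:.3f} to {upper:.3f} L/min")
```

A small bias combined with wide limits corresponds to the pattern described in the text: no systematic difference on the group level, but considerable individual variation.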
Another important finding was that there were no significant differences between models 1 and 2 in the constituents of the regression equations (y intercept, slope and r value) or in any of the outcome variables (estimated levels of oxygen uptake, or in the absolute or relative differences between test and retest).

Fig. 2 Individual levels of differences and the 95% limits of agreement between estimations of VO2 (L · min−1) calculated from fixed levels of low, middle and high HR from commuter walking as well as from repeated measurements of VO2-HR relations based on three submaximal work rates (Model 1). The y axes show absolute differences in VO2 against the mean values of the estimations from the repeated measurements on the x axes

An overall pattern of stability was discerned on the group level between test and retest in VO2 and HR, which permits the present test-retest analyses of the outcomes of the HR-VO2 regression equations. The fact that we started the measurements with 15 min of rest in a supine position, and that all subjects were very physically active (cf. Table 1), can be relevant for this outcome. It should, at the same time, be kept in mind that habituation effects in HR with repeated measurements have been noted in studies of samples from the general population (Åstrand 1952; Åstrand 1976; Ekblom Bak et al. 2014). As a safeguard, a habituating pretest, as was applied by McCrory et al. (1997), is, therefore, recommended as a standard procedure.
A non-systematic test-retest variability in HR was also noted by McCrory et al. (1997). They even controlled for sex and some individual factors, such as being an emotional person, without being able to see any such relations. Berggren and Hohwü Christensen (1950) studied the HR-VO2 relation repeatedly in one person and found a variability in the heart rate of the same order of absolute magnitude in work rates demanding between 1 and 4 L of VO2. Thus, it is possible that this variability represents an intrinsic feature of repeated HR-VO2 spot measures. In many ways, this is reasonable since VO2, according to the Fick principle, is the product of heart rate, stroke volume and the difference between arterial and mixed venous oxygen content. Thus, levels of four different variables can, in principle, vary in response to a work rate, and still the resulting VO2 can be the same. From that perspective, it is not surprising that the HR may vary from time to time at a given exercise-induced VO2. It indicates that the biological regulation of each of these variables is not tightly constrained.
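The Fick principle referred to above can be written out as follows. The symbols are chosen here for illustration: SV denotes stroke volume, and CaO2 and CvO2 denote the arterial and mixed venous oxygen contents.

```latex
\dot{V}\mathrm{O}_2 = \mathrm{HR} \times \mathrm{SV} \times \left( C_{a}\mathrm{O}_2 - C_{\bar{v}}\mathrm{O}_2 \right)
```

As a purely illustrative calculation with assumed values (not study data), an HR of 120 beats per minute with a stroke volume of 0.100 L and an arteriovenous oxygen difference of 0.125 L per L of blood gives 120 × 0.100 × 0.125 = 1.5 L · min−1. An HR of 125 beats per minute with a stroke volume of 0.096 L and the same oxygen difference gives 125 × 0.096 × 0.125 = 1.5 L · min−1 as well. The same VO2 can thus be reached at different HRs.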
In individual cases, linearity between HR and VO2 has been indicated to sometimes end at near-maximal VO2 levels, with greater increases in VO2 than in HR (Hohwü Christensen 1931; Davies 1968; Åstrand and Rodahl 1970, p 352, 414). Given that, it could be questionable to include values for maximal HR and VO2, as has been done in model 2 in our study, and it could thus be anticipated that the regression equations and outcomes of models 1 and 2 might differ. Including maximal HR and VO2 could, on the other hand, serve as an anchor, stabilizing the regression equations against effects of day-to-day variability that otherwise could come into play. One reason for such a role for HR values from the maximal work rate is their low CV (Table 2) in comparison with those at the submaximal work rates. The fact that we did not see any significant differences between the outcomes in models 1 and 2 indicates the potential value of educational models that do not include measurements from maximal work rates. Furthermore, it indicates that research models for establishing the HR-VO2 relation may be adequate without maximal measurements. Including more submaximal measurements than the three that we have used might, however, be a beneficial way to create even greater day-to-day stability in models based on only submaximal work rates, and deserves future study. One reason for the good reproducibility on the group level for model 1, despite only making use of three submaximal work rates, can be the span of the HR attained between work rates 1 and 3 (on average 37 and 56 beats per minute for males and females, respectively). It is important that the utilized range of HR from walking commuting is within the range of the HR attained from the three submaximal ergometer cycle work rates in the laboratory, which is the case for the females (cf. Tables 2 and 4).
If instead VO2 were to be estimated from higher or lower HR than established from the submaximal work rates in the laboratory, it is possible that greater test-retest differences would have been observed (cf. Fig. 1).
A comment on the field heart rates used is that the walking commuting HRs were monitored in the morning and evening at slightly varying times. Given that the standardization criteria were followed, these times of HR collection are representative for walking commuting since the HR-VO2 relation during physical activity is not affected by the time of day (McCrory et al. 1997).
Another comment favoring a stability in the measurement conditions is that the mean values for the positions of per cent HRmax used to establish the HR-VO2 relations related well to the expected VO2 relative to VO2max in both sexes (Table 3) (Londeree and Ames 1976).
Fig. 3 Individual levels of differences and the 95% limits of agreement between estimations of VO2 (L · min−1) calculated from fixed levels of low, middle and high HR from commuter walking as well as from repeated measurements of VO2-HR relations based on three submaximal work rates (Model 2). The y axes show absolute differences in VO2 against the mean values of the estimations from the repeated measurements on the x axes

Our results are in line with those of McCrory et al. (1997), and considerably more favorable in relation to using the HR-VO2 method than those indicated by Christensen et al. (1983). There are several explanations for that. The measures used by Christensen et al. (1983) for establishing HR-VO2 regression equations were resting and sitting, as well as three low to intermediate exercise rates on an ergometer cycle (8-100 W) and three exercise rates on a treadmill, thus altogether eight measurement points. For both the slope and the y intercept of the regression equations, the measurements at low levels of HR are, under those circumstances, more influential. At the same time, it is well known that the HR-VO2 ratios at rest and sitting are quite unstable, resulting in variations in regression equations (Booyens and Hervey 1960; Montoye et al. 1996, p. 103; McCrory et al. 1997). Between very low intensities of exercise and rest, the slope of the HR-VO2 linear relationship will have a different angle compared to higher intensities (Achten and Jeukendrup 2003; Booyens and Hervey 1960; Luke et al. 1997; Spurr et al. 1988), which could be another reason for the results of Christensen et al. (1983). Furthermore, mixing the work forms on cycle ergometer and treadmill as the bases for the HR-VO2 measures is in itself problematic since the HR response for a given VO2 differs in these different forms of movement (Lafortuna et al. 2008). This creates a greater risk for non-stability in the regression equations with repeated measurements.
Finally, the measures of 24 h HR by Christensen et al. (1983) resulted in a mean value of 86 beats per minute. In line with the reasoning in the Introduction (cf. Fig. 1), a HR close to the endpoint of the spectrum of measurement points forming the regression equation is more likely to lead to lower reproducibility. Another potential explanation for their results relates to their use of a heterogeneous sample of predominantly patients with large variations in age, whereas we studied a sample of healthy and physically active middle-aged individuals. Having stated that, one has to keep in mind that the external validity of our findings in relation to other types of participants is uncertain. Thus, to advance the general knowledge in these respects, there is indeed a need for further studies of these matters. As stated in "Introduction", the studied models 1 and 2 mimic those that have been used in our research and educational setting. In our view, it would be of great value to further the understanding of whether the HR method can be optimized by using more submaximal measuring points when establishing the HR-VO2 relation in the laboratory. Berggren and Hohwü Christensen (1950) undertook a number of HR and VO2 measurements on one individual, and their results indicate a variation in oxygen pulse for a given oxygen uptake. However, we do not know if such variations occur "within a day", or are the result of "between-day variations". Independent of their cause, using, e.g., five or seven submaximal measurement points might stabilize the day-to-day variation, as compared to using only three points, and systematic studies of this deserve to be undertaken. In approaching the end of this discussion, it is reasonable to also point out that we do not know anything about the external validity of the HR method in the laboratory in relation to field conditions such as during walking commuting.
Two studies have looked at the intensity of walking commuting using different HR methods (Kokkonen-Harjula et al. 1988; Oja et al. 1991). However, neither of these studies considered that, for reasons such as cardiovascular drift with prolonged work durations (Achten and Jeukendrup 2003; Coyle and Gonzalez-Alonso 2001; Saltin and Stenberg 1964) or stress due to traffic conditions (Carroll et al. 2009; Lambiase et al. 2012), the relationship measured in the laboratory may differ in a walking commuting environment, and that consequently the indicated intensity of walking commuting might be incorrect. This will be the focus of our further studies.
We have, as pointed out in the beginning of "Discussion", developed a framework for studying these matters in terms of relating all HRs used to the maximal HR (%maxHR) and the relative position of the HR in between the resting and the max HR (%HRR). For future studies, we also suggest that body temperature be monitored since this factor influences the metabolism and may affect the blood flow distribution and thereby also the constituents of the Fick principle, with possible effects on HR-VO2 relations.
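The two relative HR measures in this framework can be sketched in a few lines of Python. The sketch assumes the commonly used definitions, namely HR as a percentage of maximal HR, and HR above resting level as a percentage of the heart rate reserve (maximal minus resting HR). The function names and example values are hypothetical and are not taken from the study data.

```python
def percent_max_hr(hr, hr_max):
    """HR expressed as a percentage of maximal HR (%maxHR)."""
    return 100.0 * hr / hr_max

def percent_hrr(hr, hr_rest, hr_max):
    """Position of HR between resting and maximal HR (%HRR).

    Assumes hr_max > hr_rest; 0% corresponds to resting HR, 100% to maximal HR.
    """
    return 100.0 * (hr - hr_rest) / (hr_max - hr_rest)

# Hypothetical participant (NOT study data)
hr_rest, hr_max = 60.0, 180.0   # beats/min
field_hr = 110.0                # beats/min during walking commuting

print(f"%maxHR = {percent_max_hr(field_hr, hr_max):.1f}")
print(f"%HRR   = {percent_hrr(field_hr, hr_rest, hr_max):.1f}")
```

Expressing both the laboratory and the field HRs in these terms makes it possible to see where the field HRs fall relative to the measurement points used for the regression.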
In conclusion, the present study demonstrates good reproducibility on the group level of HR-VO2 relations established through cycle ergometer exercise with healthy and physically active middle-aged walking commuters as participants in laboratory conditions, and evaluated at three levels of heart rates from walking commuting that represent moderate exercise intensities.

Table 10 Differences between model 1 and model 2 in the estimated VO2 based on three levels of HR-VO2 as well as absolute and relative differences between these outcomes (means ± SD, and 95% confidence intervals (CI)) for all values