1 Introduction

To date there have been numerous publications concerning the versatility of metabolomics/metabonomics as a high throughput functional genomic tool for monitoring disease processes, following drug toxicity and phenotyping genetically modified mammals (Raamsdonk et al. 2001; Nicholson et al. 2002; Gavaghan et al. 2000). The ability to generate large multivariate datasets using metabolomic approaches has led to the development of a number of databases in the field, particularly in the pharmaceutical industry as part of drug safety assessment (Lindon et al. 2003). In addition, in vivo NMR databases have been produced in which automated pattern recognition tools help radiologists categorize magnetic resonance spectroscopy data of brain tumours according to histological type and grade, for example successfully distinguishing meningiomas, low-grade astrocytomas and “aggressive tumours” (Tate et al. 2003; see Griffin and Shockcor 2004 for a review of this area). These applications indicate that central databases for metabolomic data, using information from several different sites, are a realistic possibility and will provide useful resources for the wider research community.

The necessity of good database design for bioinformatics has long been appreciated in the various genome sequencing projects and for sequence comparisons across species. However, databases for transcriptomics, proteomics and metabolomics pose a further complicating factor compared with genomic databases. While genomics, as applied to gene sequencing and comparisons, is not context dependent, the other “-omes” will produce different profiles according to time, disease state, or interaction with biochemical/chemical/physical stimuli (Oliver 2002). This complicates the construction of any database, necessitating files which report information about the sample and context examined, the experimental procedure and the hypothesis that was being tested, as well as information about how the data was acquired and processed. Furthermore, for the database to be easily usable this information must be stored in a uniform format that simplifies any search routine required to retrieve the data, without placing a cumbersome extra burden on the users who deposit the information.

In this manuscript we (the metabolomics standards initiative-mammalian context working sub-group (MSI-MCWSG)) describe our first draft of a minimum requirement for the description of the biological materials/processes examined in a metabolomic study involving mammalian subjects. The current metabolomic literature in this area includes functional genomic studies, drug toxicology, nutrigenomics, clinical trials and other human studies. It is planned that this will lead to the development of a tool for the description of metabolomic experiments that will enable the storage, retrieval and manipulation of large amounts of data. This will benefit the assessment and dissemination of metabolomic data from mammalian studies.

2 Aim

The aim is to identify, develop and disseminate a core set of reporting requirements necessary for the minimal description of biological samples and procedures particular to mammalian metabolomic experiments.

This effort should be considered within the wider context of the reporting requirements for all types of biological samples in metabolomics experiments currently being developed by the metabolomics standards initiative (MSI) of the Metabolomics Society (http://www.metabolomicssociety.org/). These requirements do not represent an outline of good practice, but instead are aimed at capturing the minimum amount of information needed to allow others to judge the worth of the data.

3 Scope of this recommendation

We have produced two reporting requirements that represent the minimum requirements for pre-clinical studies (e.g. toxicology, functional genomic experiments, drug efficacy and disease intervention) and clinical studies (e.g. clinical trials, nutritional studies, human disease investigations; http://msi-workgroups.sourceforge.net/bio-metadata/reporting/invivo/). The key distinction between the two reporting requirements is the control that can be placed over the environment in which the individuals are found. The pre-clinical scheme applies to all those experiments conducted within the laboratory where there is tight control over the environmental conditions in which the animals are housed. This allows the ready measurement of environmental variables including housing (e.g. group or individual), light cycle, food intake (type and amount) and water consumption. This information is less appropriate for, and often lacking from, clinical trials and studies which have been conducted ‘in the field.’ While we limited this second scheme to human studies, there is clear overlap between it and the description currently being discussed by the Environmental subcontext group (http://msi-workgroups.sourceforge.net/bio-metadata/reporting/env/). The analysis has also been limited to mammalian studies, although many of the recommendations for a minimum reporting standard will equally apply to other animal species.

These two reporting requirements were developed against a wealth of existing information on the factors that influence the metabolome of an animal. Considering laboratory studies, strain of animal has been shown to have a large influence on the metabolome of both biofluids and tissues from rats and mice (Holmes et al. 1998; Jones et al. 2005; Gavaghan-McKee et al. 2006). Gender also has a distinct influence (Bollard et al. 2001; Stanley et al. 2005). Indeed, differentiation of various stages of the estrus cycle has been attributed to alterations in numerous components of the tricarboxylic acid cycle in urine, as well as creatine, creatinine and glucose excretion rates. Diet has far reaching influences on the metabolome, including influencing the composition of the gut microflora present within the animals (Phipps et al. 1998; Robosky et al. 2005). There have also been studies demonstrating that diurnal light/dark cycles have a profound influence on the metabolome of both biofluids and tissue. For example, in rats a clear diurnal alteration to the metabolome was observed and characterised as increases in the concentration of glucose, TMAO and DMG in the overnight urine (Stanley 2002). Finally, without a sufficient acclimation period when switching animals from their normal housing conditions to those used during the collection of urine (e.g. metabowls), alterations to glucose and creatine in the metabolome have been observed, usually in response to a loss of body-weight (within 24–48 h).

The situation in assessing the mammalian metabolome in clinical studies is complex, and often information that would be desired cannot be collected easily. Thus, in this scheme a balance was struck between the information necessary to describe known influences on the metabolic profile and a consideration of whether this information could be collected in human studies ranging from highly controlled clinical trials to sampling ‘within the field’ as part of nutritional studies and prospective epidemiology studies (German et al. 2005; Gibney et al. 2005; Walsh et al. 2006). In some human studies the effects of diet and lifestyle have been noted to have greater influences on the urinary metabolomic profile than those associated with gender (Lenz et al. 2003, 2004). In particular diet has a profound influence (Lenz et al. 2004), although gender differences can also be readily identified in many studies (Kochhar et al. 2006).

4 Diversity of participation

The current membership of the in vivo/mammalian subcontext committee consists of both academic and industrial representation from the metabolomic community. The research areas encompassed include drug toxicology, human clinical trials, nutrigenomics, small-scale patient studies and functional genomic experiments. The committee also represents scientists from both the European and USA metabolomic communities. A limitation of the group was that while we felt that many of the recommendations could be applied to other animal species the current expertise is dominated by scientists who study mammals.

5 The analytical process used to generate the reporting requirements

The sub-committee developed the reporting requirements over a series of telephone conferences and personal meetings (e.g. Metabomeeting 2.0, Cambridge, UK, Jan 2006; The second international meeting of the metabolomics society, Boston, USA, June 2006). As part of the process committee members were asked to provide key publications in the field and to decide which of the information captured should form part of a minimum requirement and which may be useful but should not be made a requirement. From the toxicology field, standard operating protocols (SOPs) were provided by Pfizer, GlaxoSmithKline and Bayer, together with two recent peer reviewed publications (Mortishire-Smith et al. 2004; Poon et al. 2005). For functional genomic studies, to specifically address the influence of the strain background, two papers were chosen that demonstrate the profound influence the genetic background has on the metabolic function and phenotype of a genetically modified organism (Hough et al. 2002; Sam et al. 2001). Finally, a combination of clinical trials and studies of human metabolism where samples have been taken under less strict control of external variables was also considered (Brindle et al. 2002; Poston et al. 2006; Wolff et al. 2004; Siu et al. 2006; Townsley et al. 2006; Khor et al. 2006; Teahan et al. 2006).

These resultant reporting requirements were then compared with the information captured by other standards initiatives including MIAME (Minimum Information About a Microarray Experiment; Brazma et al. 2001; Quackenbush 2004), MIGS (minimum information about a genome sequence; http://www.genomics.ceh.ac.uk/genomecatalogue; Field et al. 2006), CEBS and MIAPE (minimum information about a proteomics experiment) from the human proteome organisation-proteomics standards initiative (HuPO-PSI; Taylor 2006; Orchard et al. 2003; Orchard et al. 2005). Finally, these reporting requirements were compared with previous documents published on standard reporting requirements for a metabolomics experiment, in particular Lindon et al. (2005), which deals extensively with the use of metabolomics/metabonomics in toxicology, and a scheme for the description of an NMR based metabolomics experiment (Rubtsov et al. in press). These recommendations were also prepared in consultation with two other organisations: the reporting structure for biological investigations working group (RSBI) (Sansone et al. 2006) and the minimum information about biological and biomedical investigations (MIBI) project. This standard checklist has been registered with the MIBI Portal (http://micheck.sourceforge.net/), a ‘one-stop shop’ of extant and in-progress projects with the goal of fostering collaborative development and, ultimately, promoting integration.

6 Standard

6.1 Use of ontologies

All ontologies or controlled vocabularies (CVs) we suggest in this document are publicly available resources. Our terminology requirements and recommendations will also be collected by the MSI Ontology Working Group (http://msi-ontology.sourceforge.net/), which is registered under the Open Biomedical Ontologies umbrella (OBO, http://obo.sourceforge.net). For further details see the paper by Sansone and colleagues in this issue.

6.2 Reporting requirements for in vivo/mammalian metabolomics

The following are the reporting requirements developed by the in vivo/mammalian context working sub group for mammalian based metabolomic experiments. This has been split into two subgroups: (i) pre-clinical (e.g. functional genomic and toxicology studies) and (ii) clinical/human studies. The majority of the terms are considered required information, although some terms are in italics and these are recommended rather than required. The structures of the reporting requirements follow the approximate chronological order of this aspect of a metabolomic experiment.

7 Standards for mammalian pre-clinical studies

This pre-clinical category includes many animal experiments in functional genomics, drug discovery, and early stage drug safety assessment where the experimental system can be closely controlled within a laboratory environment. Given the tighter control over experimental conditions for these experiments a more prescriptive minimum requirement has been produced. The scheme is split into four parts: subject description, husbandry, experimental design and sample collection and is shown in Fig. 1.

Fig. 1 Standards for mammalian functional genomic and toxicology studies

7.1 Experimental subject description

  • Species/Strain Designation (for rat/mouse see http://www.informatics.jax.org/mgihome/nomen/strains.shtml)

  • Generation of mixed strain

  • Model Description (if different from Species/Strain): surgical/pharmacological/feeding manipulation

  • Animal Supplier (company/location/colony designation/wild caught)

  • Age range (DOBs) (as well as age at time of experiment)

  • Weight range (individual weights)

While ‘generation of mixed strains’ is a requested field, a number of physiological studies have found a profound influence on the phenotype of an animal during that study as a result of drift in the strain background.
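As an illustration only, the subject description items listed above could be held in a simple record and checked for completeness. The sketch below is not part of the standard: the field names are hypothetical, and the split between required and recommended terms is an assumption, since the only item the text explicitly calls requested (optional) is the generation of mixed strain. The model description is left out of the required set because it applies only when it differs from the species/strain. The strain and age values are example data.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SubjectDescription:
    """Pre-clinical subject description; field names are illustrative."""
    species_strain: Optional[str] = None
    model_description: Optional[str] = None  # only if different from species/strain
    animal_supplier: Optional[str] = None
    age_range: Optional[str] = None
    weight_range: Optional[str] = None
    # Described in the text as a requested (optional) field.
    generation_of_mixed_strain: Optional[str] = None


# Assumption: these four items are treated as required for this sketch.
REQUIRED = ("species_strain", "animal_supplier", "age_range", "weight_range")


def missing_required(record: SubjectDescription) -> List[str]:
    """Return the names of required fields that are empty or unset."""
    return [name for name in REQUIRED if not getattr(record, name)]


record = SubjectDescription(species_strain="C57BL/6J", age_range="8-10 weeks")
print(missing_required(record))  # ['animal_supplier', 'weight_range']
```

A check of this kind places little burden on the depositor while making gaps in the minimum description visible before submission.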

7.2 Husbandry

7.2.1 Housing

  • group or individual

  • Cage type (shoe box/metabolic/wire mesh, etc)

  • Cage change/cleaning frequency

  • Environmental enrichment

While many of these terms are optional, an important potential source of biological variation in many drug safety assessment studies relying on the use of metabolism cages (metabowls) to collect urine is the stress associated with changing the environment of an animal. Many mice and rats display profound changes in their urinary profiles during the first 24 h in a new environment.

7.2.2 Light cycle

7.2.3 Feed

  • Type/manufacturer (or reference to composition if custom diet)

  • ad lib or restricted diet (e.g. 25 g/day)

  • Diet supplements if any (what treats/how often/how much)

7.2.4 Water

  • Bottle or automated

  • Tap or purified (qualified, e.g. distilled, 18 MΩ, etc.)

7.2.5 Veterinary treatments if any and exercise regimen (large animals)

Use of anesthesia (e.g. for blood collection or physicals)

Type of anesthetic (formulation/time of administration/dose of anesthetic)

7.2.6 Acclimation

  • Acclimation duration (to experimental facility)

  • Acclimation duration (to diet, if experimental diet differs)

  • Acclimation duration (to metabolic cages, if used)

  • Acclimation duration (to repeat procedures)

7.3 Experimental design

7.3.1 Number of groups

Numbers/gender/group sizes

7.3.2 Inclusion criteria

For example, physical exams or normal metabolomic model screen

7.3.3 Treatments

  • Compound

  • Route

  • Dose

  • Dose volume

  • Duration of dosing

  • Vehicle

7.3.4 Fasting

Timing of the fast relative to metabolomic sample collection, and duration of the fast in hours

7.3.5 End points

  • Euthanasia method

  • Tissue collection list

  • Tissue processing method (e.g. snap freezing)

  • Clinical signs (time of observation relative to dose)

  • Body weights/food consumption (how often measured)

  • Blood chemistries, haematology, histopathology, special assays

The terms blood chemistry, haematology, histopathology and special assays may be populated in many studies, but these terms certainly will not be universal, especially in laboratories with limited resources; hence they were not deemed to be required information.

7.4 Metabolomics-related sample collection

7.4.1 Blood

  • Volume collected

  • Location of collection

  • Time of collection (relative to dose and light cycle)

  • Serum or plasma (anticoagulant or presence of serum separator)

  • Storage conditions (temperature, duration)

The metabolic profiles of serum and plasma are markedly different, and there is still much work required to define what metabolites are removed during the clotting procedure involved in serum production. In addition the use of anticoagulants and other additives will also affect the resultant biological matrix and have an impact on the final results produced.
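The item ‘time of collection (relative to dose and light cycle)’ in the list above can be derived from recorded clock times. The following is a minimal sketch under the assumption that the time of dosing, the start of the light phase and the time of collection are all logged as timestamps; the variable names and the timestamps themselves are hypothetical.

```python
from datetime import datetime


def hours_between(earlier: datetime, later: datetime) -> float:
    """Elapsed time from `earlier` to `later`, in hours."""
    return (later - earlier).total_seconds() / 3600.0


# Hypothetical timestamps for a single blood sample.
lights_on = datetime(2006, 6, 1, 6, 0)
dose_time = datetime(2006, 6, 1, 9, 0)
collection_time = datetime(2006, 6, 1, 13, 30)

hours_post_dose = hours_between(dose_time, collection_time)        # 4.5
hours_after_lights_on = hours_between(lights_on, collection_time)  # 7.5
```

Reporting both offsets, rather than the clock time alone, lets a reader compare samples across studies that use different dosing schedules or light cycles.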

7.4.2 Urine

  • How collected (metabolic cage, cystocentesis, catheterisation)

  • Frequency of collection

  • Duration of collection

  • Time of collection (if less than 24 h) relative to dose and light cycle

  • Bacteriostatic agent or any other additive (final concentration)

  • Urine volume (for 24 h collections)

  • Temperature of urine collection tube (on ice/room temp)

  • Storage conditions (temperature, duration)
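Two of the urine items above are conditional on the duration of collection: the time of collection is requested for collections of less than 24 h, and the urine volume for 24 h collections. One way this rule might be encoded is sketched below. The field names are hypothetical, and treating any collection of 24 h or longer as a ‘24 h collection’ is an assumption of this sketch rather than a statement of the standard.

```python
from typing import Dict, List


def conditional_urine_fields(duration_h: float) -> List[str]:
    """Duration-dependent urine items that apply to one collection.

    Assumption: a collection of 24 h or longer counts as a '24 h collection'.
    """
    if duration_h < 24:
        return ["time_of_collection"]
    return ["urine_volume_ml"]


def missing_conditional(record: Dict[str, object]) -> List[str]:
    """Conditional items that apply to this record but have no value."""
    needed = conditional_urine_fields(float(record["duration_h"]))
    return [name for name in needed if record.get(name) in (None, "")]


print(missing_conditional({"duration_h": 8}))                            # ['time_of_collection']
print(missing_conditional({"duration_h": 24, "urine_volume_ml": 12.5}))  # []
```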

7.4.3 Tissues

  • Identification

  • Approximate quantity taken

  • Tissue processing method (e.g. snap freezing, time from kill to snap freezing)

  • Storage conditions (temperature, duration)

8 Standards for mammalian clinical trials and human studies

The ‘clinical’ category includes experiments largely involving human subjects where changes are monitored in a less controlled environment such as occurs in clinical trials, nutritional studies, epidemiological investigations and samples examined from tissue banks. The major change between this scheme and the pre-clinical scheme described above is the removal of many of the environmental/laboratory description terms and an expanded medical history field (Fig. 2).

Fig. 2 Standards for mammalian clinical trials and human studies

8.1 Experimental subject description

  • Ethical approval details

  • Geographical location/hospital/ethnic background (based on FDA and Office for National Statistics criteria)/demographics

  • Medical History: disease or clinical symptoms; criteria for disease presence (all volunteers should not have factors in their medical history which confound the study), e.g. surgical or pharmacological manipulation, medication (may be referenced)

  • Age range

  • Weight range and Height and/or BMI

  • Gender

  • Trial type (e.g. randomised trial, Disease biomarker, Phase I-IV)

  • Diet and Dietary restrictions (if applicable) and relevant control groups for such dietary restrictions. Diet: standardized, isocaloric, free living subjects

  • Further descriptors (smoking, blood pressure, anomalies in habitual diet (e.g. vegetarian, vegan etc.), sporting activity and frequency, habitual alcohol consumption)

The ethical approval is a requested field for many manuscripts. Much of the requested information in this field is aimed at assessing whether any confounding factors may limit the interpretation of the study (e.g. see Kirschenlohr et al. 2006).
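Where weight and height are recorded, the BMI mentioned in the list above follows directly as weight in kilograms divided by the square of height in metres. The short sketch below shows that calculation and how a BMI range for a study group might be reported; the function name and the cohort values are hypothetical.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """BMI = weight (kg) divided by height (m) squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


# Hypothetical (weight in kg, height in m) pairs for three volunteers.
cohort = [(70.0, 1.75), (82.5, 1.80), (58.0, 1.62)]
bmis = [body_mass_index(w, h) for w, h in cohort]
bmi_range = (min(bmis), max(bmis))
```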

8.2 Experimental design

8.2.1 Number of groups

Numbers/gender/group sizes

8.2.2 Inclusion criteria

8.2.3 Exclusion criteria

8.2.4 Treatments/fasting

  • Compound

  • Route

  • Dose

  • Dose volume

  • Duration of dosing

  • Vehicle

8.2.5 End points

Clinical chemistries, blood chemistry and haematology (for example this may include urea, creatinine, glucose, total cholesterol, HDL-cholesterol, LDL-cholesterol, triglycerides, total protein, albumin, erythrocyte count, haemoglobin, haematocrit, platelets, white blood cell count, sodium, potassium, bilirubin, ALT, ALP, γ-GT)

Urine chemistry (osmolality, ketones, pH, protein, glucose, bilirubin, blood, sediment and colour)

The requests for inclusion and exclusion criteria for a study are related, and are intended to define whether any confounding disease processes may influence the interpretation of the results (e.g. kidney dysfunction may influence urinary metabolites in a study on type II diabetes (Salek et al. in press)). The data reporting guidelines for this section will be made with respect to the Consort and Consort plus guidelines (http://rctbank.ucsf.edu/; http://www.consort-statement.org; http://rctbank.ucsf.edu/consort/cplus.html). While many of the committee extolled the virtues of having clinical data present as part of a metabolomic study, it was recognised that many studies would not have this data and hence clinical chemistry data is a requested (optional) field rather than a required one.

8.3 Metabolomics-related sample collection

8.3.1 Blood

  • Volume collected.

  • Location of collection.

  • Serum or plasma (anticoagulant); (Separation: if serum, time allowed for clotting and temperature, for plasma and serum temperature of centrifugation, time and speed of centrifugation)

  • Arterial or venous blood collected

  • Observations of haemolysis in samples and reporting of whether samples were used in subsequent analysis

  • Time from separation to freezing/freezing process.

  • Storage conditions (temperature, duration)

8.3.2 Urine

  • Frequency, volume of collection, mid flow or total urine

  • Bacteriostatic agent or any other additive (final concentration)

  • Storage conditions (temperature, duration)

8.3.3 Tissues

  • Identification

  • Approximate quantity taken

  • post mortem tissues (hours after death)

  • Storage conditions (temperature, duration)

An important quality control question associated with the use of post mortem tissue is whether post mortem changes may affect the conclusions drawn from a study. For this reason it is important to be able to assess whether tissue from diseased and control individuals has a similar post mortem delay between time of death and sample collection/storage.
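A simple way to make this comparison visible is to summarise the reported post mortem delay for each group. The sketch below is illustrative only: the group labels and delay values are hypothetical, and it reports descriptive statistics rather than a statistical test, since what counts as a ‘similar’ delay is left to the individual study.

```python
from statistics import median
from typing import Dict, List


def summarise_delay(delays_h: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Per-group minimum, median and maximum post mortem delay in hours."""
    return {
        group: {"min": min(values), "median": median(values), "max": max(values)}
        for group, values in delays_h.items()
    }


# Hypothetical delays (hours between death and sample storage).
delays = {"control": [6.0, 8.0, 10.0], "disease": [12.0, 20.0, 30.0]}
summary = summarise_delay(delays)
median_gap_h = summary["disease"]["median"] - summary["control"]["median"]  # 12.0
```

A large gap between the group medians, as in this invented example, would suggest that post mortem changes are a potential confounder and should be discussed alongside the results.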

9 Request for feedback

The reporting requirements detailed above are subject to revision. For more up-to-date versions of the requirements please refer to the project website: http://msi-workgroups.sourceforge.net/bio-metadata/reporting/ and mailing list http://msi-workgroups.sourceforge.net/. In particular, the current committee is exclusively populated by scientists involved in mammalian metabolism studies and we would be very keen to receive feedback from those involved in other animal studies to examine how the reporting requirements can be developed further.

10 Discussion/conclusions

This paper provides a draft reporting standard requirement as the first stage of the MSI process. It specifies which data should be reported but does not provide details of how the data should be formatted or transmitted. As part of the on-going work of MSI, the Ontology Working Group is developing ontologies and vocabularies for reporting data (Sansone et al. 2007) and the Data Exchange Working Group is developing a data model and transmission format (Hardy and Taylor 2007) to support the requirements specified in this paper. Ultimately this will produce a reporting requirement for a metabolomic experiment, and allow users to collate and cross-compare their data between diverse sets of experiments.

One recurring issue during the discussions of these reporting standards was what information was required (necessary) as opposed to being requested (optional). The two areas that created the most debate were diet and the medical factors necessary to describe clinical populations. There have already been a relatively large number of papers detailing the effects of diet on the metabolome (Phipps et al. 1998; Stanley 2002; Lenz et al. 2003, 2004; German et al. 2005; Gibney et al. 2005), with these effects also arising from changes in the composition between batches of supposedly the same standardised feed. However, it was felt on balance that while the exact diet composition would be useful in assessing the validity of a metabolomic study, this was unlikely to be feasible for many studies. Furthermore, in many clinical trials, epidemiology studies and nutritional interventions the reporting of diet may also be impractical.

For disease status of a human population there is a clear need to assess the influence of confounding factors within a population (e.g. Brindle et al. 2002; Kirschenlohr et al. 2006). However, the complication in creating a minimum reporting standard is that the risk/confounding factors will be study dependent. For example, although smoking and blood pressure are listed under the optional heading of further information, equally these fields may be appropriate to the inclusion and exclusion requirements for the study (for example in a study of the metabolic syndrome). This information should then either be included under the inclusion/exclusion criteria or under medical history where we state “all volunteers should not have factors in their medical history which confound the study.” This does require the user to consider what might be considered a confounding factor. However, by detailing these as a check list we encourage the experimenter to consider this as an issue in study design, and also alert the reviewer to the fact that certain medical conditions may be a required factor depending on the study. In a similar manner some aspects of clinical chemistry may be upgraded from the optional requested field to a necessary required field for certain diseases in order to properly define the population.

In our report on the minimum reporting requirements for the biological metadata associated with a mammalian metabolomic experiment, many of the fields described above should also apply to non-mammalian animal studies, which are currently not represented by the biological context sub-committees. However, there is a need to recruit experts in the use of laboratory animals other than the mammalian species on which we have focussed our analysis to date. In particular, we would welcome feedback from experts involved in metabolomic studies using the model organisms Caenorhabditis elegans (nematode), Drosophila melanogaster (fruit fly) and Danio rerio (zebrafish).

There are also clear overlaps between the mammalian/in vivo context and some of the experiments discussed by the other sub-committees that form the biological context. For example, the clinical scheme described above should have many similarities with the requirements for an environmental experiment, while the pre-clinical scheme should show similarities with other laboratory based studies. Ultimately this will produce a single reporting requirement and schema for the biological context committee. As this is an on-going initiative we urge interested parties to provide feedback and comments to the open list: Msi-workgroups-feedback@lists.sourceforge.net.