Minimum Information and Quality Standards for Conducting, Reporting, and Organizing In Vitro Research
Insufficient description of experimental practices can contribute to difficulties in reproducing research findings. In response, “minimum information” guidelines have been developed for different disciplines. These standards help ensure that the complete experiment is described, including both experimental protocols and data processing methods, allowing a critical evaluation of the whole process and the potential recreation of the work. Selected examples of minimum information checklists relevant to in vitro research are presented here; these checklists are collected by and registered at the MIBBI/FAIRsharing Information Resource portal.
In addition, to support integrative research and to allow for comparisons and data sharing across studies, ontologies and vocabularies need to be defined and integrated across areas of in vitro research. As examples, this chapter addresses ontologies for cells and bioassays and discusses their importance for in vitro studies.
Finally, specific quality requirements for important in vitro research tools (like chemical probes, antibodies, and cell lines) are suggested, and remaining issues are discussed.
Keywords: In vitro research, MIAME guidelines, Minimum information, Ontologies, Quality standards
1 Introduction: Why Details Matter
As laboratory workflows become increasingly diverse and complex, it has become more challenging to adequately describe the actual methodology followed. Efficient solutions that specify a minimum of information/items that clearly and transparently define all experimental reagents, procedures, data processing, and findings of a research study are required. This is important not only to fully understand the new information generated but also to provide sufficient details for other scientists to independently replicate and verify the results.
However, it can be very difficult to decide which parameters, settings, and experimental factors are critical and therefore need to be reported. Although the level of detail required might differ, the need to define minimum information (MI) requirements so that experiments can be followed, in every field of the life sciences, is not a new phenomenon (Shapin and Schaffer 1985).
In 1657, Robert Boyle and his associate, Robert Hooke, designed an air pump in order to prove the existence of the vacuum, a space devoid of matter. At that time, Boyle’s air pump was the first scientific apparatus that produced vacuum – a controversial concept that many distinguished philosophers considered impossible. Inspired by Boyle’s success, the Dutch mathematician and scientist Christiaan Huygens built his own air pump in Amsterdam a few years later, which was the first machine built outside Boyle’s direct supervision. Interestingly, Huygens produced a phenomenon where water appeared to levitate inside a glass jar within the air pump. He called it “anomalous suspension” of water, an effect never noticed by Boyle. Boyle and Hooke could not replicate the effect in their air pump, and so the Royal Society (and with it, all of England) consequently rejected Huygens’ claims. After months of dispute, Huygens finally visited Boyle and Hooke in England in 1663 and managed to reproduce his results on Boyle’s own air pump. Following this, the anomalous suspension of water was accepted as a matter of fact, and Huygens was elected a Foreign Member of the Royal Society. In this way, a new form of presenting scientific experiments and studies emerged, and it enabled the reproduction of published results, thereby establishing the credibility of the author’s work. In this context, Robert Boyle is recognized as one of the first scientists to introduce the Materials and Methods section into scientific publications.
As science progressed, it became more and more obvious that further progress could only be achieved if new projects and hypotheses were built on results described by other scientists. Hence, new approaches were needed to report and standardize experimental procedures and to ensure the documentation of all essential and relevant information.
The example above speaks to the long-term value of thoroughly reporting materials and methods to the scientific process. In our own time, there has been ongoing discussion of a reproducibility crisis in science. When scientists were asked what contributes to irreproducible research, concerns were both behavioral and technical (Baker 2016). Importantly, the unavailability of methods, code, and raw data from the original lab was found to be “always or often” a factor for more than 40% and “sometimes” a factor for 80% of the 1,500 respondents. For in vitro research, low external validity can partially be explained by issues with incorrect cell lines that have become too common, including an example of a cell line that may never have existed as a unique entity (Lorsch et al. 2014). Additionally, the use of so-called big data requires sophisticated data organization if such data are to be meaningful to scientists other than the source laboratory. Deficiencies in annotation of such data restrict their utility, and capturing appropriate metadata is key to understanding how the data were generated and in facilitating novel analyses and interpretations.
2 Efforts to Standardize In Vitro Protocols
Today, most in vitro techniques require not only skilled execution and experimental implementation but also the handling of digital information and large, interlinked data files, the selection of the most appropriate protocol, and the integration of quality requirements to increase the reproducibility and integrity of study results. Thus, the management and processing of data have become an integral part of daily laboratory work. There is an increasing need to employ highly specialized techniques, but optimal standards may not be intuitive to scientists who are not experienced in a particular method or field.
This situation has led to a growing trend for communities of researchers to define “minimum information” (MI) checklists and guidelines for the description and contextualization of research studies. These MI standards facilitate sharing and publication of data, increase data quality, and provide guidance for researchers, publishers, and reviewers. In contrast to the recording/reporting of all the information generated during an experiment, MI specifications define specific information subsets, which need to be reported and should therefore be used to standardize the content of descriptions of protocols, materials, and methods.
2.1 The MIAME Guidelines
Microarrays have become a critical platform to compare different biological conditions or systems (e.g., organs, cell types, or individuals). Importantly, data obtained and published from such assays can only be understood by other scientists and analyzed in a meaningful manner if the biological properties of all samples (e.g., sample treatment and handling) and phenotypes are known. These accompanying data, however, were initially deposited on authors’ websites in different formats or were not accessible at all. To address this issue, and given the often complex experimental microarray settings and the amount of data produced in a single experiment, information needed to be recorded systematically (Brazma 2009). The Minimum Information About a Microarray Experiment (MIAME) guidelines therefore describe six elements that should accompany a microarray study:
The raw data for each hybridization
The final processed (normalized) data for the set of hybridizations
The essential sample annotation, including experimental factors and their values
The experimental design, including sample data relationships (e.g., which raw data file relates to which sample, which hybridizations are technical or biological replicates)
Sufficient annotation of the array (e.g., the gene identifiers, genomic coordinates, probe oligonucleotide sequences, or reference commercial array catalogue number)
The essential laboratory and data processing protocols (e.g., which normalization method was used to obtain the final processed data)
Since its publication, the MIAME position paper has been cited over 4,100 times (as of August 2018; source: Google Scholar), demonstrating the commitment from the microarray community to these standards. Most of the major scientific journals now require authors to comply with the MIAME principles (Taylor et al. 2008). In addition, MIAME-supportive public repositories have been established, which enable the deposition and accession of experimental data and provide a searchable index functionality, enabling results to be used for new analyses and interpretations. For annotating and communicating MIAME-compliant microarray data, the spreadsheet-based MAGE-TAB (MicroArray Gene Expression Tabular) format has been developed by the FGED Society. Documents in this format can be created, viewed, and edited using commonly available spreadsheet software (e.g., Microsoft Excel) and will support the collection as well as the exchange of data between tools and databases, including submissions to public repositories (Rayner et al. 2006).
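The six MIAME elements listed above lend themselves to a simple completeness check before data deposition. The following Python sketch is illustrative only: the class, field names, and file names are hypothetical and are not part of MIAME or MAGE-TAB, and a real submission would be validated against the repository's own tools.

```python
from dataclasses import dataclass, fields


@dataclass
class MiameSubmission:
    """One text field per MIAME element (field names are hypothetical)."""
    raw_data: str = ""             # raw data for each hybridization
    processed_data: str = ""       # final processed (normalized) data
    sample_annotation: str = ""    # experimental factors and their values
    experimental_design: str = ""  # sample-data relationships, replicates
    array_annotation: str = ""     # gene identifiers, probe sequences, etc.
    protocols: str = ""            # laboratory and data processing protocols


def missing_elements(submission: MiameSubmission) -> list[str]:
    """Return the names of elements that are empty or whitespace-only."""
    return [f.name for f in fields(submission)
            if not getattr(submission, f.name).strip()]


# Illustrative submission in which two of the six elements were not supplied.
example = MiameSubmission(
    raw_data="hyb_01.cel; hyb_02.cel",
    processed_data="normalized_matrix.tsv",
    sample_annotation="samples.tsv",
    experimental_design="design.tsv",
)
print(missing_elements(example))  # ['array_annotation', 'protocols']
```

A check of this kind only confirms that each element is present; it cannot judge whether the supplied annotation is sufficiently detailed.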
2.2 The MIBBI Portal
The success of MIAME spurred the development of appropriate guidelines for many different in vitro disciplines, summarized and collected at the Minimum Information about Biological and Biomedical Investigations (MIBBI) portal. MIBBI was created as an open-access, online resource for MI checklist projects, thereby harmonizing the various checklist development efforts (Taylor et al. 2008). MIBBI is managed by representatives of its various participant communities, which is especially valuable since it combines standards and information from several distinct disciplines.
Since 2011, MIBBI has evolved into the FAIRsharing Information Resource (https://fairsharing.org/collection/MIBBI). Being an extension of MIBBI, FAIRsharing collects and curates reporting standards, catalogues bioscience data policies, and hosts a communication forum to maintain linkages between funders, journals, and leaders of the standardization efforts. Importantly, records in FAIRsharing are both manually curated by the FAIRsharing team and edited by the research community. The FAIRsharing platform also provides a historical overview to understand versions of guidelines and policies, as well as database updates (McQuilton et al. 2016). In summary, the MIBBI/FAIRsharing initiative aims to increase connectivity between minimum information checklist projects to unify the standardization community and to maximize visibility for guideline and database developers.
Table 1 Examples of minimum information checklists from different disciplines, to ensure the reproducibility and appropriate interpretability of experiments within their domains
Scope | Developer or source
Experimental guidelines, quality standards, uniform analysis pipeline, software tools, and ontologies for epigenetic experiments |
Descriptions of interacting entities: small molecules, therapeutic proteins, peptides, carbohydrates, food additives | EMBL-EBI industry program
Specification of microarray experiments: raw data, processed data, sample annotation, experimental design, annotation of the array, laboratory and data processing protocols |
Minimum set of information about a proteomics experiment | Human Proteome Organization (HUPO) proteomics
Flow cytometry experimental overview, sample description, instrumentation, reagents, and data analysis | International Society for Analytical Cytology (ISAC)
Minimum information guidelines for molecular interaction experiments |
Quantitative PCR assay checklist, including experimental design, sample, nucleic acid extraction, reverse transcription, target information, oligonucleotides, protocol, validation, and data analysis | Group of research-active scientists
Specifications for in situ hybridization and IHC experiments: experimental design, biomaterials and treatments, reporters, staining, imaging data, and image characterization | NIH/NIDDK stem cell genome anatomy projects consortium
Reagents and conditions used for enzyme activity and enzyme inhibition studies | http://www.beilstein-institut.de/en/projects/strenda (Tipton et al. 2014)
2.3 Protocol Repositories
The MI approach ensures the adequacy of reported information from each study. Increasingly, scientific data are organized in databases to which dispersed groups contribute. Biopharma databases that capture assay data on drug candidate function and disposition serve as an example, although the concepts discussed here apply generally. Databases of bioassay results often focus on final results and key metadata, and methodology is typically shared in separate protocols. Protocol repositories are used to provide accessibility, transparency, and consistency. At its simplest, a protocol repository may consist of short prose descriptions, like those used in journal articles or patent applications, stored in a shared location. A set of minimal information is necessary, but the unstructured format makes manual curation both essential and tedious. Greater functionality is provided by a spreadsheet or word processor file template with structured information. For this intermediate solution, files can be managed as database attachments or on a web-based platform (e.g., SharePoint or Knowledge Notebook) that supports filtering, searching, linking, version tracking, and the association of limited metadata. Curation is still manual, but the defined format facilitates completeness. The most sophisticated option is a protocol database with method information contained in structured database tables. Benefits include the ability to search, filter, sort, and change at the resolution of each of the contributing pieces of data and metadata, as opposed to managing the file as a whole. In addition, the protocol database can mandate the completion of all essential fields as a condition for completing protocol registration, thereby minimizing the curation burden. These approaches build on each other, such that completion of a simple solution facilitates implementation of the next level of functionality.
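To illustrate how a protocol database can make the completion of essential fields a condition of registration, the sketch below uses an in-memory SQLite table with NOT NULL and CHECK constraints. The table layout, the choice of essential fields, and the example protocol are assumptions made for illustration and do not describe any particular repository.

```python
import sqlite3

# Hypothetical essential fields; a real repository would define its own set.
ESSENTIAL_FIELDS = ("title", "cell_line", "incubation_min", "temperature_c")

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE protocol (
        id             INTEGER PRIMARY KEY,
        title          TEXT NOT NULL CHECK (length(trim(title)) > 0),
        cell_line      TEXT NOT NULL CHECK (length(trim(cell_line)) > 0),
        incubation_min REAL NOT NULL CHECK (incubation_min > 0),
        temperature_c  REAL NOT NULL
    )
    """
)


def register_protocol(record: dict) -> bool:
    """Insert a protocol; return False if the database rejects it."""
    # Fields absent from the record become NULL and violate NOT NULL.
    params = {name: record.get(name) for name in ESSENTIAL_FIELDS}
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO protocol "
                "(title, cell_line, incubation_min, temperature_c) "
                "VALUES (:title, :cell_line, :incubation_min, :temperature_c)",
                params,
            )
    except sqlite3.IntegrityError:
        return False
    return True


complete = {"title": "cAMP antagonist assay", "cell_line": "HEK293",
            "incubation_min": 30, "temperature_c": 37}
incomplete = {"title": "cAMP antagonist assay", "cell_line": "HEK293"}

accepted = register_protocol(complete)    # True
rejected = register_protocol(incomplete)  # False: two essential fields missing

# Because the method details live in columns, they can be searched per field.
rows = conn.execute(
    "SELECT title FROM protocol WHERE cell_line = ?", ("HEK293",)
).fetchall()
print(accepted, rejected, rows)
```

Enforcing the constraints in the database itself, rather than in a submission form, means that every route by which a protocol is registered is subject to the same minimum requirements.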
3 The Role of Ontologies for In Vitro Studies
An ontology is a set of concepts and categories that establishes the properties and relationships within a subject area. Ontologies are imperative in organizing sets of data because they enable the assignment of like and unlike samples or conditions, a necessary prelude to drawing insights on similarities and differences between experimental groups. Insufficient ontology harmonization limits the full use of large data sets to compare data from different sources. In addition, ontologies can facilitate compliance with method-reporting standards by defining minimal information fields for a method, such as those in Table 1, whose completion can be set as a condition for data deposition. Perhaps the most fundamental ontology in the life sciences is the Gene Ontology (http://www.geneontology.org/) (Gene Ontology Consortium 2001), on which others build to categorize increasing levels of complexity from gene to transcript to protein. The Ontology for Biomedical Investigations (http://obi-ontology.org/) was established to span the medical and life sciences and provides a general scope (Bandrowski et al. 2016). We will discuss two specific ontologies that are particularly relevant to the quality and reproducibility of in vitro experiments: those that address cells and those that address bioassays.
3.1 Ontologies for Cells and Cell Lines
Nearly all of the in vitro methods discussed in Sect. 2 above start with a cell-derived sample, and the identity of those cells is crucial for reproducibility and complete data utilization. The Cell Ontology (CL; http://cellontology.org/) provides a structured and classified vocabulary for natural cell types (Bard et al. 2005; Diehl et al. 2016). The Cell Line Ontology (CLO; http://www.clo-ontology.org/) was created to categorize cell lines, defined as a “genetically stable and homogeneous population of cultured cells that share a common propagation history” (Sarntivijai et al. 2008, 2014). The CL and CLO enable unambiguous identification of the sample source and are thus critical for data quality. The Encyclopedia of DNA Elements (ENCODE) Project employed the CL to organize its database of genomic annotations for human and mouse with over 4,000 experiments in more than 350 cell types (Malladi et al. 2015). The FANTOM5 effort uses the CL to classify transcriptomic data from more than 1,000 human and mouse samples (Lizio et al. 2015). An appropriate cell ontology is now required as a metadata standard for large data sets within the transcriptomic and functional genomics fields, and it is anticipated that additional areas will mandate this (Diehl et al. 2016).
3.2 The BioAssay Ontology
Bioassay databases store information on drug candidates and represent very large and heterogeneous collections of data that can be challenging to organize and for which there is a need to continually draw novel insights. To address these and other challenges, the BioAssay Ontology (BAO; http://bioassayontology.org) was created as the set of concepts and categories that define bioassays and their interrelationships (Vempati and Schurer 2004; Visser et al. 2011). The BAO is organized into main hierarchies that describe assay format, assay design, the target of the assay or the metatarget, any perturbagen used, and the detection technology utilized. The BAO can also define if an assay has a confirmatory, counter-screen, or other relationship to another assay. Excellent ontologies already exist for several aspects used in the BAO, and so the BAO is integrated with the Gene Ontology (Ashburner et al. 2000), the Cell Line Ontology (Sarntivijai et al. 2008), protein names from UniProt (http://www.uniprot.org), and the Unit Ontology (http://bioportal.bioontology.org/visualize/45500/). Many terms exist beneath the hierarchy summarized above, and they use a defined vocabulary. These terms constitute a set of metadata that collectively describe a unique bioassay.
3.3 Applications of the BAO to Bioassay Databases
PubChem and ChEMBL are notable publicly accessible databases with screening results on the bioactivity of millions of molecules. But the manner in which these data are structured limits their utility. For example, PubChem lists reagent concentrations in column headers, rather than as a separate field. To address this and other data organization issues, the BioAssay Research Database (BARD; https://bard.nih.gov/) was developed by seven National Institutes of Health (NIH) and academic centers (Howe et al. 2015). The BARD utilizes the BAO to organize data in the NIH Molecular Libraries Program, with over 4,000 assay definitions. In addition, BARD provides a data deposition interface that captures the appropriate metadata to structure the bioassay data.
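The difficulty caused by storing reagent concentrations in column headers can be made concrete with a small restructuring example. The Python sketch below converts a "wide" table, in which each concentration is encoded in a header, into records in which concentration and unit are separate fields. The header pattern, compound names, and values are hypothetical and are not taken from PubChem.

```python
import re

# Hypothetical wide-format results: the concentration is hidden in the header.
wide_rows = [
    {"compound": "cmpd-A", "inhibition_at_1uM": 12.0, "inhibition_at_10uM": 55.0},
    {"compound": "cmpd-B", "inhibition_at_1uM": 3.0, "inhibition_at_10uM": 8.0},
]

# Assumed header pattern: a numeric concentration followed by a unit.
HEADER = re.compile(r"^inhibition_at_(?P<value>\d+(?:\.\d+)?)(?P<unit>[A-Za-z]+)$")


def to_long(rows):
    """Return one record per compound and concentration."""
    long_rows = []
    for row in rows:
        for column, result in row.items():
            match = HEADER.match(column)
            if match is None:  # not a result column (e.g., "compound")
                continue
            long_rows.append({
                "compound": row["compound"],
                "concentration": float(match["value"]),
                "unit": match["unit"],
                "percent_inhibition": result,
            })
    return long_rows


long_rows = to_long(wide_rows)
for record in long_rows:
    print(record)
```

Once concentration is a field of its own, results can be filtered or compared across assays without parsing headers, which is the kind of structure that a metadata-aware deposition interface can capture at the outset.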
Data organization with the BAO enables new analyses and insights. Scientists at AstraZeneca used the BAO to annotate their collection of high-throughput screening data and compared their set of assays with those available in PubChem (Zander Balderud et al. 2015). They extracted metrics on the utilization of assay design and detection approaches and considered over- vs. underutilization of potential technologies. BAO terms were also used to identify similar assays from which hits were acknowledged as frequent false-positive results in high-throughput screening (Moberg et al. 2014; Schürer et al. 2011).
As an example of a BAO implementation, AbbVie’s Platform Informatics and Knowledge Management organization negotiated with representatives from the relevant scientific functions to determine those categories from the BAO that minimally defined classes of their assays. New assays are created by choosing from within the defined vocabularies, and the assay names are generated by the concatenation of those terms, e.g., 2nd messenger__ADRB2__HEK293__Isoprenaline__Human__cAMP__Antagonist__IC50 for an assay designed to measure antagonism of the β2-adrenergic receptor. Unforeseen assay variables are accommodated by adding new terms within the same categories. AbbVie has found that adopting the BAO has reduced the curation burden, accelerated the creation of new assays, and eased the integration of external data sources.
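The naming scheme described above, in which an assay name is the concatenation of terms chosen from defined vocabularies, can be sketched in a few lines of Python. The category names and the contents of each vocabulary below are assumptions made for illustration; only the example terms and the double-underscore separator come from the assay name quoted above.

```python
# Hypothetical category names, in the order in which terms are concatenated.
CATEGORIES = ["assay_class", "target", "cell_line", "stimulus",
              "species", "analyte", "mode", "endpoint"]

# Hypothetical controlled vocabularies; new terms are added per category.
VOCABULARY = {
    "assay_class": {"2nd messenger"},
    "target": {"ADRB2"},
    "cell_line": {"HEK293"},
    "stimulus": {"Isoprenaline"},
    "species": {"Human"},
    "analyte": {"cAMP"},
    "mode": {"Agonist", "Antagonist"},
    "endpoint": {"EC50", "IC50"},
}

SEPARATOR = "__"


def build_assay_name(terms: dict) -> str:
    """Concatenate one registered term per category into an assay name."""
    parts = []
    for category in CATEGORIES:
        term = terms.get(category)
        if term not in VOCABULARY[category]:
            raise ValueError(f"{term!r} is not a registered term for {category}")
        parts.append(term)
    return SEPARATOR.join(parts)


def parse_assay_name(name: str) -> dict:
    """Recover the category-to-term mapping from an assay name."""
    parts = name.split(SEPARATOR)
    if len(parts) != len(CATEGORIES):
        raise ValueError("unexpected number of terms in assay name")
    return dict(zip(CATEGORIES, parts))


terms = {"assay_class": "2nd messenger", "target": "ADRB2",
         "cell_line": "HEK293", "stimulus": "Isoprenaline",
         "species": "Human", "analyte": "cAMP",
         "mode": "Antagonist", "endpoint": "IC50"}
name = build_assay_name(terms)
print(name)
```

Because every name is built from registered terms, it can be parsed back into its metadata, and an unforeseen variable is handled by registering a new term rather than by inventing a free-text name.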
4 Specific Examples: Quality Requirements for In Vitro Research
4.1 Chemical Probes
Chemical probes have a central role in advancing our understanding of biological processes. They provide the opportunity to modulate a biologic system without the compensatory mechanisms that come with genetic approaches, and they mimic the manner in which a disease state can be treated with small-molecule therapeutics. In the field of receptor pharmacology, some families bear the name of the natural chemical probe that led to their identification: opiate, muscarinic, nicotinic, etc. However, not all chemical probes are equal. Davies et al. described the selectivity of 28 frequently used protein kinase inhibitors and found that none were uniquely active on their presumed target at relevant concentrations (Davies et al. 2000). A probe must also be available at the site of action at sufficient concentration to modulate the target. Unfortunately, peer-reviewed articles commonly use chemical probes without reference to selectivity, solubility, or in vivo concentration.
To promote the use of quality tools, the Chemical Probes Portal (www.chemicalprobes.org) was created as a community-driven Wiki that compiles characterization data, describes optimal working conditions, and grades probes on the appropriateness of their use (Arrowsmith et al. 2015). New probes are added to the Wiki only after curation. The portal captures characterization of potency, selectivity, pharmacokinetics, and tolerability data. It is hoped that use of this portal becomes widespread, both in terms of the evaluation of probes as well as their application.
4.2 Cell Line Authentication
Cells capable of proliferating under laboratory conditions are essential tools for the study of cellular mechanisms and to define disease molecular markers and evaluate therapeutic candidates. But cell line misidentification and cross-contamination have been recognized since the 1960s (Gartler 1967; Nelson-Rees and Flandermeyer 1977). Recent examples include esophageal cell lines, which were used in over 100 publications before it was shown that they actually originated from other parts of the body (Boonstra et al. 2010). The first gastric MALT lymphoma cell line (MA-1) was described as a model for this disease in 2011. Due to misidentification, MA-1 turned out to be the already known Pfeiffer cell line, derived from diffuse large B-cell lymphoma (Capes-Davis et al. 2013). The RGC-5 rat retinal ganglion cell line was used in at least 230 publications (On authentication of cell lines 2013) but was later identified by the lab in which it originated to actually be the same as the mouse 661W cell line, derived from photoreceptor cells (Krishnamoorthy et al. 2013). A list of over 480 known misidentified cell lines (as of August 2018) is available from the International Cell Line Authentication Committee (ICLAC; http://iclac.org/). It shows that a large number of cell lines have been found to be contaminated – HeLa cells, the first established cancer cell line, are the most frequent contaminant. It is therefore critical to ensure that all cell lines used in in vitro studies are authentic. In fact, expectations for the proper identification of cell lines have been communicated both by journal editors (On authentication of cell lines 2013) and by the National Institutes of Health (Notice Regarding Authentication of Cultured Cell Lines, NOT-OD-08-017 2007).
Short tandem repeat (STR) profiling compares the genetic signature of a particular cell line with an established database and is the standard method for unambiguous authentication of cell lines. An 80% or higher match in profiled STR loci is recommended for cell line authentication following the ANSI/ATCC ASN-0002-2011 Authentication of Human Cell Lines: Standardization of STR Profiling (http://webstore.ansi.org). This standard was developed in 2011 by the American Type Culture Collection (ATCC) working group of scientists from academia, regulatory agencies, major cell repositories, government agencies, and industry.
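How a percent match between two STR profiles might be computed can be sketched as follows. The text above does not specify the scoring formula, so this sketch assumes one commonly used score, generally attributed to Tanabe and colleagues, in which twice the number of shared alleles is divided by the total number of alleles in both profiles; the standard itself should be consulted for the authoritative procedure. The locus names are real STR loci, but all allele values are invented for illustration.

```python
def percent_match(query: dict, reference: dict) -> float:
    """Assumed Tanabe-style score: 2 * shared / (query + reference alleles)."""
    loci = query.keys() & reference.keys()  # compare only loci typed in both
    shared = sum(len(query[locus] & reference[locus]) for locus in loci)
    total = sum(len(query[locus]) + len(reference[locus]) for locus in loci)
    if total == 0:
        raise ValueError("profiles have no alleles at shared loci")
    return 100.0 * 2 * shared / total


def is_authentic(query: dict, reference: dict, threshold: float = 80.0) -> bool:
    """Apply the 80% match threshold mentioned in the text."""
    return percent_match(query, reference) >= threshold


# Invented profiles: each locus maps to the set of alleles observed.
reference = {"TH01": {"6", "9.3"}, "D5S818": {"11", "12"},
             "D13S317": {"12"}, "D7S820": {"8", "10"}, "TPOX": {"8"}}
drifted = {"TH01": {"6", "9.3"}, "D5S818": {"11", "12"},
           "D13S317": {"12", "13"}, "D7S820": {"8", "10"}, "TPOX": {"8"}}
unrelated = {"TH01": {"7"}, "D5S818": {"11", "13"},
             "D13S317": {"9", "11"}, "D7S820": {"9", "12"}, "TPOX": {"8", "11"}}

print(round(percent_match(drifted, reference), 1), is_authentic(drifted, reference))
print(round(percent_match(unrelated, reference), 1), is_authentic(unrelated, reference))
```

In this invented example, the profile with a single additional allele remains above the threshold, whereas the unrelated profile falls far below it.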
To provide support for bench scientists working with cell lines and to establish principles for standardization, rationalization, and international harmonization of cell and tissue culture laboratory practices, minimal requirements for quality standards in cell and tissue culture were defined (Good Cell Culture Practice) (Coecke et al. 2005), and the “guidelines for the use of cell lines” were published (Geraghty et al. 2014).
4.3 Antibody Validation
There is growing attention to the specificity and sensitivity of commercial antibodies for research applications, with respect to their intended application. For example, an antibody validated for an unfolded condition (Western blotting) may not work in native context assays (immunohistochemistry or immunoprecipitation) and vice versa. In addition, lot variation can be a concern, particularly for polyclonal antibodies and particularly when they are raised against an entire protein with an undefined epitope. Therefore, validation steps are warranted for each new batch.
A specific example where differences in detection antibodies have provided conflicting results comes from the field of neutrophil extracellular traps (NETs), which are extended complexes of decondensed chromatin and intracellular proteins. Measurement of NETs commonly uses confocal microscopy to image extracellular chromatin bound to histones, whose arginine side chains have undergone enzymatic deimination (conversion of arginine to citrulline, a process termed citrullination). There are conflicting reports about the presence and nature of citrullinated histones in these NET structures (Li et al. 2010; Neeli and Radic 2013). A recent report compared samples from ten NETosis stimuli using six commercially available antibodies that recognize citrullinated histones (Neeli and Radic 2016), four of which are specific for citrullinated histone H3. The report found significant differences in the number and intensity of Western blot signals detected by these antibodies, some of which were dependent on the NETosis stimulus. Since each H3 citrullination site is adjacent to a lysine known to undergo epigenetic modification, changes in epitope structure may be confounding measurements of citrullination, particularly for antibodies raised against synthetic peptides that lack lysine modification.
Table 2 IWGAV strategy for antibody validation
Strategy | Sample | Number of antibodies | Analysis
 | Knockout or knockdown cells/tissues | Antibody of interest | Antibody-based method of choice
 |  | Antibody of interest | Correlation between antibody-based and antibody-independent assays
 | Lysate or tissue with target protein | Several independent antibodies with different epitopes | Specificity analysis through comparative and quantitative analysis
 | Lysate or tissue containing tagged and native protein | Anti-tag antibody compared with antibody of interest | Correlating the signal from the tagged and non-tagged proteins
Immunocapture followed by MS | Lysate containing protein of interest | Antibody of interest | Target immunocapture and mass spectrometry of target and potential binding partners
4.4 Webtools Without Minimal Information Criteria
Many websites serve as helpful compendiums for a diverse range of biological disciplines. Some examples include BRENDA (http://www.brenda-enzymes.org) for enzyme kinetic data, SABIO-RK (http://sabio.villa-bosch.de) for enzymatic reactions, ToxNet (https://toxnet.nlm.nih.gov/) for chemical toxicology, and Medicalgenomics (http://www.medicalgenomics.org/) for gene expression and molecular associations. But a lack of minimal information about experimental conditions and nonstandard ontologies often prevent the use of scientific websites to answer questions about complex systems. For example, BRENDA uses the Enzyme Commission system as ontology, but multiple gene products fall within the same reaction category. Also, comparing enzyme-specific activities without greater detail on methods is challenging. It is important to note, however, that enabling systems-wide bioinformatics analyses is not the purpose of these sites and was not envisioned when these tools were developed. Nevertheless, these sites still serve an essential function by indicating the existence of and references to data not otherwise searchable and are an evolutionary step forward in the accessibility of biological data.
4.5 General Guidelines for Reporting In Vitro Research
Guidance for reporting of in vitro studies and methodologies
An ethical statement should indicate the ethical review process and permissions for any materials derived from human volunteers, including appropriate privacy assurances
Experimental procedures should follow MI guidelines wherever such exist. Where nonexistent, the method details given must be sufficient to reproduce the work. Parameters to consider include buffer (e.g., cell culture medium) and lysis (e.g., for cell-based studies) conditions, sample preparation and handling, volumes, concentrations, temperatures, and incubation times. Complex procedures may require a flow diagram, and novel equipment may require a picture
Commercial materials (cells, antibodies, enzymes or other proteins, nucleic acids, chemicals) should include the vendor, catalogue number, and lot number. Non-commercially sourced materials should include the quality control analyses performed to validate their identity, purity, biological activity, etc. (e.g., sequencing to confirm the correct sequence of cDNA plasmids). Similar analyses should be performed on commercial material where not supplied by the vendor, in a manner appropriate to its intended purpose
The source of producing recombinant proteins should be disclosed, including the sequence, expression system, purification, and analyses for purity and bioactivity. Proteins produced from bacteria should be measured for endotoxin prior to any use on live cells (or animals)
Inhibitors and compounds
For inhibitors and chemical compounds, it should be stated whether or not specificity screenings to identify potential off-target effects have been performed
The method for purifying or preparing primary cells should be stated clearly. Cell lines should have their identity verified as described in Sect. 4.2 above, and cross-contamination should be checked regularly. Furthermore, routine testing should be performed for successful control of mycoplasma contamination. The passage number should be given, as over-passaging of cells can potentially lead to experimental artifacts. Alternatively, purchasing fresh cells for a set of experiments from a recognized animal cell culture repository (such as the American Type Culture Collection, Manassas, Virginia, or the Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures) may be an attractive option from both a logistic and cost perspective. Where studies monitor functional changes in live cells, results should be interpreted in respect to parallel viability/cytotoxicity measurements, particularly where a loss of function is observed
The specificity and possible cross-reactivity of each antibody used need to be controlled. This applies to internally generated as well as commercially available antibodies. Relevant procedures to validate an antibody for a specific method or technique are given in Table 2. Details about performed experiments to investigate antibody specificity should be described
The size of experimental and control groups should be indicated, and it should distinguish between biological and technical replicates and between distinct experiments. It should be stated whether or not randomization steps have been used to control for the spatial arrangement of samples (e.g., to avoid technical artifacts when using multi-well microtiter plates) and the order of sample collection and processing (e.g., when there is a circadian rhythm or time-of-day effect)
The type of statistical analyses should be stated clearly, including the parameter represented by any error bars. In addition, an explicit statement of how many experiments/data points were excluded from analysis and how often experiments were repeated should be included
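The randomization steps mentioned above (spatial arrangement on a multi-well plate and order of processing) can be illustrated with a minimal Python sketch. It is not a prescribed procedure: the function names, the 96-well geometry, the sample labels, and the seed values are all assumptions chosen for illustration. Recording the seed alongside the layout is one simple way to make the randomization itself reportable.

```python
import random

def plate_wells(rows="ABCDEFGH", n_cols=12):
    """Return well labels (A1 ... H12); the defaults give a 96-well layout."""
    return [f"{row}{col}" for row in rows for col in range(1, n_cols + 1)]

def randomize_layout(sample_ids, seed, rows="ABCDEFGH", n_cols=12):
    """Map each sample ID to a distinct, randomly chosen well."""
    wells = plate_wells(rows, n_cols)
    if len(sample_ids) > len(wells):
        raise ValueError("more samples than wells on the plate")
    rng = random.Random(seed)  # a recorded seed lets the layout be regenerated
    return dict(zip(sample_ids, rng.sample(wells, len(sample_ids))))

def randomize_order(sample_ids, seed):
    """Return the sample IDs in a random collection/processing order."""
    rng = random.Random(seed)
    order = list(sample_ids)
    rng.shuffle(order)
    return order

# Hypothetical design: 3 conditions x 4 biological replicates.
samples = [f"{cond}_rep{i}"
           for cond in ("vehicle", "low_dose", "high_dose")
           for i in range(1, 5)]
layout = randomize_layout(samples, seed=2024)
order = randomize_order(samples, seed=2025)
```

Sample IDs are assumed to be unique. In practice the layout might also need to account for edge wells or blocking by replicate, which this sketch deliberately leaves out.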
5 Open Questions and Remaining Issues
The measures and initiatives described above for the high-quality annotation of methods and data sets were designed to increase the total value derived from (in vitro) biomedical research. However, some open questions and issues remain.
5.1 Guidelines vs. Standards
MI checklists for in vitro research, such as MIAME, are reporting guidelines. They cannot describe all factors that could potentially affect the outcome of an experiment and that should therefore be considered when planning and reporting robust and reproducible studies. MI guidelines address the question “What information is required to reproduce experiments?” rather than “Which parameters are essential, and what are potential sources of variation?” However, answering both questions is necessary to increase data robustness and integrity. Reporting guidelines alone cannot prevent irreproducible research. A newly established initiative, the RIPOSTE framework (Masca et al. 2015), therefore aims to increase data reproducibility by encouraging early discussions of study design and planning within a multidisciplinary team (including statisticians), at the time when specific questions or hypotheses are proposed.
To avoid misunderstandings, it is essential to distinguish between “guidelines” and “standards.” Guidelines are broadly written: they do not specify the minimum threshold for data recording and reporting, or for study interpretation, but rather serve as an important starting point for the development of community-supported and vetted standards. Standards, in contrast, must define their elements clearly, specifically, and unambiguously, including the level of detail necessary to fulfill all requirements of the standard (Burgoon 2006). As an example, the MIAME guidelines are often referred to as a standard. However, MIAME does not define all possible experimental settings and specifications (see Sect. 2.1), which leaves room for alternative interpretations and therefore leads to heterogeneous levels of experimental detail. Consequently, different MIAME-compliant studies may not collect or provide exactly the same information about an experiment, complicating true data sharing and communication (Burgoon 2006).
5.2 Compliance and Acceptance
To achieve the highest acceptance and compliance among scientists, journals, database curators, funding agencies, and all other stakeholders, MI guidelines need to balance the level of detail required against practicality in reporting, so that compliance remains efficient and realistic to implement. An evaluation of 127 microarray articles published between July 2011 and April 2012 revealed that ~75% of these publications were not compliant with the MIAME guidelines (Witwer 2013). A survey of scientists attending the 2017 European Calcified Tissue Society meeting found that a majority were familiar with performing RT-qPCR experiments, but only 6% were aware of the MIQE guidelines (Bustin 2017). These examples show that engaging the scientific community, and having it vet proposed guidelines comprehensively, is critical for the successful adoption of workable guidelines and standards. Additional challenges to compliance arise in the publication process. Some journals impose space limitations but do not support supplemental material. Others may encourage MI approaches in their instructions to authors, but reviewers may not be sufficiently versed in all of the reported methodology to critique it adequately or to request compliance with MI approaches.
5.3 Coordinated Efforts
The Minimum Information checklists are usually developed independently from each other. Consequently, some guidelines can be partially redundant and overlapping. Although differences in wording/nomenclature and substructuring will complicate an integration, these overlaps need to be resolved through a coordinated effort and to the satisfaction of all concerned parties.
5.4 Format and Structured Data
The three basic components of a modern reporting structure are MI specifications (see Sect. 2), controlled vocabularies (see Sect. 3), and data formats (Taylor 2007). Most MI guidelines do not provide a standard format or structured templates for presenting experimental results and accompanying information, for transmitting information from data entry to analysis software, or for the storage of data in repositories. For some guidelines, data exchange formats were developed to support scientists (e.g., the MAGE-TAB format for MIAME). However, it remains a challenge to balance ease of use against level of complexity well enough that a standard format for most guidelines is accepted by the research community.
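To make the interplay of these three components concrete, the following minimal Python sketch checks a structured experiment record against a list of required fields and a small controlled vocabulary. The field names and permitted values are hypothetical and chosen purely for illustration; they are not taken from MIAME, MAGE-TAB, or any other published checklist or format.

```python
# Hypothetical MI specification: fields that every record must report.
REQUIRED_FIELDS = {
    "cell_line",
    "passage_number",
    "mycoplasma_tested",
    "n_biological_replicates",
    "error_bar_type",
}

# Hypothetical controlled vocabulary: permitted values for selected fields.
CONTROLLED_VOCAB = {
    "error_bar_type": {"SD", "SEM", "95% CI"},
}

def check_record(record):
    """Return a list of problems found in a structured experiment record."""
    problems = []
    for field in sorted(REQUIRED_FIELDS):
        if field not in record or record[field] in (None, ""):
            problems.append(f"missing: {field}")
    for field, allowed in CONTROLLED_VOCAB.items():
        value = record.get(field)
        if value not in (None, "") and value not in allowed:
            problems.append(f"not in vocabulary: {field}={value!r}")
    return problems

# Illustrative record with one omission and one free-text value.
record = {
    "cell_line": "HeLa",
    "passage_number": 12,
    "mycoplasma_tested": True,
    "error_bar_type": "stdev",
}
problems = check_record(record)
# problems == ["missing: n_biological_replicates",
#              "not in vocabulary: error_bar_type='stdev'"]
```

A check of this kind only verifies that information is present and uses agreed terms; it says nothing about whether the reported experiment was well designed, which is the distinction drawn in Sect. 5.1.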
6 Concluding Remarks
Undoubtedly, requirements regarding reporting, ontologies, research tools, and data standards will improve robustness and reproducibility of in vitro research and will facilitate the exchange and analysis of future research. In the meantime, all different stakeholders and research communities need to be engaged to ensure that the various guideline development projects and initiatives are coordinated and harmonized in a meaningful way.
The authors would like to thank Chris Butler of AbbVie for helpful discussions.
- Arrowsmith CH, Audia JE, Austin C, Baell J, Bennett J, Blagg J, Bountra C, Brennan PE, Brown PJ, Bunnage ME, Buser-Doepner C, Campbell RM, Carter AJ, Cohen P, Copeland RA, Cravatt B, Dahlin JL, Dhanak D, Edwards AM, Frederiksen M, Frye SV, Gray N, Grimshaw CE, Hepworth D, Howe T, Huber KVM, Jin J, Knapp S, Kotz JD, Kruger RG, Lowe D, Mader MM, Marsden B, Mueller-Fahrnow A, Müller S, O’Hagan RC, Overington JP, Owen DR, Rosenberg SH, Ross R, Roth B, Schapira M, Schreiber SL, Shoichet B, Sundström M, Superti-Furga G, Taunton J, Toledo-Sherman L, Walpole C, Walters MA, Willson TM, Workman P, Young RN, Zuercher WJ (2015) The promise and peril of chemical probes. Nat Chem Biol 11:536. https://doi.org/10.1038/nchembio.1867
- Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G (2000) Gene ontology: tool for the unification of biology. The gene ontology consortium. Nat Genet 25(1):25–29. https://doi.org/10.1038/75556
- Bandrowski A, Brinkman R, Brochhausen M, Brush MH, Bug B, Chibucos MC, Clancy K, Courtot M, Derom D, Dumontier M, Fan L, Fostel J, Fragoso G, Gibson F, Gonzalez-Beltran A, Haendel MA, He Y, Heiskanen M, Hernandez-Boussard T, Jensen M, Lin Y, Lister AL, Lord P, Malone J, Manduchi E, McGee M, Morrison N, Overton JA, Parkinson H, Peters B, Rocca-Serra P, Ruttenberg A, Sansone SA, Scheuermann RH, Schober D, Smith B, Soldatova LN, Stoeckert CJ Jr, Taylor CF, Torniai C, Turner JA, Vita R, Whetzel PL, Zheng J (2016) The Ontology for biomedical investigations. PLoS One 11(4):e0154556. https://doi.org/10.1371/journal.pone.0154556
- Boonstra JJ, van Marion R, Beer DG, Lin L, Chaves P, Ribeiro C, Pereira AD, Roque L, Darnton SJ, Altorki NK, Schrump DS, Klimstra DS, Tang LH, Eshleman JR, Alvarez H, Shimada Y, van Dekken H, Tilanus HW, Dinjens WN (2010) Verification and unmasking of widely used human esophageal adenocarcinoma cell lines. J Natl Cancer Inst 102(4):271–274. https://doi.org/10.1093/jnci/djp499
- Brazma A, Hingamp P, Quackenbush J, Sherlock G, Spellman P, Stoeckert C, Aach J, Ansorge W, Ball CA, Causton HC, Gaasterland T, Glenisson P, Holstege FC, Kim IF, Markowitz V, Matese JC, Parkinson H, Robinson A, Sarkans U, Schulze-Kremer S, Stewart J, Taylor R, Vilo J, Vingron M (2001) Minimum information about a microarray experiment (MIAME)-toward standards for microarray data. Nat Genet 29(4):365–371. https://doi.org/10.1038/ng1201-365
- Bustin SA, Benes V, Garson JA, Hellemans J, Huggett J, Kubista M, Mueller R, Nolan T, Pfaffl MW, Shipley GL, Vandesompele J, Wittwer CT (2009) The MIQE guidelines: minimum information for publication of quantitative real-time PCR experiments. Clin Chem 55(4):611–622. https://doi.org/10.1373/clinchem.2008.112797
- Capes-Davis A, Alston-Roberts C, Kerrigan L, Reid YA, Barrett T, Burnett EC, Cooper JR, Freshney RI, Healy L, Kohara A, Korch C, Masters JR, Nakamura Y, Nims RW, Storts DR, Dirks WG, MacLeod RA, Drexler HG (2013) Beware imposters: MA-1, a novel MALT lymphoma cell line, is misidentified and corresponds to Pfeiffer, a diffuse large B-cell lymphoma cell line. Genes Chromosomes Cancer 52(10):986–988. https://doi.org/10.1002/gcc.22094
- CiteAB (2017) The antibody search engine. https://www.citeab.com/
- Deutsch EW, Ball CA, Berman JJ, Bova GS, Brazma A, Bumgarner RE, Campbell D, Causton HC, Christiansen JH, Daian F, Dauga D, Davidson DR, Gimenez G, Goo YA, Grimmond S, Henrich T, Herrmann BG, Johnson MH, Korb M, Mills JC, Oudes AJ, Parkinson HE, Pascal LE, Pollet N, Quackenbush J, Ramialison M, Ringwald M, Salgado D, Sansone SA, Sherlock G, Stoeckert CJ Jr, Swedlow J, Taylor RC, Walashek L, Warford A, Wilkinson DG, Zhou Y, Zon LI, Liu AY, True LD (2008) Minimum information specification for in situ hybridization and immunohistochemistry experiments (MISFISHIE). Nat Biotechnol 26(3):305–312. https://doi.org/10.1038/nbt1391
- Diehl AD, Meehan TF, Bradford YM, Brush MH, Dahdul WM, Dougall DS, He Y, Osumi-Sutherland D, Ruttenberg A, Sarntivijai S, van Slyke CE, Vasilevsky NA, Haendel MA, Blake JA, Mungall CJ (2016) The cell ontology 2016: enhanced content, modularization, and ontology interoperability. J Biomed Semant 7(1):44. https://doi.org/10.1186/s13326-016-0088-7
- GBSI (2017) Global biological standards institute. https://www.gbsi.org/
- Geraghty RJ, Capes-Davis A, Davis JM, Downward J, Freshney RI, Knezevic I, Lovell-Badge R, Masters JR, Meredith J, Stacey GN, Thraves P, Vias M (2014) Guidelines for the use of cell lines in biomedical research. Br J Cancer 111(6):1021–1046. https://doi.org/10.1038/bjc.2014.166
- Gilda JE, Ghosh R, Cheah JX, West TM, Bodine SC, Gomes AV (2015) Western blotting inaccuracies with unverified antibodies: need for a Western blotting minimal reporting standard (WBMRS). PLoS One 10(8):e0135392. https://doi.org/10.1371/journal.pone.0135392
- Howe EA, de Souza A, Lahr DL, Chatwin S, Montgomery P, Alexander BR, Nguyen DT, Cruz Y, Stonich DA, Walzer G, Rose JT, Picard SC, Liu Z, Rose JN, Xiang X, Asiedu J, Durkin D, Levine J, Yang JJ, Schurer SC, Braisted JC, Southall N, Southern MR, Chung TD, Brudz S, Tanega C, Schreiber SL, Bittker JA, Guha R, Clemons PA (2015) BioAssay research database (BARD): chemical biology and probe-development enabled by structured metadata and result types. Nucleic Acids Res 43:D1163–D1170. https://doi.org/10.1093/nar/gku1244
- Lee JA, Spidlen J, Boyce K, Cai J, Crosbie N, Dalphin M, Furlong J, Gasparetto M, Goldberg M, Goralczyk EM, Hyun B, Jansen K, Kollmann T, Kong M, Leif R, McWeeney S, Moloshok TD, Moore W, Nolan G, Nolan J, Nikolich-Zugich J, Parrish D, Purcell B, Qian Y, Selvaraj B, Smith C, Tchuvatkina O, Wertheimer A, Wilkinson P, Wilson C, Wood J, Zigon R, Scheuermann RH, Brinkman RR (2008) MIFlowCyt: the minimum information about a flow cytometry experiment. Cytometry A 73(10):926–930. https://doi.org/10.1002/cyto.a.20623
- Lizio M, Harshbarger J, Shimoji H, Severin J, Kasukawa T, Sahin S, Abugessaisa I, Fukuda S, Hori F, Ishikawa-Kato S, Mungall CJ, Arner E, Baillie JK, Bertin N, Bono H, de Hoon M, Diehl AD, Dimont E, Freeman TC, Fujieda K, Hide W, Kaliyaperumal R, Katayama T, Lassmann T, Meehan TF, Nishikata K, Ono H, Rehli M, Sandelin A, Schultes EA, ‘t Hoen PA, Tatum Z, Thompson M, Toyoda T, Wright DW, Daub CO, Itoh M, Carninci P, Hayashizaki Y, Forrest AR, Kawaji H (2015) Gateways to the FANTOM5 promoter level mammalian expression atlas. Genome Biol 16(1):22. https://doi.org/10.1186/s13059-014-0560-6
- Malladi VS, Erickson DT, Podduturi NR, Rowe LD, Chan ET, Davidson JM, Hitz BC, Ho M, Lee BT, Miyasato S, Roe GR, Simison M, Sloan CA, Strattan JS, Tanaka F, Kent WJ, Cherry JM, Hong EL (2015) Ontology application and use at the ENCODE DCC. Database 2015:bav010. https://doi.org/10.1093/database/bav010
- Masca NG, Hensor EM, Cornelius VR, Buffa FM, Marriott HM, Eales JM, Messenger MP, Anderson AE, Boot C, Bunce C, Goldin RD, Harris J, Hinchliffe RF, Junaid H, Kingston S, Martin-Ruiz C, Nelson CP, Peacock J, Seed PT, Shinkins B, Staples KJ, Toombs J, Wright AK, Teare MD (2015) RIPOSTE: a framework for improving the design and analysis of laboratory-based research. eLife 4. https://doi.org/10.7554/eLife.05519
- McQuilton P, Gonzalez-Beltran A, Rocca-Serra P, Thurston M, Lister A, Maguire E, Sansone SA (2016) BioSharing: curated and crowd-sourced metadata standards, databases and data policies in the life sciences. Database (Oxford) 2016:baw075. https://doi.org/10.1093/database/baw075
- Moberg A, Zander Balderud L, Hansson E, Boyd H (2014) Assessing HTS performance using BioAssay Ontology: screening and analysis of a bacterial phospho-N-acetylmuramoyl-pentapeptide translocase campaign. Assay Drug Dev Technol 12(9–10):506–513. https://doi.org/10.1089/adt.2014.595
- Notice Regarding Authentication of Cultured Cell Lines, NOT-OD-08-017 (2007). https://grants.nih.gov/grants/guide/notice-files/not-od-08-017.html. Accessed 28 Nov 2007
- On Authentication of Cell Lines (2013) Mol Vis 19:1848–1851
- Orchard S, Salwinski L, Kerrien S, Montecchi-Palazzi L, Oesterheld M, Stumpflen V, Ceol A, Chatr-aryamontri A, Armstrong J, Woollard P, Salama JJ, Moore S, Wojcik J, Bader GD, Vidal M, Cusick ME, Gerstein M, Gavin AC, Superti-Furga G, Greenblatt J, Bader J, Uetz P, Tyers M, Legrain P, Fields S, Mulder N, Gilson M, Niepmann M, Burgoon L, De Las Rivas J, Prieto C, Perreau VM, Hogue C, Mewes HW, Apweiler R, Xenarios I, Eisenberg D, Cesareni G, Hermjakob H (2007) The minimum information required for reporting a molecular interaction experiment (MIMIx). Nat Biotechnol 25(8):894–898. https://doi.org/10.1038/nbt1324
- Orchard S, Al-Lazikani B, Bryant S, Clark D, Calder E, Dix I, Engkvist O, Forster M, Gaulton A, Gilson M, Glen R, Grigorov M, Hammond-Kosack K, Harland L, Hopkins A, Larminie C, Lynch N, Mann RK, Murray-Rust P, Lo Piparo E, Southan C, Steinbeck C, Wishart D, Hermjakob H, Overington J, Thornton J (2011) Minimum information about a bioactive entity (MIABE). Nat Rev Drug Discov 10(9):661–669. https://doi.org/10.1038/nrd3503
- Rayner TF, Rocca-Serra P, Spellman PT, Causton HC, Farne A, Holloway E, Irizarry RA, Liu J, Maier DS, Miller M, Petersen K, Quackenbush J, Sherlock G, Stoeckert CJ Jr, White J, Whetzel PL, Wymore F, Parkinson H, Sarkans U, Ball CA, Brazma A (2006) A simple spreadsheet-based, MIAME-supportive format for microarray data: MAGE-TAB. BMC Bioinf 7:489. https://doi.org/10.1186/1471-2105-7-489
- Sarntivijai S, Lin Y, Xiang Z, Meehan TF, Diehl AD, Vempati UD, Schurer SC, Pang C, Malone J, Parkinson H, Liu Y, Takatsuki T, Saijo K, Masuya H, Nakamura Y, Brush MH, Haendel MA, Zheng J, Stoeckert CJ, Peters B, Mungall CJ, Carey TE, States DJ, Athey BD, He Y (2014) CLO: the cell line ontology. J Biomed Semant 5:37. https://doi.org/10.1186/2041-1480-5-37
- Shapin S, Schaffer S (1985) Leviathan and the air-pump. Princeton University Press, Princeton
- Sloan CA, Chan ET, Davidson JM, Malladi VS, Strattan JS, Hitz BC, Gabdank I, Narayanan AK, Ho M, Lee BT, Rowe LD, Dreszer TR, Roe G, Podduturi NR, Tanaka F, Hong EL, Cherry JM (2016) ENCODE data at the ENCODE portal. Nucleic Acids Res 44(D1):D726–D732. https://doi.org/10.1093/nar/gkv1160
- Taylor CF, Field D, Sansone SA, Aerts J, Apweiler R, Ashburner M, Ball CA, Binz PA, Bogue M, Booth T, Brazma A, Brinkman RR, Michael Clark A, Deutsch EW, Fiehn O, Fostel J, Ghazal P, Gibson F, Gray T, Grimes G, Hancock JM, Hardy NW, Hermjakob H, Julian RK Jr, Kane M, Kettner C, Kinsinger C, Kolker E, Kuiper M, Le Novere N, Leebens-Mack J, Lewis SE, Lord P, Mallon AM, Marthandan N, Masuya H, McNally R, Mehrle A, Morrison N, Orchard S, Quackenbush J, Reecy JM, Robertson DG, Rocca-Serra P, Rodriguez H, Rosenfelder H, Santoyo-Lopez J, Scheuermann RH, Schober D, Smith B, Snape J, Stoeckert CJ Jr, Tipton K, Sterk P, Untergasser A, Vandesompele J, Wiemann S (2008) Promoting coherent minimum reporting guidelines for biological and biomedical investigations: the MIBBI project. Nat Biotechnol 26(8):889–896. https://doi.org/10.1038/nbt.1411
- Tipton KF, Armstrong RN, Bakker BM, Bairoch A, Cornish-Bowden A, Halling PJ, Hofmeyr J-H, Leyh TS, Kettner C, Raushel FM, Rohwer J, Schomburg D, Steinbeck C (2014) Standards for reporting enzyme data: the STRENDA consortium: what it aims to do and why it should be helpful. Perspect Sci 1(1):131–137. https://doi.org/10.1016/j.pisc.2014.02.012
- Vempati UD, Schurer SC (2004) Development and applications of the Bioassay Ontology (BAO) to describe and categorize high-throughput assays. In: Sittampalam GS, Coussens NP, Brimacombe K et al (eds) Assay guidance manual. US National Library of Medicine, Bethesda
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.