The current state of the art of data acquisition in the laboratory is very diverse. Many different devices are used, analogue as well as digital ones. Images are recorded and human observations are made (see Fig. 1). Usually, all experimental setups and observations are summarized in a handwritten lab notebook, regardless of whether they come from digital or analogue sources. Even plotted results, such as chromatograms or spectra, are printed out and glued into such notebooks. The big advantage: all information and data from these widespread sources are merged in one place. All the different formats from all the heterogeneous systems are homogenized, albeit in a very analogue way. This transformation is error-prone, time-consuming and entails a considerable time delay. Additionally, such lab notebooks are readable and reusable only by their owner; in most cases, other people are not able to extract any information from them within an acceptable time.
A common way to save experimental data alongside the lab notebook is on a hard disc or USB drive. This is a temporary solution and not safe, so many researchers have switched to cloud servers for data storage. This simplifies data sharing with colleagues inside and outside the company, and most cloud services support access authorization. Without an agreement on the structure of the data or further comments on their context, however, cloud servers are of little use to research groups, especially given the exponential growth of data volumes. It is questionable whether this is sufficient for digitalization and big data analysis without further agreements.
To change the current and common way of laboratory data acquisition into a digital and modern one, electronic lab notebooks (ELNs) can be used [1] (see Fig. 2). With ELNs, you can plan experiments, document all device setups, save digital data associated with the experiment and add analogue measurements or human observations manually. A systematic, structured and self-explanatory experimental design is saved together with all necessary information about the experiment, as sketched below.
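As a purely illustrative sketch, not tied to any particular ELN product and with hypothetical field names, such a structured experiment record could be represented as follows:

```python
# Purely illustrative ELN experiment record; all field names are hypothetical.
experiment = {
    "title": "Calibration of a UV/Vis detector",
    "plan": "Record absorbance of five caffeine standards at 273 nm.",
    "devices": [  # documented device setups
        {"name": "UV/Vis spectrometer", "settings": {"wavelength_nm": 273}},
    ],
    "digital_data": ["2024-05-13_caffeine_calibration.csv"],  # linked raw files
    "manual_observations": "Slight turbidity observed in standard no. 4.",
}
```

The point of such a record is not the concrete fields but the principle: plan, setup, digital data and manual observations live together in one structured, searchable entry instead of being scattered across paper and file servers.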
Different initiatives on laboratory automation, such as the SiLA (Standardization in Lab Automation) consortium [2], have focused on the connection between sample processing devices and a software system for automation, as described by Gauglitz [3]. Additionally, considerable research and development has taken place in the area of smart laboratories [4]. The intelligent laboratory of the future is fully digitalized and uses augmented reality and modern human-computer interaction. Its facilities and devices are modular, with integrated functions for flexible and individual use. Until these techniques find their way into common practice, everyday work should be prepared for the future.
For example, by using ELNs, it is easy to share data within defined groups, and the data remain reusable by others, not only by the experimenter. To increase the reusability of data, it is helpful to annotate all experiments and the associated data with metadata. Metadata should include descriptive information about the context, quality and condition, or characteristics of the data. To this end, metadata are differentiated into four classes: descriptive metadata gives relevant information about the data; structural metadata shows all relationships; technical metadata provides information about the setup or analysis; and administrative metadata gives information about the author, date and confidentiality.
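A minimal sketch of what a metadata record covering these four classes might look like; all field names are hypothetical and serve only to illustrate the classification:

```python
# Hypothetical metadata record illustrating the four classes described above.
metadata = {
    "descriptive": {   # what the data are about
        "title": "HPLC chromatogram of a caffeine standard",
        "keywords": ["HPLC", "caffeine", "calibration"],
    },
    "structural": {    # relationships to other data
        "part_of": "experiment-042",
        "derived_from": "raw-run-017.csv",
    },
    "technical": {     # setup or analysis details
        "instrument": "HPLC system",
        "column_temperature_C": 30,
    },
    "administrative": {  # author, date and confidentiality
        "author": "J. Doe",
        "date": "2024-05-13",
        "confidentiality": "internal",
    },
}
```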
The information required in metadata differs depending on the data level: for raw data, different information should be given than for data sets, and analysed data require different information than published data. For example, raw data information may include the experimental protocols, the manufacturer and sensor that created the data or the species used, whereas analysed data are described by workflows, algorithms, programs and so on. Up to now, only a few common standards for metadata have been defined, mainly for distinct data types. In the area of analytical and bioanalytical chemistry, some examples of standardized data publishing exist. For chemical structures, certain formats are defined, together with kinetic information [5]. Mass spectrometry data and the corresponding analyses are defined and saved together with structural information in public databases [6]. For enzyme kinetic [7] or glycomic [8] information, standards will soon be defined by the Beilstein Institute. All these databases are specialized for distinct data types or focused on certain points of view, and different databases use different file formats or metadata. This inhomogeneity is exacerbated by the lack of standards for processed data, including a description of the workflow that led to the data. More general solutions are needed if the future requirements of research funding agencies or (open access) journals are to be fulfilled.
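To make the difference between data levels concrete, the following sketch contrasts hypothetical metadata for raw and for analysed data; the fields mirror the examples above but are not taken from any existing standard:

```python
# Sketch: metadata requirements differ by data level (illustrative fields only).
raw_data_metadata = {
    "protocol": "SOP-12 extraction, rev. 3",
    "manufacturer": "instrument vendor",
    "sensor": "diode array detector",
    "species": "Coffea arabica",
}
analysed_data_metadata = {
    "workflow": ["baseline correction", "peak integration", "calibration"],
    "algorithm": "trapezoidal peak integration",
    "program": "in-house analysis script v1.2",
    "input": "raw-run-017.csv",  # link back to the raw data it was derived from
}
```

Note that the analysed-data record carries the provenance missing from today's databases: which workflow, algorithm and program turned which raw file into the result.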
The analytical community must define these standards for its purposes: which information is obligatory and which is optional. A defined format and protocol must be set up, as well as a platform to generate, read and register metadata. All analytical chemists together must go down this long road to be prepared for the future requirements of science, such as open access, digitalization and big data analysis.
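As a rough sketch of what such a community-defined standard could enable, the following hypothetical check distinguishes obligatory from optional fields; the field names and the validation function are assumptions for illustration only:

```python
# Hypothetical community standard: obligatory vs. optional metadata fields.
OBLIGATORY = {"author", "date", "instrument", "protocol"}
OPTIONAL = {"keywords", "confidentiality", "workflow"}

def validate(metadata: dict) -> list[str]:
    """Return a list of problems; an empty list means the record conforms."""
    problems = [f"missing obligatory field: {f}"
                for f in sorted(OBLIGATORY - metadata.keys())]
    problems += [f"unknown field: {f}"
                 for f in sorted(metadata.keys() - OBLIGATORY - OPTIONAL)]
    return problems

print(validate({"author": "J. Doe", "date": "2024-05-13"}))
# ['missing obligatory field: instrument', 'missing obligatory field: protocol']
```

Once such a standard is agreed on, checks like this one could run automatically on a registration platform, so that incomplete records are caught at submission time rather than discovered years later by a reuser.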