The large number of hydrolysate components and their dynamics make it difficult to predict which peptides are relevant to a research question and thus to predefine a strategy for a targeted analysis. Moreover, preconceptions about the possible outcome of a study can lead to a biased setup and to a lack of awareness of how much of the total potential outcome is actually covered. Therefore, we implemented an analytical strategy that first maps data sets by non-targeted LC-MS analysis followed by non-targeted data analysis. Subsequently, targeted data analysis can reveal trends for individual components or groups of components and, finally, targeted analyses can be defined and performed to obtain quantitative data. Below, we present the results of this novel strategy to analyze hydrolysates in vitro and in vivo, applied to collagen hydrolysates, going from non-targeted towards targeted analysis and illustrating the analytical workflow and possible considerations.
Non-targeted data analysis
Start products, TIM dialysates, and serum samples were analyzed by LC-MS as described in the Materials and methods section. Chromatograms were converted to MS spectra. MS data were exported as intensities per nominal m/z value. A solvent blank was used to correct for solvent background signals. Division by the background signal provides smoother data than background signal subtraction. All signals related to solvent will give a value close to 1 and all signals related to the products will provide values > 1. The data used to correct for background signals are referred to as the correction reference in the remaining part of the text, and the outcome of the division of sample data by the correction reference data as signal ratios. An advantage of this approach is that a quick and global view of the similarities and differences between samples is obtained. In addition, the data have semi-quantitative properties (see Table 3). Especially in the cleaner matrices (product and TIM dialysate), the signal ratio sums show an overall good relation with the spiked concentration, except for the TCL sample. For serum, the relation with concentration seems to be less linear, which might be related to effects of the more complex matrix background and thus relatively lower product-related signals. We proceeded with assessing the similarity between the data of all the samples. To illustrate the nature of the sample comparisons, Fig. 1 presents plots of the signal ratios between a different set of samples (tryptic digests of collagen samples), showing the expected decrease in correlation with decreasing similarity. The correlation coefficients between the signal ratios per nominal m/z value, which do not depend on the product concentration, were calculated for each comparison and the individual coefficients were organized in a correlation matrix. In Fig. 2, the correlation coefficients between start product, TIM dialysate, and serum (average of 3 subjects per product) data are reported. All spikes, and especially the high spikes, correlate very well with the corresponding start product, showing good analytical performance in each matrix. TIM dialysate samples did not have a high correlation with the corresponding start products, which is expected due to the processing of components in the simulated gastrointestinal tract. This is especially true for the higher average molecular weight (MW) product C. TIM dialysate of product C exhibits more correlation with the lower average MW products, which indicates that product components become more similar when the average MW becomes lower, as has been discussed in the Introduction and in the ESM. Although there is an underlying matrix background, Fig. 3 illustrates the decrease in average MW by means of the signal ratios between m/z 51 and 500, divided into 23 smaller m/z ranges. Figure 3 clearly shows that the start products have a higher relative intensity at higher m/z values and serum samples at lower m/z values, while TIM dialysate samples are in between. The latter findings confirm that TIM is a suitable in vitro model to predict luminal gastrointestinal processing of ingested compounds in humans. From Fig. 2, it appears that the average MW has a more profound effect on similarity than the source animal (mainly for start product and TIM dialysate). Serum samples are similar to neither start product nor TIM dialysate samples, most probably due to further processing, selective uptake, and serum background. Because the same correction reference (solvent blank) was used to calculate all signal ratios, the comparison of serum samples with the other sample types is suboptimal.
When samples in the same matrix are compared, it is possible to calculate signal ratios using a matrix-specific correction reference, such as blank TIM dialysate for TIM dialysate samples, to improve the comparison quality. For serum samples, which show high biological variation between individuals, over time, and depending on circumstantial factors, it could be justified to use pooled serum or the minimum observed values in a set as the correction reference.
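The signal ratio and correlation steps described above can be sketched in a few lines of Python. This is a minimal illustration and not the processing code used in the study: the function names, the `floor` guard against division by zero, and the intensity values are all assumptions made for the example.

```python
from math import sqrt

def signal_ratios(sample, reference, floor=1.0):
    """Divide sample intensities by correction-reference intensities per nominal m/z.

    `floor` is an assumed guard that keeps the denominator away from zero.
    """
    return {mz: intensity / max(reference.get(mz, 0.0), floor)
            for mz, intensity in sample.items()}

def pearson(x, y):
    """Pearson correlation coefficient of two equally long, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_matrix(ratio_sets):
    """Correlate every pair of samples over the nominal m/z values they share."""
    names = list(ratio_sets)
    shared = sorted(set.intersection(*(set(r) for r in ratio_sets.values())))
    vectors = {n: [ratio_sets[n][mz] for mz in shared] for n in names}
    return {a: {b: pearson(vectors[a], vectors[b]) for b in names} for a in names}

# Hypothetical intensities per nominal m/z (not data from the study)
blank = {244: 100.0, 260: 100.0, 286: 100.0, 357: 100.0}
product = {244: 300.0, 260: 150.0, 286: 500.0, 357: 900.0}
# Same product at half the concentration: only the product-related part scales
half = {mz: blank[mz] + 0.5 * (product[mz] - blank[mz]) for mz in product}

ratios = {"product": signal_ratios(product, blank),
          "half": signal_ratios(half, blank)}
matrix = correlation_matrix(ratios)
print(matrix["product"]["half"])
```

In this toy case the half-concentration ratios are a linear rescaling of the full-concentration ratios, so the coefficient equals 1 within floating-point error, which is the sense in which the correlation does not depend on product concentration. Swapping `blank` for a matrix-specific reference (blank TIM dialysate, pooled serum) only changes the second argument of `signal_ratios`.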
Targeted data analysis
Quantitative LC-MS measurements and unambiguous structure confirmation often require either suitable internal standardization, external standardization, or standard addition for each analyte of interest. When targeted data analysis is performed on a non-targeted data set, the quantification issue will arise again, because the data are then considered on an individual component basis. An intermediate form of targeted data analysis can be to consider component groups. To be able to relate signals to start product components and express them as product concentration equivalent, samples spiked with product can be analyzed. Spikes at different concentration levels can be used to assess matrix effect and linearity in the spiked concentration range, on an individual component basis, but also on component group basis. Using this approach, it can be investigated whether components are more abundant before or after processing in a semi-quantitative fashion.
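As a rough illustration of how spikes at several levels could be used to check linearity and to express a signal as product concentration equivalent, the sketch below fits an ordinary least-squares line to summed signal ratios. The function names and all numbers are hypothetical; the fitting approach is an assumption for the example and not necessarily the procedure used in this study.

```python
def fit_line(x, y):
    """Ordinary least-squares fit; returns (slope, intercept, r_squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return slope, intercept, 1.0 - ss_res / ss_tot

def concentration_equivalent(signal, slope, intercept):
    """Invert the fitted line to express a signal as spiked-product concentration."""
    return (signal - intercept) / slope

# Hypothetical spike levels (arbitrary concentration units) and the summed
# signal ratios of one component group at each level
spiked = [0.0, 1.0, 2.0, 4.0]
group_signal = [1.1, 2.9, 5.2, 8.8]

slope, intercept, r2 = fit_line(spiked, group_signal)
print(slope, intercept, r2)
print(concentration_equivalent(6.0, slope, intercept))
```

Fitting the same component group in each matrix and comparing the slopes gives a simple, semi-quantitative view of the matrix effect.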
In the data set, we observed components which provided signal in the start product and in the spiked samples but not in the TIM dialysate. These components are most probably not bioaccessible or they are degraded in the TIM system. In addition, there are components which provided a signal in the TIM dialysate but not in the start product. The latter components are most probably formed by digestion in the TIM system and they are bioaccessible. As an example, nominal m/z 357 is considered (see Table 4). The dominant signal in this channel originates from m/z 357.213, which represents the tetrapeptide AGJP (and isomers, where J stands for (iso)leucine). In theory A, P, and J could be removed from AGJP during digestion to form collagen tripeptides. Therefore, nominal m/z 286, 260, and 244, dominated by respective signals at 286.176, 260.161, and 244.129, were also considered. The four products exhibit different component landscapes and it should be noted that m/z 357 is also an intermediate which can be formed from larger peptides. It is assumed that for each start product, the component landscapes merely shift through the presently recorded state and will become more similar regarding MW distributions when they end up in the blood, as illustrated by Fig. 3.
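The m/z values quoted above can be reproduced from standard monoisotopic residue masses. The sketch below computes the singly protonated [M+H]+ value for AGJP and for the three tripeptide compositions left after removing one residue. Assigning the nominal channel by truncation is an assumption made for illustration; the mass depends only on amino acid composition, so sequence isomers share the same value.

```python
# Standard monoisotopic residue masses in Da; J stands for leucine or isoleucine
RESIDUE = {"G": 57.02146, "A": 71.03711, "P": 97.05276, "J": 113.08406}
WATER = 18.01056   # added once per peptide for the free termini
PROTON = 1.00728   # charge carrier for [M+H]+

def mh_plus(sequence):
    """Monoisotopic m/z of the singly protonated peptide."""
    return sum(RESIDUE[aa] for aa in sequence) + WATER + PROTON

for peptide in ("AGJP", "GJP", "AGJ", "AGP"):
    mz = mh_plus(peptide)
    # int() truncation as the nominal channel is an assumption for this example
    print(peptide, round(mz, 3), int(mz))
```

The computed values fall in nominal channels 357, 286, 260, and 244 and agree with the dominant signals reported above to within about 0.001.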
For collagen hydrolysates, peptides containing hydroxyproline are of special interest as hydroxyproline is a characteristic amino acid in collagens and because Hyp-containing di- and tripeptides have been reported to carry specific bioactivity that might relate to health benefits of collagen peptide supplementation. The non-targeted data gave the impression either that short peptides containing hydroxyproline provided a low signal or that hydroxyproline might be predominantly present in longer peptides, because many theoretically abundant short peptides containing hydroxyproline, such as p, pG, Ap, Pp, GAp, GPp, ApG, PpG, pGA, pGE, and pGP, were not observed with high intensity in the non-targeted data. The latter hypothesis (Hyp in longer peptides) is less probable, because we assumed that short peptides containing hydroxyproline should also be formed through the action of proteolytic enzymes and the enzymatic and chemical processing which takes place in the TIM system. Therefore, we decided to perform a targeted analysis. A solution to enhance signals of hydroxyproline(-containing short peptides) is to derivatize with AccQ-Tag. This reagent binds to amine groups of amino acids and short peptides, is easily protonated, and enhances the chromatographic properties.
Total Hyp (after hydrolysis) and the theoretically most abundant Hyp-containing dipeptides from collagen, Pp and pG, were analyzed in a targeted fashion in the serum samples. For Hyp, a bioanalytically acceptable calibration curve was obtained in the 1–200 μg/ml range after correction for the level found in the pooled human serum, meaning that the back-calculated analyte concentrations of 75% of the calibration samples were within ± 15% (± 20% at the lowest calibration level) of the theoretical value. It is important to note that at t = 60 min the mean increase in total Hyp concentration was approximately 14.5 μg/ml (110 nmol/ml). The dipeptides pG and Pp, which could be carriers of Hyp to the blood, were determined at t = 0 and t = 60 min. The mean pG concentration increased from less than 0.04 μg/ml to 0.66 μg/ml after 1 h (increase 3.5 nmol/ml) and the mean Pp concentration increased from 0.23 to 2.3 μg/ml after 1 h (increase 9.1 nmol/ml). It can be concluded that, in this experimental setup, pG and especially Pp contribute significantly as carriers to the total Hyp increase in blood after ingestion of collagen hydrolysate. The values determined for total Hyp, pG, and Pp in this study fit with values reported earlier in the literature [27,28,29,30]. The previously reported values, however, vary to a great extent regarding the ratio between free Hyp and peptide-bound Hyp and the relative concentrations of different Hyp carriers. After ingestion of collagen hydrolysate, Yazaki et al. found mainly the Hyp carrier GPp in murine plasma and mainly Pp in skin. Similarly, Iwai et al. found mainly Pp and lesser amounts of Ap, ApG, PpG, Lp, Ip, and Fp in human serum and plasma. The major component that Ichikawa et al. found in human plasma was Pp and minor components were ApG, SpG, Ap, Fp, Lp, Ip, GPp, and PpG. Finally, Taga et al. particularly found increased XpG (where X represents any amino acid) in murine plasma when they administered a gelatin hydrolysate that was produced using a cysteine-type ginger protease. Differences in the findings of (previous) studies can be the result of (minor) differences in the experimental setup. In the present study, a complete and unique picture was obtained by LC-MS analysis of the dissolved start products, in vitro-generated dialysates (containing the digested components that are potentially available for small intestinal absorption), and human serum collected after product ingestion. We have shown that the four tested collagen hydrolysates exhibit different component landscapes during the course of digestion and absorption. There are many possible Hyp carriers: several tripeptides and other dipeptides. The abundance of particular carriers will depend on the extent of hydrolysis of the start product and the subsequent in vivo processing, e.g., the chemical and enzymatic processing which takes place in the gastric and intestinal compartments, as shown by our in vitro-generated TIM data. Finally, brush border enzyme activity and selective transfer to the blood compartment will also play a role. Targeted analysis can be very helpful to investigate the concentration of Hyp carriers. When there is a limited number of analytes to be determined, acceptable standardization can be achieved through synthesis of proper (internal) standards at acceptable costs.
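The molar values given for total Hyp, pG, and Pp follow from the mass concentrations and the molar masses of the three analytes. The sketch below reproduces that conversion and estimates which share of the total Hyp increase the two dipeptides carry, given that each contains one Hyp residue. The molar masses are average values (Hyp 131.13, Hyp-Gly 188.18, Pro-Hyp 228.25 g/mol), and treating the pG baseline as zero is an assumption that matches the reported 3.5 nmol/ml.

```python
# Average molar masses in g/mol: hydroxyproline, Hyp-Gly (pG), Pro-Hyp (Pp)
MOLAR_MASS = {"Hyp": 131.13, "pG": 188.18, "Pp": 228.25}

def to_nmol_per_ml(ug_per_ml, molar_mass):
    """Convert a mass concentration in ug/ml to nmol/ml."""
    return ug_per_ml / molar_mass * 1000.0

# Mean increases between t = 0 and t = 60 min, taken from the text
total_hyp = to_nmol_per_ml(14.5, MOLAR_MASS["Hyp"])
pg = to_nmol_per_ml(0.66, MOLAR_MASS["pG"])        # baseline < 0.04 ug/ml taken as zero (assumption)
pp = to_nmol_per_ml(2.3 - 0.23, MOLAR_MASS["Pp"])

# Each dipeptide carries one Hyp, so molar increases can be compared directly
carrier_fraction = (pg + pp) / total_hyp
print(round(total_hyp, 1), round(pg, 1), round(pp, 1), round(carrier_fraction, 2))
```

With these rounded means, pG and Pp together account for on the order of 10% of the total Hyp increase on a molar basis, with Pp contributing the larger part.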