Introduction

Colorectal cancer (CRC) and inflammatory bowel diseases (IBD) exhibit shared pathogenetic mechanisms, including chronic inflammation and altered microbiome-mucosa interaction, while lacking sensitive and specific diagnostic biomarkers. Patients with IBD appear to be at notably increased risk of CRC, estimated at 1%, 2% and 5% at 10, 20, and more than 20 years after the initial IBD diagnosis, respectively1. This association is not unexpected, given the persistent hallmarks of cancer within the inflamed intestinal mucosa of IBD patients. These hallmarks involve chronic activation of pro-inflammatory signaling pathways, such as NF-kB, with increased pro-inflammatory and decreased anti-inflammatory cytokines, alterations in immune cell response, including expansion of immunosuppressive T regulatory cells in the inflamed compared to non-inflamed mucosa, and effects of the altered microbiome, alongside changes in apoptotic and autophagic signaling pathways2.

CRC ranks as the third most common cancer in industrialized countries and the fifth in terms of incidence and mortality worldwide3,4. Despite decreasing trends observed in industrialized nations due to screening and improved treatment, developing countries are experiencing a rapid rise attributed to dietary and lifestyle changes5. Annual CRC screening using fecal occult blood (FOB) testing in average-risk asymptomatic subjects aged 50–75 years is recommended for reducing mortality6. However, FOB sensitivity and specificity remain suboptimal, even with the more advanced fecal immunochemical test (FIT), which outperforms classical guaiac-based testing (gFOBT)7,8. Enhanced testing methods are crucial for preventing delayed diagnosis and unnecessary endoscopy, and for improving clinical care and resource conservation.

Regular endoscopic surveillance in IBD may facilitate early CRC detection, but poor patient compliance can reduce the rate of early CRC diagnosis1,9. Diagnosing IBD at onset is challenging because its non-specific symptoms often resemble those of functional intestinal disorders, which hinders an effective diagnostic work-up with imaging technology, endoscopy and histology. Although fecal calprotectin might support rule-out and rule-in strategies for further investigations in patients with suspected IBD, its sensitivity and specificity are limited, reaching at most 80%10. More sensitive and specific IBD and CRC biomarkers are under evaluation, primarily through genetic, epigenetic, metabolomic, proteomic, peptidomic and metagenomic studies11,12,13.

Although serum is the preferred biological matrix, stools and saliva are also investigated. Saliva components not only reflect oral health, but may also mirror the presence of disease, including cancer, at distant sites14,15,16. Moreover, saliva's non-invasive collection makes it ideal for repeat sampling, especially in vulnerable populations, such as children and the elderly.

This study aims to assess the proteomic and peptidomic profiles of saliva and stool to identify potential CRC and IBD biomarkers. In the stools of IBD patients, peptides derived from Basic salivary proline-rich protein 1 (PRB1, UniProt: P04280) were identified. These peptides were found to influence proliferation signaling pathways of colorectal cancer cells, highlighting the potential role of salivary proteins in CRC and IBD pathogenesis and diagnosis.

Results

Salivary peptidomic analysis

Peptides in biological fluids can be rapidly and reliably identified by high throughput MALDI-TOF/MS instruments, based on laser radiation and soft ionization of crystallized molecules on a plate. Since the MALDI process generates almost exclusively singly charged molecules (features), their mass-to-charge ratio (m/z) corresponds to the mass. MALDI-TOF/MS was employed to analyze saliva samples, and a total of 668 salivary features were identified, of which 74 were significantly different between patient and control groups (Supplementary Table 1). Representative MALDI-TOF/MS spectra found in healthy controls (CS), Crohn’s disease (CD), Ulcerative colitis (UC) and CRC are shown in Supplementary Fig. 1. These spectra depict identified peptides within the mass range of 1000 to 4000 m/z. Each peptide appears as a feature that falls on the x-axis according to its unique m/z value, while the height or intensity of each feature on the y-axis corresponds to its relative abundance. The frequency of the 18 most significant (p < 0.0005) features in the different groups is shown in Fig. 1. Among these features, fourteen were observed with greater frequency in patients than in CS, while the remaining four showed the opposite tendency. Notably, no specific CRC-associated features were identified, whereas peptides 1096 m/z, 1280 m/z, 1480 m/z and 1612 m/z, which were markedly rare in CS, demonstrated apparent correlations with IBD.

Figure 1

Most significant saliva features (p < 0.0005). The frequencies of positive findings among CRC and IBD patients in comparison with controls (CS) are shown. Panels (A, B, C) represent features found more frequently in CRC and IBD patients with respect to controls. Panel (D) represents features found more frequently in controls with respect to CRC and IBD patients.

These data highlight the marked enrichment of salivary peptides, which differ in their profiles primarily between IBD and CS cohorts. It should be noted, however, that MALDI-TOF/MS analysis does not make it possible to discern whether the identified peptides originate from human or non-human sources, particularly the oral microbiome. To focus solely on human-derived proteins, a subsequent analysis employing LTQ-Orbitrap/MS was conducted.

Salivary proteomic analysis

The salivary proteomic analysis of tryptic digested peptides was performed by LTQ-Orbitrap/MS, which returns their amino acid sequences. These sequences are then matched bioinformatically against the known human proteins. The more peptides that match and cover multiple stretches of a single protein's sequence, the more reliable its identification. The LTQ-Orbitrap/MS proteomic analysis of the CS, CD, UC and CRC saliva pools allowed the identification of a total of 225 proteins. To enhance robustness, we selected those proteins identified by means of at least four peptides. This left 152 proteins (Supplementary Table 2), among which 73 were commonly found in CD, UC and CRC patients, but absent in CS. A high percentage (64%, 47/73) of these intestinal disease-associated proteins were not found in our previous salivary analysis of controls and SARS-CoV-2 infected subjects18. These 73 proteins were analyzed by g-Profiler software (p-value < 0.0001) to identify enriched GO-Terms corresponding to Molecular Function (GO:MF), Biological Process (GO:BP) and Cellular Components (GO:CC) (Supplementary Fig. 2). Molecular Functions were mainly related to cell adhesion and cadherin binding, and to inhibitor/regulator activity of peptidases and endopeptidases. The enriched Cellular Components included exosomes, vesicles, membrane-bounded organelles and anchoring junctions. Enriched Biological Processes were mainly related to negative regulation of enzymatic activity and to glucose and nucleic acid metabolism.

To identify which of the remaining 79/152 proteins were differently associated with diseases, the abundance ratios found in UC, CD and CRC with respect to controls were calculated and subdivided into five categories according to the following percentile ranges: (1) below 50th (median); (2) 51st–75th; (3) 76th–90th; (4) 91st–99th; (5) above 99th percentile. Forty-six proteins were classified in the same category in UC, CD and CRC. By contrast, 33 proteins belonged to different categories in different diseases, as detailed in Table 1. Seven proteins were highly abundant in saliva of both CRC and IBD patients, 11 proteins were mainly increased in CRC and 14 in CD patients’ saliva. One protein, Cystatin-SA, increased mainly in the two IBD types, CD and UC.
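The percentile-based categorization described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the rank-based percentile definition (share of values less than or equal to a given ratio) and the handling of boundary values are assumptions, and any input ratios are hypothetical.

```python
from bisect import bisect_right

def percentile_rank(value, sorted_values):
    """Percentage of values that are less than or equal to `value`."""
    return 100.0 * bisect_right(sorted_values, value) / len(sorted_values)

def category(rank):
    """Map a percentile rank to one of the five categories used in the text."""
    if rank <= 50:
        return 1  # up to the median
    if rank <= 75:
        return 2  # 51st-75th
    if rank <= 90:
        return 3  # 76th-90th
    if rank <= 99:
        return 4  # 91st-99th
    return 5      # above the 99th percentile

def categorize(ratios):
    """ratios: {protein: disease/CS abundance ratio} -> {protein: category}."""
    ordered = sorted(ratios.values())
    return {p: category(percentile_rank(r, ordered)) for p, r in ratios.items()}

def discordant(cat_uc, cat_cd, cat_crc):
    """Proteins whose category is not the same in all three diseases."""
    return {p for p in cat_uc if len({cat_uc[p], cat_cd[p], cat_crc[p]}) > 1}
```

Applying `categorize` separately to the UC/CS, CD/CS and CRC/CS ratios and then calling `discordant` would separate proteins that share one category across diseases from those that do not.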

Table 1 List of 33 salivary proteins with differences in abundance in UC, CD and CRC with respect to CS. The abundance ratios (UC/CS, CD/CS and CRC/CS) are reported. The DAVID (Database for Annotation, Visualization and Integrated Discovery) functional annotation tool was used to define the disease annotation category DisGeNET and/or GAD (Gene-Disease Associations) classification.

To identify whether the disease-associated proteins exhibit specific biological alterations, we selected those with an abundance in CRC and CD of at least five-fold with respect to controls (CRC/CS ≥ 5 and CD/CS ≥ 5) and verified the biological process they clustered in using DAVID tool (Fig. 2). For UC, only five proteins were at least five-fold higher than in controls and no cluster for them was identified.

Figure 2

Clusters of biological processes in proteins highly represented in CRC with respect to controls (Left) and highly represented in CD with respect to controls (Right), using DAVID tool.

Overall, these data provide evidence that CD, more than UC and CRC, alters the salivary proteome in agreement with the assumption that this disease is not confined to the intestine. Moreover CD, more than UC, shares with CRC overexpressed salivary proteins involved in critical carcinogenetic pathways, supporting the link between chronic inflammatory diseases and cancer.

Faecal endogenous peptides analysis

To identify potential IBD biomarkers able to distinguish CD from UC, stool endogenous peptides were analyzed by LTQ-Orbitrap LC–MS/MS. The analyses were made using the CS, CD and UC stool pools prepared using individual samples from the second cohort that appeared more enriched in protein bands at SDS-PAGE electrophoresis. The analyzed peptides were endogenous because stool pools were analyzed without previous tryptic digestion and were of human origin, since the bioinformatic analysis was carried out by referring to the human proteome with the exclusion of all non-human sequences.

This analysis revealed the presence of peptides belonging to 30 different proteins. Twenty-six proteins were selected on the basis of a number of matching peptides ≥ 4 (Supplementary Table 3). Four of the 26 proteins were detected in CD stool only and two of them were of salivary origin: Basic Salivary Proline-rich Protein 1, Salivary Acidic Proline-rich Phosphoprotein 1/2 (Fragment), Carboxypeptidase A1 and Immunoglobulin kappa constant. We focused on Basic Salivary Proline-Rich Proteins (PRBs) for which four isoforms are recognized (PRB1, UniProt: P04280; PRB2, UniProt: P02812; PRB3, UniProt: Q04118; PRB4, UniProt: P10163). In both CS and patients’ saliva pools the PRB2 isoform was identified. PRB2 was not included among the detected salivary proteins (Supplementary Table 2) because only two instead of four trypsin digested peptides were identified, despite the high Peptide Spectrum Match value of 85. One of the two PRB2 trypsin digested peptides found in saliva matched exactly with the GQ-15 and partially with the GG-17 unique endogenous peptides belonging to the PRBs family found in stool (Table 2).

Table 2 Basic salivary proline-rich protein 2 (PRB2, UniProt: P02812) unique peptide 2 identified in saliva and stool.

GQ-15 (1495.66 m/z) and GG-17 (1578.75 m/z) were synthesized to enable in vitro studies to verify whether they affect cell proliferation and the related signaling pathways Akt, Erk 1/2 and p38, using the colorectal cancer cell lines HT-29 and HCT-116. HT-29 and HCT-116 cells were first exposed to two concentrations (0.1 and 100 nM) of GQ-15 and GG-17 to assess potential dose-dependent effects on cell growth. The percentage of cells/mL counted after 48 h of stimulation relative to the number of unexposed cells/mL counted at 24 h of culture is shown in Fig. 3. Data were compared with epidermal growth factor (EGF) stimulation. Notably, the two cell lines differed in sensitivity: while HCT-116 cells displayed no significant response, HT-29 cells exhibited significant dose-dependent stimulation of cell growth, particularly evident with GG-17. Subsequent experiments were performed using the higher peptide concentration of 100 nM. GG-17 emerged as a powerful stimulant of HT-29 proliferation, being even more effective than EGF. This pro-proliferative effect of GG-17 was observed both when the peptide was added alone (Fig. 4A) and in combination with EGF (Fig. 4B). Conversely, the proliferation of the HCT-116 cell line remained unaffected by EGF, GQ-15, or GG-17 (Fig. 4C and 4D).

Figure 3

Dose-dependent proliferation experiments with HT-29 (A and B) and HCT-116 (C and D) cell lines. Cells were stimulated with EGF 100 ng/mL, GQ-15 at 0.1 nM and 100 nM, and GG-17 at 0.1 nM and 100 nM synthetic peptides both added alone (panels A and C) or combined (panels B and D) and counted after 24, 48 and 72 h. The percentage of cells/mL counted after 48 h of stimulation relative to the number of unexposed cells/mL counted at 24 h of culture is shown. One-way ANOVA: Panel A, F = 5.163, p = 0.010; Panel B, F = 0.840, p = 0.484; Panel C, F = 0.642, p = 0.597; Panel D, F = 1.441, p = 0.258. * = p < 0.05 by Tukey’s multiple comparison test.

Figure 4

Proliferation experiments with HT-29 (A and B) and HCT-116 (C and D) human colorectal cancer cell lines. Cells were stimulated with EGF 100 ng/mL, GQ-15 100 nM and GG-17 100 nM synthetic peptides both added alone (panels A and C) or combined (Panels B and D) and counted after 24, 48 and 72 h. Repeated measures analysis of variance: (A) time F = 51.63, p < 0.0001; treatment: F = 1.54, p = 0.221; interaction F = 1.63, p = 0.115. (B) Time F = 66.33, p < 0.0001; treatment: F = 2.16, p = 0.109; interaction F = 2.22, p = 0.026. (C) Time F = 141.60, p < 0.0001; treatment: F = 0.042, p = 0.988; interaction F = 0.297, p = 0.974. (D) Time F = 126.80, p < 0.0001; treatment: F = 0.088, p = 0.966; interaction F = 0.311, p = 0.970. Tukey’s multiple comparison test: * p < 0.001 with respect to control (panel A and B) and p < 0.05 with respect to GQ-15 (panel A) or EGF + GQ-15 (panel B) at 48 h. Each point and line report mean values of ten replicated measures. SD for graphical clarity is not shown.

To further elucidate the underlying mechanisms, the RAS/RAF/MEK/ERK (MAPK) and PI3K/AKT/mTOR signaling pathways were evaluated via Western blot analysis of phosphorylated Erk 1/2, p38 and Akt of HT-29 and HCT-116 cells treated or untreated with GQ-15 or GG-17 peptides (Fig. 5). In HT-29 cells, both GQ-15 and GG-17, similarly to EGF, induced Erk 1/2 phosphorylation. The p38 phosphorylation was induced by the GG-17 peptide, not GQ-15, and by EGF alone or combined. Akt phosphorylation remained unaffected by the two peptides at both phosphorylation sites, whereas EGF treatment caused a reduced Akt308 phosphorylation. In HCT-116 cells GQ-15 induced Erk 1/2 and p38, while it reduced Akt phosphorylation; in this cell line GG-17 exerted an opposite behavior, inducing Akt but not Erk 1/2 and p38 phosphorylation. Co-stimulation with EGF modulated the signaling effects of both GQ-15 and GG-17 peptides, reducing the effect of GQ-15 while enhancing those of GG-17 on Erk 1/2, and amplifying the effect of GQ-15 on Akt.

Figure 5

Western blot analyses of HT-29 and HCT-116 cell lines. Both cell lines were stimulated with EGF 100 ng/mL, GQ-15 100 nM, GQ-15 100 nM + EGF 100 ng/mL, GG-17 100 nM, GG-17 100 nM + EGF 100 ng/mL. Proliferation-related targets Akt, Erk 1/2 and p38 are shown. The results of any target were cropped from different blots. Full-length blots are included in the Supplementary Information file.

Peptides derived from salivary PRBs detectable in stool are biologically active, able to stimulate colorectal cancer cell growth and the key pro-oncogenic signaling pathways, namely RAS/RAF/MEK/ERK (MAPK) and PI3K/AKT/mTOR.

Discussion

The primary objective of this study was to identify novel non-invasive biomarkers potentially applicable to the diagnosis of CRC and IBD by analyzing salivary and fecal proteins/peptides. Fecal samples are already used as a diagnostic tool for CRC and IBD via fecal occult blood and calprotectin assays. Saliva was selected as a potential source for biomarker exploration because its collection is non-invasive and can be performed by the patients themselves, thereby fostering compliance19. Moreover, in contrast to serum or plasma, saliva has a lower concentration of abundant proteins, such as albumin and immunoglobulins, that might mask less abundant biomarkers, which facilitates proteomic analysis20. Saliva has been extensively investigated in various cancer contexts, including breast cancer, culminating in the concept of “salivaomics” encompassing proteomics, metabolomics, and transcriptomics, with promising outcomes19. Saliva, being continuously generated and traversing the entire gastrointestinal tract21, might harbor components pertinent to the initiation or progression of gastrointestinal tract inflammation and/or cancer.

In the present study, a large number of salivary peptides were identified by MALDI-TOF/MS in both patients and controls, in agreement with findings previously reported by Tong et al.22. Nevertheless, within these highly enriched peptide profiles, no feature specifically associated with CRC was found.

Proteomic analysis by LTQ-Orbitrap/MS successfully identified a high number of human proteins in saliva, as expected20. Non-human proteins, from the microbiome, were excluded from the bioinformatic analysis. A high number of proteins (n = 73) were correlated with colorectal diseases, being absent in controls, but present in IBD and CRC saliva. This correlation was further supported by the noteworthy observation that 64% (47/73) of these proteins were absent in saliva from a cohort of previously investigated patients devoid of intestinal diseases18. Notable proteins linked to colorectal diseases encompassed Annexins, Heat shock proteins, Histones and glycolytic enzymes. These findings are congruent with the documented association with glycolytic process, stress response, and cellular metabolism, as elucidated through the GO-Terms analysis of the biological processes (see Suppl. Figure 2), and further supported by the fact that aerobic glycolysis is a hallmark of cancer23. Among the identified salivary proteins, a subset of 79, found in both patients and controls, varied in abundance. Remarkably, within this subset, certain proteins exhibited significantly elevated levels in CRC cases. Notably, some of these proteins, including BPI-fold containing family B member 2 (BPIFB2; UniProt: Q8N4F0) and Kaliocin-1 (Fragment, UniProt: E7EQB2), have been previously implicated in CRC. Specifically, the BPIFB2 protein, highly enriched in salivary glands, is expressed across different tumor types, including colorectal cancer (www.proteinatlas.org/ENSG00000078898-PIFB2/pathology/colorectal+cancer), and Kaliocin-1, also known as Lactotransferrin-1 (LTF), produced by exocrine glands, exerts an anti-bacterial and anti-cancer activity through cell cycle arrest, induction of apoptosis, immune modulation and inhibition of migration and invasiveness24.
Overall, the analysis revealed that proteins associated with colorectal cancer predominantly clustered within processes involved in DNA stability maintenance. However, this main cluster was not CRC exclusive, as it also overlapped with the main protein cluster associated with CD. Additionally, minor clusters of inflammation, innate and adaptive immunity exhibited shared characteristics between CRC and CD. These results imply shared biological alterations, mainly inflammation, immunity, and control of DNA replication, between cancer and chronic inflammatory disorders. Such findings are in line with those reported by Zheng et al.25, who analyzed salivary exosomal proteins in IBD and identified altered processes encompassing inflammation, innate and adaptive immune response, DNA binding and proteasome activity when compared to controls. Two CD-associated proteins identified in this study, Peroxiredoxin-1 and Peroxiredoxin-2, appear of interest for their primary involvement in oxidation processes. Differential expression of proteins involved in oxidation/reduction has been documented among IBD patients, including an upregulation of Peroxiredoxin-2 in blood mononuclear cells, which implies a potential migration of these cells to the oral mucosa, as indicated by Hatsugai et al.26. Moreover, heightened antioxidant activity alongside increased oxidative damage to proteins and lipids has been described in the saliva of patients with BRCA1 mutations and breast cancer, suggesting a potential association between altered oxidative-antioxidative balance and oral manifestations27.

Furthermore, a comprehensive assessment of proline-rich proteins (PRPs), constituting the primary components of human saliva28, was undertaken. This evaluation was prompted by the identification of endogenous peptides derived from basic salivary proline-rich protein 1 (PRB1) in stool samples from CD patients. The presence of PRB1 peptides in stool may stem from salivary protein fragmentation during transit through the gastrointestinal tract. However, an alternative possibility is that these peptides originate directly from saliva and traverse the gastrointestinal tract without undergoing further degradation. This hypothesis is supported by the observation that proteolysis of PRPs can occur during or prior to granule maturation, implying that the resulting peptides are inherent constituents of saliva29. An increase in PRPs-derived peptides, observed in the saliva of patients with different diseases such as asthma30, appears to exert significant biological effects. Pinto da Costa et al. demonstrated that a PRPs-derived peptide (GPPPQGGRPQG) compromises the stabilization of adherens junctions, such as E-cadherin, thereby inducing apoptosis31. Interestingly, this peptide sequence shares its first seven amino acids with the GG-17 peptide identified by us in the stool of CD patients. The behavior of PRPs in the context of CRC and IBD in saliva was further investigated. Basic salivary proline-rich protein 2 was found to be less abundant in IBD and CRC patients than in controls, suggesting an elevated level of degradation and peptide release within the oral cavity of patients with intestinal diseases. In agreement, lower levels of a salivary peptide at 1226.7 m/z from PRPs have been described in patients undergoing hemodialysis for chronic kidney disease compared to controls22.

To assess the potential biological effects of salivary-derived peptides, a series of in vitro studies were conducted by stimulating two human colorectal cancer cell lines with the PRBs-derived peptides, specifically GQ-15 and GG-17. Notably, the HT-29 cell line exhibited heightened responsiveness to GG-17 stimulation, which elicited dose-dependent cell proliferation. Furthermore, GG-17 induced phosphorylation of Erk 1/2 and p38, critical downstream targets of the pro-oncogenic RAS/RAF/MEK/ERK (MAPK) signaling pathway. GQ-15, similarly to GG-17 and EGF, triggered Erk 1/2, but not p38, phosphorylation, yet it did not stimulate cell growth. In contrast, the response of HCT-116 cells to PRPs-derived peptides and EGF stimulation differed significantly from that of HT-29. Proliferation remained unaffected by any molecule or combination thereof. While GQ-15, mainly if combined with EGF, induced Erk 1/2 and p38 phosphorylation, GG-17 solely activated Erk 1/2 without impacting p38. In addition, PRBs peptides induced Akt phosphorylation at both 308 and 473 sites in HCT-116 cells, particularly in combination with EGF. The different behaviors observed in these two cell lines might depend on their distinct mutational statuses32: HT-29 is wild-type for KRAS and harbors the p.R273H p53 mutation, while HCT-116 carries the KRAS codon 13 mutation and is wild-type for p53. Furthermore, the different sensitivity of these two cell lines to EGF stimulation might also be influenced by their differential expression levels of the EGF receptor, reported to be lower in HCT-116 compared to HT-29 cells33.

These findings indicate that PRBs naturally derived salivary peptides have the potential to traverse the intestinal mucosa and stimulate cancer cell growth by activating pro-proliferative signaling pathways.

In conclusion, our examination of the salivary proteome unveiled a cluster of proteins shared between CRC and IBD, involved in DNA stability maintenance and in innate and adaptive immunity. Notably, reduced PRPs saliva levels in CRC and IBD coincide with the enrichment of two PRB-derived peptides in CD stool. These peptides stimulate in vitro CRC cell proliferation and activate signaling pathways involving Erk1/2, Akt and p38, thus suggesting a potential involvement of PRPs in the pathogenesis of IBD and cancer.

Methods

Studied cohorts

Two different cohorts of subjects were enrolled. The first cohort contributed saliva samples for peptidomic and proteomic analysis, while the second cohort provided stool samples for proteomic examination.

The first cohort included: 20 healthy controls (CS, 10 females, 10 males; mean age ± SD: 40.0 ± 10.2 years), 25 IBD (12 Crohn’s Disease, CD; 13 Ulcerative Colitis, UC; 10 females, 15 males; mean age ± SD: 47 ± 12.9 years) and 37 CRC (20 females and 17 males; mean age ± SD: 66.0 ± 14.7 years). The second cohort included: 51 CS (19 females, 32 males; mean age ± SD: 50.0 ± 8.7 years) and 151 IBD (94 CD and 57 UC; 61 females, 90 males; mean age ± SD: 48.1 ± 15.8 years).

CS were healthcare personnel undergoing routine visits, without any sign of intestinal symptoms or a history of intestinal disease. CRC and IBD patients attended the Surgical Clinic and the Gastroenterology Unit of the University-Hospital of Padova.

This work was approved by our local ethics committee (University-Hospital of Padova, Prot. N. 3756/AO/16) and a fully informed consent was obtained in writing from all participants. The procedures used in this study adhere to the tenets of the Declaration of Helsinki. All methods took place in accordance with relevant guidelines and regulations.

Saliva and stool collection and processing

Saliva samples were collected in the morning before breakfast using Salivette® (Sarstedt AG & Co., Germany), centrifuged for 5 min at 4500 rpm and stored at -80 °C. Fecal samples, weighing 100 mg, were collected from the first bowel movement of the day, diluted in 100 µL of ultrapure water, centrifuged for 15 min at 20,000 g, and the supernatants stored at -80 °C17.

Saliva MALDI-TOF/MS peptidomic analysis

Two hundred µL of saliva were prepared as detailed in Supplementary methods to perform peptidomic MALDI-TOF/MS analyses on a 4800 Plus MALDI-TOF/TOF Analyzer (AB Sciex, Canada) in reflector mode. In MALDI-TOF/MS, crystallized samples are irradiated by a pulsed laser beam that induces soft ionization of the molecules. The generated ions are accelerated into the flight tube under vacuum to reach the detector. The time needed to pass through the flight tube (time of flight) increases with the mass-to-charge ratio of the ionized molecules, so that lighter ions reach the detector first; the generated ion signals are expressed as mass-to-charge ratio (m/z), with intensity correlated with abundance. Since the generated ions are primarily singly protonated molecules, multiply charged molecules being very low in abundance, the m/z features of MALDI-TOF/MS spectra can be regarded as molecular weights34.
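The dependence of flight time on m/z can be illustrated with the textbook relation for an idealized linear drift tube, t = L·sqrt(m / (2·z·e·U)), obtained by equating the kinetic energy of the ion with z·e·U. The sketch below is only an illustration under that assumption: the drift length and accelerating voltage are hypothetical defaults, not the settings of the instrument used here, and reflector optics are ignored.

```python
from math import sqrt

ELEMENTARY_CHARGE = 1.602176634e-19  # coulomb
DALTON = 1.66053906660e-27           # kilogram

def flight_time(mz, drift_length_m=1.0, accel_voltage_v=20000.0):
    """Ideal drift time in seconds for an ion with the given m/z (Da per charge).

    drift_length_m and accel_voltage_v are hypothetical illustration values.
    """
    mass_per_charge = mz * DALTON / ELEMENTARY_CHARGE  # kg per coulomb
    return drift_length_m * sqrt(mass_per_charge / (2.0 * accel_voltage_v))

# Ions at the two ends of the 1000-4000 m/z window analyzed in this study.
for mz in (1000.0, 4000.0):
    print(mz, flight_time(mz))
```

Because the time scales with the square root of m/z, an ion at 4000 m/z takes twice as long as one at 1000 m/z under identical conditions.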

Saliva and Stool LTQ-Orbitrap LC–MS/MS proteomic analysis

Saliva and fecal samples were subjected to SDS-PAGE electrophoresis to select the samples that appeared more enriched in protein bands, which were then used to prepare pools as detailed in Supplementary methods. Eight individual saliva samples from each patient and control group were selected and pooled to obtain the CS pool (0.5 μg/μL final protein concentration), the UC pool (1.5 μg/μL), the CD pool (3.0 μg/μL) and the CRC pool (3.0 μg/μL). Similarly, three pools of fecal samples were prepared: the CS pool made of 6 samples (5 μg/μL), the UC pool made of 5 samples (4.4 μg/μL) and the CD pool made of 5 samples (4.6 μg/μL).

The proteomic analysis of saliva was performed using SDS-PAGE slices subjected to trypsin in-gel digestion. Stool endogenous peptides were directly extracted from stool pools following the FASP protocol for protein digestion. Both tryptic salivary peptides and stool endogenous peptides were analyzed with an LTQ-Orbitrap XL mass spectrometer (Thermo Fisher Scientific, US) after being treated as detailed in Supplementary methods. Data were analyzed with the Proteome Discoverer software (version 1.4, Thermo Fisher Scientific, US) interfaced to a Mascot server (version 2.2.4, Matrix Science, UK) and searched against the human section of the Uniprot Database (www.uniprot.org, version 20150401, 90,411 sequences), excluding all non-human proteins. Consequently, peptides matching with proteins of animal, plant, microbial and other non-human origins did not enter the analysis.

Peptides synthesis

The Basic salivary proline-rich protein 1 (UniProt: P04280) peptides GQ-15 (NH2-GNQPQGPPPPPGKPQ-COOH, molecular weight: 1495.66 m/z) and GG-17 (NH2-GPPPQGGNKPQGPPPPG-COOH, molecular weight: 1578.75 m/z) were synthesized by Primm srl (Italy). Lyophilized powder peptides were resuspended in 2 mL of PBS 1X (concentration of 2 mg/mL). To avoid freeze–thaw cycles, 20 µL aliquots were stored at − 80 °C.
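As a consistency check on the reported molecular weights, the average mass of an unmodified linear peptide can be computed by summing the average residue masses and adding one water molecule for the free termini. The sketch below uses standard average residue masses for the five amino acids present in GQ-15 and GG-17; the function name is an illustrative assumption and not part of the original methods.

```python
# Standard average residue masses (Da) for the amino acids found in GQ-15 and GG-17.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519,
    "N": 114.1038,
    "Q": 128.1307,
    "P": 97.1167,
    "K": 128.1741,
}
WATER_AVERAGE_MASS = 18.0153  # Da, accounts for the free N- and C-termini

def average_molecular_weight(sequence):
    """Average mass (Da) of an unmodified linear peptide given in one-letter code."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence) + WATER_AVERAGE_MASS

PEPTIDES = {
    "GQ-15": "GNQPQGPPPPPGKPQ",
    "GG-17": "GPPPQGGNKPQGPPPPG",
}

for name, seq in PEPTIDES.items():
    print(name, len(seq), round(average_molecular_weight(seq), 2))
```

Under these assumptions the computed values round to 1495.66 and 1578.75, matching the figures given for the synthetic peptides. They are neutral average masses, so a singly protonated ion would be expected roughly 1 m/z higher.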

In vitro cell proliferation and signaling experiments

Two human CRC cell lines, HT-29 and HCT-116, were used to study cell proliferation (Automated Cell Counter, Life Technologies, US) and the proliferation signaling pathways p38, Erk 1/2 and Akt by western blot analyses as detailed in Supplementary methods. Cells were left unstimulated or were stimulated with a positive control (EGF 100 ng/mL) or with the GQ-15 100 nM and GG-17 100 nM peptides.

Statistical analysis

Statistical analyses were conducted using STATA (US) and R software packages for statistical computing (US). For the evaluation of saliva peptidomic features, Fisher’s exact test was used with Benjamini-Hochberg adjusted p-values. Differences between groups were assessed using the Z-test. Proliferation experiments were analyzed using two-way repeated measures analysis of variance (ANOVA), with a significance threshold set at p < 0.05.
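The feature-level testing described above (presence or absence of a feature in two groups, followed by multiple-testing correction) can be sketched in plain Python. This is a minimal illustration and not the STATA or R code used in the study: the function names are assumptions, the two-sided p-value follows the common "sum of tables no more probable than the observed one" definition, and any counts passed in are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are groups (e.g. patients, controls); columns are feature present/absent.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x positives in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p = sum(prob(x) for x in range(lowest, highest + 1)
            if prob(x) <= observed * (1 + 1e-9))
    return min(1.0, p)

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value downwards
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In use, each feature would contribute one 2x2 table of counts, and the resulting list of raw p-values would be passed to `benjamini_hochberg` before applying a significance threshold.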

All methods were carried out in accordance with relevant guidelines.

Informed consent was obtained from all participants (University-Hospital of Padova, Prot. N. 3756/AO/16).