Introduction

The mid-1970s saw a major advance in the field of mass spectrometry with the first experiment, reported by Cooks’ young group at Purdue University, using chemical ionization coupled to the mass-analyzed ion kinetic energy spectrometry (MIKES) technique for direct and rapid analysis of complex natural mixtures [1]. Later named MS/MS by Fred McLafferty [2], this technique involved (1) soft ionization of analytes to preserve their structural integrity, (2) separation of individual components of the ion mixture in a magnetic sector (B), and (3) recording of product ions generated by spontaneous dissociation of selected metastable ions using the electric sector (E) of the instrument [3]. Product ions are related to the structure of the dissociating precursor, and their analysis hence provides insight into the connectivity of functional groups in the original molecule. Performing a non-chromatographic separation of mixture components after, rather than before, the ionization step allowed a mass spectrum to be rapidly obtained for individual species of complex natural samples [4–7]. A variety of instruments in which two mass analyzers are assembled in tandem were then developed to circumvent limitations of the early BE instruments, among them the triple quadrupole mass spectrometer [8], which is still widely used today. In the product ion scan mode, the first quadrupole is used for selective sampling of targeted ions from the ionization source, which ensures that the ionic species mass-analyzed in the third quadrupole all arise from the dissociation, in the second quadrupole, of the selected precursor ions only. Since then, a variety of instrumental configurations and activation techniques have been developed [9–11], but tandem mass spectrometry remains the cornerstone of any work related to molecular characterization.

One of the major breakthroughs allowed by tandem mass spectrometry is biopolymer sequencing, where the nature and relative location of building units can be retrieved by analyzing MS/MS data with regard to specific fragmentation rules. Indeed, because their dissociation occurs via backbone cleavages independently of their sequence and since they are composed of building units of known mass, biopolymers such as peptides [12], oligonucleotides [13], or carbohydrates [14] exhibit typical MS/MS patterns that allow their sequence to be reconstructed. MS/MS was also shown to be extremely valuable for structural characterization of synthetic polymers. While homopolymers obviously do not need to be sequenced since the same monomeric unit repeats along the entire chain, MS/MS is highly valuable to differentiate block from random copolymers composed of more than one type of repeating unit [15–20]. However, only a few examples of MS/MS sequencing of sequence-defined copolymers have been reported so far, because such data cannot always be analyzed by simply combining the rules reported for the corresponding homopolymers [21–23].

In contrast, MS/MS has proven to be the key technique to retrieve digital information stored in sequence-controlled non-natural polymers. It was reported in very recent years that binary information can be stored in synthetic polymer chains using two co-monomers defined as 0- and 1-bits [24–26]. Such monomer-based digital sequences are synthesized by solid-phase iterative chemistry, using either conventional approaches or more straightforward protecting-group-free methods [27]. For example, the Lutz group has reported the synthesis of digitally-encoded poly(phosphodiester)s [28, 29], poly(alkoxyamine phosphodiester)s [30], poly(alkoxyamine amide)s [31, 32], poly(triazole amide)s [33, 34], and polyurethanes [35]. For the latter four classes of polymers, it was shown that digital information can be accurately and rapidly deciphered by MS/MS sequencing [30, 35–38]. Overall, similar to biopolymers, these macromolecules are monodisperse, exhibit a controlled sequence of monomers, and their fragmentation pattern strongly depends on the chemistry of the polymer backbone. However, in great contrast to biopolymers exhibiting biological functions dictated by their structure (which in turn dictates their dissociation once activated), the main purpose of information-containing synthetic polymers is to be read, which offers a unique opportunity to tailor their structure in order to achieve the best MS/MS readability. This new concept, where MS/MS data obtained for a given synthetic species are used to further optimize its structure in order to simplify its sequencing, is illustrated herein with two examples, namely poly(triazole amide)s and poly(alkoxyamine phosphodiester)s.

Material and Methods

Chemicals

For polymer synthesis: 1,11-dibromoundecane (≥98%), trifluoroacetic acid (TFA, 99%), dichloromethane (DCM, ≥99.9%, stabilized with amylene), 5-hexynoic acid (97%), 1-hydroxybenzotriazole hydrate (HOBt, ≥97%), N,N′-diisopropylcarbodiimide (DIC, 99%), 4-(dimethylamino)pyridine (DMAP, 99%), and anhydrous N,N-dimethylformamide (DMF, 99.8%) were used as received from Sigma Aldrich (St. Louis, MO, USA). Wang resin (0.22 mmol g–1) from Iris Biotech GmbH (Marktredwitz, Germany), as well as N,N'-dicyclohexylcarbodiimide (DCC, 99%) and 4,4'-di-n-nonyl-2,2'-bipyridine (dNbipy, 97%) from Alfa Aesar (ThermoFisher GmbH, Karlsruhe, Germany) were used without further purification. Copper(I) bromide (CuBr, 98%, Sigma Aldrich) was purified by stirring in acetic acid overnight, washing with ethanol, and drying under vacuum at room temperature. For polymer analysis: samples (a few mg) subjected to electrospray ionization (ESI) were first solubilized in methanol (SDS, Peypin, France) and then diluted (1/10 to 1/1000, v/v) in methanol supplemented with ammonium acetate (Sigma Aldrich) at a concentration of 3 mM. Poly(ethylene glycol) (PEG) and poly(methyl methacrylate) (PMMA) used as internal standards for accurate mass measurements were from Sigma Aldrich.

Synthesis of 1-Amino-11-Azido-Undecane

The synthesis of 1-amino-11-azido-undecane was performed following a published protocol but using 1,11-dibromoundecane instead of 1,12-dibromododecane as a starting reagent [39]. The first intermediate 1-bromo-11-azido-undecane was purified by column chromatography on silica gel using n-pentane as eluent and was recovered as a colorless oil in 52% yield. The second intermediate 1-phthalimido-11-azido-undecane was purified by column chromatography on silica gel using dichloromethane as eluent and was recovered as a white solid in 98% yield. The last step was performed as described in the published protocol. 11-Azidoundecan-1-amine was recovered as a colorless oil in 88% yield. 1H NMR (400 MHz, CDCl3, δ, ppm): 1.07 (s, 2H, -NH2), 1.20–1.42 (m, 16H, -(CH2)8-), 1.55 (m, 2H, -CH2-CH2-N3), 2.63 (t, 2H, -CH2-NH2), 3.21 (t, 2H, -CH2-N3). 13C NMR (100.6 MHz, CDCl3, δ, ppm): 26.76 (-CH2-CH2-CH2-N3), 26.94 (-CH2-CH2-CH2-NH2), 28.88, 29.18, 29.50, 29.52, 29.62, 33.96 (-CH2-CH2-N3), 42.34 (-CH2-NH2), 51.53 (-CH2-N3).

Oligo(triazole amide)s

Sequence-defined poly(triazole amide)s were synthesized using a general solid-phase orthogonal concept that was initially reported in reference [40]. This approach involves successive amidification and copper-assisted alkyne-azide cycloaddition (CuAAC) coupling steps. The tri(ethylene oxide)-containing poly(triazole amide)s were synthesized using a dyad ligation strategy that was described in a previous publication [34]. The undecyl-containing poly(triazole amide)s were obtained via a standard monomer-by-monomer strategy, on a Wang resin (0.22 mmol g–1) [33, 38]. The first iteration was performed in DCM using 6 eq of 5-hexynoic acid, DCC, and 3 eq of DIPEA (r.t., overnight). CuAAC was performed using 1-amino-11-azido-undecane (6 eq) in the presence of CuBr (3 eq) and dNbipy (6 eq) in anhydrous DMF (argon atmosphere, 50 °C, overnight). The amidification step was conducted with 5-hexynoic acid (6 eq) in the presence of HOBt (6 eq) and DIC (6 eq) as coupling agents in DMF (r.t., 2 h). Amidification and CuAAC steps were repeated until the desired oligomer length was reached. Afterwards, the oligomer was cleaved from its solid support by TFA/DCM (1/1) treatment (r.t., 4 h).

Poly(alkoxyamine phosphodiester)s

The sequence-coded poly(alkoxyamine phosphodiester)s were synthesized by solid phase chemistry using successive phosphoramidite and radical–radical coupling steps, as recently described [30].

Nuclear Magnetic Resonance (NMR)

1H NMR (400 MHz) and 13C NMR (100.6 MHz) spectra were recorded in CDCl3 on a Bruker Avance 400 spectrometer equipped with an Ultrashield magnet (Karlsruhe, Germany).

Mass Spectrometry

High-resolution MS and MS/MS experiments were performed using a QStar Elite mass spectrometer (SCIEX, Concord, ON, Canada) equipped with an ESI source, which was operated in the positive ion mode in the case of poly(triazole amide)s (capillary voltage: +5500 V) and in the negative ion mode for poly(alkoxyamine phosphodiester)s (capillary voltage: –4200 V). In this instrument, air was used as nebulizing gas (10 psi) while nitrogen was used as curtain gas (20 psi). Sample solutions were introduced in the ionization source at a 10 μL min–1 flow rate using a syringe pump. Ions were measured using an orthogonal acceleration time-of-flight (oa-TOF) mass analyzer operated in the reflectron mode. In the MS mode, internal calibration was performed with two ionized adducts of a synthetic polymer (either PEG or PMMA) selected to bracket the targeted analyte m/z value [41]. In CID experiments, precursor ions were selected in the quadrupole mass analyzer and injected into the collision cell (collision gas: nitrogen), and product ions were measured in the oa-TOF. The precursor ion was used as the reference for accurate measurements of product ions in MS/MS spectra. Instrument control, data acquisition, and data processing were achieved using Analyst software (QS 2.0) provided by SCIEX.

Ion Mobility Spectrometry

Ion mobility measurements performed for poly(alkoxyamine phosphodiester)s were achieved using the traveling wave ion mobility (TWIM) cell of a Synapt G2 HDMS mass spectrometer (Waters, Manchester, UK). In this instrument, samples were introduced at a 10 μL min–1 flow rate in the ionization source operated in the negative ion mode (capillary voltage: –2.27 kV; sampling cone voltage: –20 V) under a desolvation gas (N2) flow of 100 L h–1 heated at 35 °C. TWIM-MS spectra were all recorded in the 50–1500 m/z range, with trap bias DC voltage of 45 V, helium cell gas flow of 180 mL min–1, and the TWIMS cell operated at 3.45 mbar of N2. IMS experiments were performed using five different sets of wave velocity (m s–1)/wave height (V) values: 600/40, 650/40, 700/40, 650/35, 650/30. Data analyses were conducted with the MassLynx 4.1 software provided by Waters.

Results and Discussion

Poly(triazole amide)s

Poly(triazole amide)s were the first reported class of digitally-encoded polymers and are prepared by orthogonal solid-phase iterative synthesis [33, 40]. Scheme 1a shows the general molecular structure of these polymers. In these structures, a binary code was implemented in the chains by a propyl (0-bit: R1 = H in Scheme 1a) or an isobutyl (1-bit: R1 = CH3 in Scheme 1a) segment linked to each triazole ring, as well as in the ω end-group (hence designated as ω 0 when R2 = H or ω 1 when R2 = CH3 in Scheme 1a).

Scheme 1

General molecular structure of the digitally-encoded (a) poly(triazole amide)s and (b) poly(alkoxyamine phosphodiester)s studied in the present work

In the first poly(triazole amide) generation, each coded building block also contained a tri(ethylene glycol) spacer located between triazole and amide functions (X = O in Scheme 1a). These tri(ethylene glycol)-based oligo(triazole amide)s were best ionized as multiply protonated molecules in positive ion mode ESI and always exhibited a highly preferential charge state indicating that adducted protons were located on every other monomer unit. In contrast, a very low ionization yield was observed in the negative ion mode upon deprotonation of their acidic α moiety. Moreover, fragmentation efficiency of deprotonated species was extremely poor, with nearly no product ion observed in MS/MS spectra, whereas collision-induced dissociation (CID) of multiply protonated oligomers allowed their sequence to be reconstructed. Their dissociation behavior was recently reported in detail [37]; it is briefly summarized hereafter and illustrated with the case of α-0000 0 in Figure 1a. Upon CID, two main reactions were observed to occur in each monomer. Cleavage of each amide bond yielded the most abundant product ions and would proceed according to a proton-assisted mechanism, as most often reported for amide-containing polymers such as peptides [42] or polyamides [43, 44]. Based on a nomenclature adapted from the one proposed for synthetic polymer fragments [45] (Supplementary Scheme S1), these product ions were named di n+ when containing the α moiety and yi n+ when holding the original ω end-group (both annotated in blue, Figure 1a).

Figure 1

ESI-MS/MS of [α-0000 0 + 2H]2+ ions from (a) the first generation poly(triazole amide)s containing a tri(ethylene oxide) segment (m/z 681.4), and (b) the second generation poly(triazole amide)s containing an alkyl segment (m/z 669.5), with their respective structure and dissociation scheme on the right-hand side. Peaks annotated by @ designate dehydration products, whereas ions designated by # were produced upon loss of N2. All peak assignments were supported by accurate mass measurements (Supplementary Table S1)

The second main reaction consisted of cleavage of the last ether bond in each repeating unit after proton attachment to the ether oxygen, followed by nucleophilic attack of the nearby amide oxygen. This dissociation pathway yielded the ai n+/vi n+ pair of complementary fragments (both annotated in red Figure 1a). Additional secondary pathways, such as dehydration of the α end-group and loss of N2 from triazole rings, were identified to proceed from both the precursor ion and all primary ions.

The two pairs of complementary products generated from each repeating unit could be used to accurately reconstruct any oligomer sequence from one or the other chain end. However, compared with other sequence-controlled polymers such as oligo(alkoxyamine amide)s [36] or oligocarbamates [35] undergoing a single bond cleavage per monomer, the CID behavior of protonated oligo(triazole amide)s did not allow a straightforward MS/MS reading step. Even for small oligomers such as the α-0000 0 species shown in Figure 1a, complex MS/MS spectra were obtained. Owing to the location of adducted protons on every other monomeric unit in the precursor ions, the charge state of product ions in each series increased with their size, but not in a linear manner. Moreover, some fragments were also detected at two different charge states. Peak assignment in each ion series hence required starting from the lowest members and searching for the next congener by adding the mass of one repeating unit (312.2 Da or 326.2 Da when containing 0 or 1, respectively) divided by the fragment charge state. Owing to the low resolving power of the quadrupole mass analyzer used to select precursor ions, a partial isotopic pattern was detected for all product ions, which made it possible to identify their charge state. However, safe assignment of product ions of increasing charge states generated from long chains could only be performed when using a mass analyzer of fairly high resolution, such as a TOF device, for the second MS stage.
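The search for the next congener described above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used by the authors: the function names, the tolerance, and the example m/z value of 400.0 are hypothetical, and the sketch assumes that the next member of the series keeps the same charge state, which the text notes is not always the case.

```python
# Repeating-unit masses (Da) quoted in the text for tri(ethylene glycol)-based units.
UNIT_MASS = {"0": 312.2, "1": 326.2}


def charge_from_isotope_spacing(spacing_mz):
    """Estimate the charge state z from the isotopic peak spacing (about 1/z)."""
    return round(1.0 / spacing_mz)


def next_congener_candidates(mz, z):
    """Expected m/z of the next series member for each possible bit.

    Assumption (hypothetical simplification): the charge state z is unchanged.
    """
    return {bit: mz + mass / z for bit, mass in UNIT_MASS.items()}


def match_bit(observed_mz, mz, z, tol=0.05):
    """Return the bit whose candidate m/z matches the observed peak, else None."""
    for bit, candidate in next_congener_candidates(mz, z).items():
        if abs(candidate - observed_mz) <= tol:
            return bit
    return None


# Hypothetical doubly charged fragment at m/z 400.0 (not a value from the paper).
z = charge_from_isotope_spacing(0.5)
candidates = next_congener_candidates(400.0, z)
```

For the hypothetical doubly charged fragment, the two candidates differ by only 7 m/z units, and this gap shrinks as the charge state grows, which is consistent with the need for a high-resolution second MS stage.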

Based on the MS/MS behavior established for tri(ethylene glycol)-based oligo(triazole amide)s, a new design was conceived to improve their MS/MS readability. Considering that the ai n+/vi n+ ion pairs obtained after cleavage of the ether bond only added supplementary, rather than complementary, sequence information compared with the di n+/yi n+ ion pair generated upon dissociation of the amide linkage (Figure 1a), one way to simplify MS/MS sequencing of oligo(triazole amide)s would be to change the ethylene oxide segment to an alkyl chain so that the ether-fragmentation pathway can be avoided. Accordingly, an undecyl spacer was used in lieu of tri(ethylene glycol) to prepare new coded dyads (with X = CH2 in Scheme 1a). As expected, greatly simplified MS/MS data were obtained for this new generation of oligo(triazole amide)s (Figure 1b). Provided that low activation energies were used, di n+ and yi n+ product ions were observed as the most intense peaks and could hence readily be identified, while secondary fragments remained of low abundance or in the low m/z range of the MS/MS spectra. Changing the tri(ethylene glycol) segment to an undecyl spacer, chosen to keep an equivalent monomer length, prevented the production of ai n+ and vi n+ product ions without introducing any new dissociation route, thanks to the low reactivity of alkyl segments upon collisional activation. Overall, using the same synthetic approach but different reagents, redesigning poly(triazole amide)s based on their CID behavior yielded a new polymer generation with greatly simplified MS/MS sequencing.

Poly(alkoxyamine phosphodiester)s

Poly(alkoxyamine phosphodiester)s are a recently reported class of information-containing polymers that are synthesized using orthogonal successive phosphoramidite and radical–radical coupling [30]. They were conceived to combine advantages of poly(phosphodiester)s and poly(alkoxyamine amide)s, from both synthesis and MS/MS sequencing viewpoints, while avoiding their respective drawbacks.

Sequence-defined non-natural polyphosphates were prepared using iterative phosphoramidite protocols on a solid support, as initially developed for oligonucleotide synthesis [46]. This approach allowed synthesis of near-monodisperse homopolymers and sequence-encoded copolymers [28], and further automation of the method on a DNA synthesizer permitted to store more than a decabyte of data in a single chain [29]. Although extremely promising for information-related technologies [25], poly(phosphodiester)s still suffer from two main drawbacks. On the one hand, due to some non-chemoselective reactions during their synthesis, there was a need for protection/deprotection of the hydroxyl group of co-monomers. On the other hand, poly(phosphodiester)s typically dissociate according to rules defined for deprotonated oligonucleotides [13], with four sets of complementary product ions arising from cleavage at the phosphate bonds. As a result, long chains gave rise to very complex CID spectra from which their sequence can only be deciphered when using mass analyzers offering very high resolving power. In contrast, the molecular structure of poly(alkoxyamine amide)s, with amide synthons, –NH–(CO)–C(CH3)R– (where R=H codes for 0 and R=CH3 for 1) spaced by a TEMPO nitroxide, allowed very simple MS/MS sequencing rules [31]. CID of oligo(alkoxyamine amide)s was indeed observed to occur via competitive homolytic cleavages of all C–ON alkoxyamine linkages, either in the positive [36] or the negative [38] ion mode (Supplementary Figure S1). However, the chemical structure of oligo(alkoxyamine amide)s also raised some issues in MS/MS. For protonated molecules, relative rate of competing C–ON bond cleavages was directly correlated to the stability of the carbon-centered radical (secondary versus tertiary when cleavage occurred in a 0 or a 1 repeating unit, respectively) generated during this homolytic reaction (Supplementary Figure S1a). 
Location of the adducted proton at any nitroxide nitrogen introduced an additional level of sequence-dependent fragmentation [36]. To avoid this protonation-related effect, the negative ion mode was successfully used to generate and sequence oligo(alkoxyamine amide)s from a single series of product ions that contained the deprotonated acidic α termination [38]. Although relevant for short chains (Supplementary Figure S1b), this alternative approach also became limited as the polymer length increased due to (1) decreasing acidity of the α termination, and (2) a methyl scrambling effect evidenced during CID of oligomers starting with specific sequences (such as α-100 or α-101) [47].

Nevertheless, due to their very low dissociation energy, which made their cleavage highly competitive compared to that of any other covalent bonds, alkoxyamine linkages remained an extremely valuable option in the design of coded polymers that need to be easy-to-read by MS/MS. Similarly, the TEMPO moiety was advantageously used to induce extensive backbone fragmentation in so-called free radical initiated peptide sequencing [48, 49]. An alkoxyamine bond was hence introduced in each monomeric unit of poly(alkoxyamine phosphodiester)s by implementing successive chemoselective reactions involving protecting-group-free coupling steps, namely phosphoramidite coupling and nitroxide radical coupling [30]. In this design (Scheme 1b), nitroxides were purposely no longer linked to the coding moieties but to a dimethyl substituted carbon, in order to favor high dissociation rate for all C–ON bond cleavages and prevent sequence-dependent fragmentation. Moreover, presence of one phosphate group per monomer ensured an efficient deprotonation of poly(alkoxyamine phosphodiester)s, allowing an exclusive use of the negative ion mode regardless of their chain length.

However, detection of oligomers with multiple charge states in ESI raised new issues in both MS and MS/MS. A very first sample of the early generation of oligo(alkoxyamine phosphodiester)s was prepared with 0 a monomers containing a TEMPO nitroxide (Y = Ø in Scheme 1b). As shown in Supplementary Figure S2, this sample exhibited a marked polydispersity, with oligomers containing n = 3–5 monomeric units [30]. Nevertheless, this example usefully illustrates the quite large charge state distribution of these short oligomers. While the 3-mer was mainly detected with two (64.4%) and three (33.6%) negative charges (the –1 charge state only contributed to 2.0% of the whole signal), relative intensities of most peaks assigned to the 4-mer were quite high (33.0% for z = 2, 49.8% for z = 3, and 17.2% for z = 4) and charge state dispersity further increased for the 5-mer (21.0% for z = 2, 18.1% for z = 3, 50.2% for z = 4, and 10.7% for z = 5). Of note, fully deprotonated oligomers (with z = n) were never observed to be the most stable species, although their selection as precursor ions was found to highly simplify MS/MS data. Indeed, the unique dissociation route experienced by deprotonated oligo(alkoxyamine phosphodiester)s was the homolytic cleavage of all alkoxyamine bonds, yielding pairs of complementary product ions named ci z– when carrying the α termination and yi z– when holding the ω end-group (Supplementary Scheme S1b). However, activation of partially ionized oligomers with randomly deprotonated phosphate groups led to fragments detected with different charge states (Figure 2a), and hence to more complex MS/MS spectra compared with fully ionized molecules yielding product ions with a unique charge state, equal to the number of monomers they contain (Figure 2b). For the largest oligo(alkoxyamine phosphodiester)s containing one byte of information, deprotonation of the phosphate group of all monomers was readily achieved (vide infra). However, in order to determine the actual application range of this method, larger polymers have to be produced to find out whether this requirement is still fulfilled, since self-solvation of lower charge states as the chain size increases might prevent deprotonation of all ionizable sites.

Figure 2

ESI-MS/MS of the deprotonated α-(0 a )4-ω tetramer with (a) z = 3 and (b) z = 4, both recorded at the same 0.75 eV center-of-mass collision energy. Although both ci z– and yi z– are actually radical anions, the superscripted dot that should have been used to designate radicals has been omitted for the sake of clarity. Peaks annotated with grey italicized m/z values correspond to internal fragments. Top: fragmentation schemes of each ionic species, with α-containing fragments (ci z–) in pink and ω-containing fragments (yi z–) in blue. All peak assignments were supported by accurate mass measurements (Supplementary Table S2)

More importantly, peak assignment was extremely straightforward in CID spectra obtained for oligomers with all phosphate groups ionized (Figure 2b). As z = i for any ci z– and yi z– generated from fully deprotonated precursors, the mathematical relationship between two consecutive F fragments in any of the two ion series can be written as

$$ m/z\left({F}_{i+1}^{\left(i+1\right)-}\right)=\frac{m/z\left({F}_i^{i-}\right)\times i+{m}_{monomer}}{i+1} $$
(1)

where $m_{monomer}$ is the mass of the deprotonated monomer at the (i + 1)th position. As a result, the Δm/z difference measured between these two fragments is

$$ \Delta m/z=m/z\left({F}_{i+1}^{\left(i+1\right)-}\right)-m/z\left({F}_i^{i-}\right)=\frac{m_{monomer}}{i+1}-\frac{m/z\left({F}_i^{i-}\right)}{i+1} $$
(2)

Because of the particular location of C–ON linkages in oligo(alkoxyamine phosphodiester)s (Scheme 1b), the mass of the deprotonated monomer (377.2 Da for 0 a ) is larger than the mass of the structural segment at the left-hand side of the first cleavable alkoxyamine bond (281.1 Da when deprotonated). As a result, values calculated according to Equation 2 were always positive in the case of ci i– ions: m/z values of these fragments increased with their size while (obviously) remaining lower than that of the precursor ion. In contrast, the mass of the deprotonated monomer being lower than that of the segment at the right-hand side of the last C–ON bond (549.2 Da when deprotonated), negative values were obtained when applying Equation 2 for yi i– ions. As a result, m/z values of these fragments decreased as their size increased while (obviously) remaining higher than that of the precursor ion. In other words, and as clearly illustrated in Figure 2b, c i z– ions (annotated in pink) were all found at m/z values below that of the [4-mer – 4H]4– and assigned from the left-hand side to the right-hand side for increasing i, whereas all peaks at m/z above that of the precursor ion could be safely assigned to yi z– ions (annotated in blue), to be read from the right- to the left-hand side for increasing i. Such assignments were far less straightforward when activating precursor ions at a lower charge state, with ci z– and yi z– ions distributed throughout the CID spectrum as exemplified for [4-mer – 3H]3– in Figure 2a. Finally, beside the two main ion series, additional fragments (annotated with italicized m/z values in Figure 2) were also observed. As supported by accurate mass measurements (Supplementary Table S2), they were proposed to form upon evaporation of monomer(s) from primary ci z– and yi z– product ions, and would be either cyclic or biradical species, designated as [(0 a )i – zH]z–. 
When generated from fully deprotonated precursors, the charge state of these internal fragments was equal to the number of monomers they contained: as a result [0 a – H], [(0 a )2 – 2H]2–, and [(0 a )3 – 3H]3– were all measured at the same m/z 377.2 value in Figure 2b. In contrast, when activating a partially deprotonated oligomer, the same internal fragments were detected as multiple peaks ([0 a – H] and [(0 a )2 – 2H]2– at m/z 377.1, [(0 a )2 – H] at m/z 755.4, and [(0 a )3 – 3H]2– at m/z 566.3 in Figure 2a), further complicating MS/MS data. In summary, CID of fully deprotonated poly(alkoxyamine phosphodiester)s is highly advantageous for their sequencing, so their preferential production in negative mode ESI has to be favored.
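The trends described above follow directly from Equation 1 and can be checked numerically. The short Python sketch below uses only the masses quoted in the text for the TEMPO-based 0 a system (377.2 Da for the monomer, 281.1 Da and 549.2 Da for the first members of the two series); the function names are illustrative, and generating three members per series is an arbitrary choice made for the example.

```python
M_MONOMER_0A = 377.2   # deprotonated 0a monomer mass (Da), from the text
C1_MZ = 281.1          # first alpha-containing fragment (z = 1), from the text
Y1_MZ = 549.2          # first omega-containing fragment (z = 1), from the text


def fragment_series(mz_first, m_monomer, n_members):
    """Apply Equation 1 repeatedly, assuming z = i for the i-th member."""
    series = [mz_first]
    for i in range(1, n_members):
        series.append((series[-1] * i + m_monomer) / (i + 1))
    return series


def internal_fragment_mz(i, m_monomer=M_MONOMER_0A):
    """m/z of an internal fragment made of i monomers carrying i charges."""
    return i * m_monomer / i


c_series = fragment_series(C1_MZ, M_MONOMER_0A, 3)
y_series = fragment_series(Y1_MZ, M_MONOMER_0A, 3)
internal = [internal_fragment_mz(i) for i in (1, 2, 3)]

for i, (c, y) in enumerate(zip(c_series, y_series), start=1):
    print(f"i = {i}: c = {c:.2f}, y = {y:.2f}")
```

Under these assumptions, the c values rise toward the monomer mass from below while the y values fall toward it from above, and the internal fragments all coincide at m/z 377.2, in line with the pattern reported for Figure 2b.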

Relative abundance of different charge states of a molecule in ESI mass spectra typically reflects their relative stability in the gas phase. Charge state distribution observed for oligo(alkoxyamine phosphodiester)s composed of 0 a units containing TEMPO (Supplementary Figure S2) suggested that coulombic repulsions between adjacent ionized phosphate groups were too strong for fully deprotonated oligomers to be the most stable species. Coulombic repulsions could, however, be minimized by increasing the distance between deprotonable phosphate groups. A first option to do so was to increase the size of the nitroxide building block. Using the TEMPO-hexanamide nitroxide (Scheme 1b) instead of TEMPO, oligo(alkoxyamine phosphodiester)s exhibited a preferential charge state equal to their polymerization degree, as illustrated for the α-0 a 0 a 0 a 0 a -ω molecule in Figure 3a where nearly 50% of this 4-mer signal was measured for [M – 4H]4–.

Figure 3

ESI mass spectra of the 4-mers prepared with (a) 0 a units containing a propyl coding moiety, (b) 0 b units containing a butyl coding moiety, or (c) 0 c units containing a pentyl coding moiety, all including the TEMPO-hexanamide nitroxide. The peak annotated with a star in Figure 3b corresponds to an impurity

Keeping TEMPO-hexanamide as the nitroxide moiety, another option to maximize the distance between phosphate groups consisted of increasing the length of the coding alkyl segment. Compared with the 4-mer constructed with 0 a monomers (coding segment = propyl) for which the 4-/3-/2- intensity ratio was 47.8/33.3/18.9 (Figure 3a), addition of only one methylene group by using 0 b repeating units (Scheme 1b) greatly modified the charge state distribution, with the most stable [M – 4H]4– species now contributing to 88.3% of the whole signal (Figure 3b). Surprisingly, addition of another methylene group was observed to have the opposite effect, as shown in Figure 3c with the 4-mer containing 0 c repeating units (Scheme 1b): on changing 0 b to 0 c repeating units, the relative stability of the highest charge state was observed to decrease compared with the lowest ones (64.4% for z = 4, 20.0% for z = 3, and 15.0% for z = 2). This result strongly suggested that the pentyl coding segment allowed a particular conformation of the 0 c -based 4-mer, where phosphate groups would be closer to each other compared with homologues containing 0 b monomers. In order to verify this assumption, these three samples were analyzed by ion mobility spectrometry. While a 4% drift time increase was measured for [α-(0 b )4-ω – 4H]4– compared with [α-(0 a )4-ω – 4H]4–, drift times measured for [α-(0 b )4-ω – 4H]4– and [α-(0 c )4-ω – 4H]4– were not statistically different (Supplementary Figure S3). In other words, the fully deprotonated 4-mer containing pentyl coding moieties ([α-(0 c )4-ω – 4H]4– at m/z 537.3) exhibited the same collision cross-section as its homologue with butyl coding segments ([α-(0 b )4-ω – 4H]4– at m/z 523.3). This could be explained by a more pronounced folding of the m/z 537.3 ion structure, which would result in a smaller distance between deprotonated phosphate groups and therefore increase coulombic repulsion detrimental to the stability of [α-(0 c )4-ω – 4H]4–. Alternatively, an increased stabilization of the lower charge states of the 0 c tetramer could account for this result.

Consequently, synthesis of information-encoded oligo(alkoxyamine phosphodiester)s was further conducted using the optimal 0 b/1 b coding monomers to ensure highly abundant signal for precursor ions at the maximal charge state. As depicted in Supplementary Figure S4, the 0 b 1 b 0 b 0 b 1 b 1 b 1 b 0 b macromolecule exhibited a most stable z = 8 charge state, consistent with its number of monomers. This charge state remained preferential regardless of the applied declustering potential (DP), and optimal data were obtained at DP = –50 V, in terms of absolute abundance but also relative abundance compared with the z = 7 charge state (Supplementary Figure S3). The MS/MS spectrum of [M – 8H]8– at m/z 520.9 exhibited the two expected groups of product ions respectively distributed on one or the other side of the precursor ion peak, with ci i– with i = 1–8 and yi i– with i = 1–7 (Figure 4).

Figure 4

ESI-MS/MS spectrum of the fully deprotonated 01001110 poly(alkoxyamine phosphodiester) at m/z 520.9 (z = 8), written with the 0 b /1 b coding system containing the TEMPO-hexanamide nitroxide (collision energy: 0.70 eV, center-of-mass frame). Top: dissociation scheme of the m/z 520.9 polyanionic precursor, with ci i– and yi i– fragments designated in pink and in blue, respectively. Peaks annotated with grey italicized m/z values correspond to internal fragments. All ion assignments were supported by accurate mass measurements (Supplementary Table S3).

Considering the mass of each deprotonated monomer (i.e., m = 504.3 Da for 0 b and m = 518.3 Da for 1 b ), the byte of information encoded in this oligo(alkoxyamine phosphodiester) was readily deciphered from CID data. Starting from c1 , whose m/z of 295.1 implied that the sequence began as α-0 b , simple application of Equation 2 allowed the next coding unit to be safely characterized. Indeed, the 111.6 m/z difference measured between c2 2– (m/z 406.7) and c1 indicated a mass of 518.3 Da (2 × 406.7 – 295.1 = 518.3) for the second monomer, identified as 1 b . Applying the same operation throughout the ci i– series allowed the whole sequence to be reconstructed from the left-hand side to the right-hand side, while a similar analysis of yi i– product ions was conducted to read the digitally coded message from the ω to the α chain-end (Supplementary Table S4). It should be noted, however, that the particular MS/MS pattern achieved for these designed structures, with decreasing m/z differences between consecutive congeners as their charge state increases, will raise a resolution issue when analyzing much larger poly(alkoxyamine phosphodiester)s. As illustrated in Supplementary Figure S5, simulations predicted isotopic maxima within a 0.5 m/z range for ci i– fragments composed of 0 b monomers only and i ranging from 45 to 50, hence implying severe interferences of isotopic patterns. Use of mass analyzers with high resolving capabilities would then be mandatory for safe assignment of such highly charged product ions.
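The reading procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Equation 2 is not reproduced in this section, so the relation used here, m = (i + 1) × m/z(c(i+1)) – i × m/z(c(i)), is inferred from the worked example (2 × 406.7 – 295.1 = 518.3). The names `C_OFFSET`, `simulate_c_series`, `nearest_bit`, and `decode_c_series` are hypothetical, as is the constant offset chosen so that a chain starting with 0 b gives c1 at m/z 295.1. The simulated m/z values are idealized and are not measured data.

```python
# Deprotonated monomer masses (Da) quoted in the text.
MONOMER_MASS = {"0": 504.3, "1": 518.3}

# Assumption: a constant offset chosen so that a chain starting with 0b
# yields c1 at m/z 295.1, as reported for the 01001110 oligomer.
C_OFFSET = 295.1 - MONOMER_MASS["0"]


def simulate_c_series(bits):
    """Idealized m/z of c_i^(i-) for i = 1..len(bits), rounded to 0.1."""
    mz, mass = [], C_OFFSET
    for i, bit in enumerate(bits, start=1):
        mass += MONOMER_MASS[bit]      # fragment grows by one monomer
        mz.append(round(mass / i, 1))  # c_i is assumed to carry i charges
    return mz


def nearest_bit(mass, tol=2.0):
    """Map a monomer mass (Da) to '0' or '1'; reject anything beyond tol."""
    bit = min(MONOMER_MASS, key=lambda b: abs(MONOMER_MASS[b] - mass))
    if abs(MONOMER_MASS[bit] - mass) > tol:
        raise ValueError(f"no monomer matches {mass:.1f} Da")
    return bit


def decode_c_series(mz):
    """Rebuild the bit string from consecutive c_i^(i-) m/z values."""
    bits = [nearest_bit(mz[0] - C_OFFSET)]
    for i in range(1, len(mz)):
        # mz[i - 1] carries i charges and mz[i] carries i + 1 charges,
        # so the difference of the two ion masses is the added monomer.
        added = (i + 1) * mz[i] - i * mz[i - 1]
        bits.append(nearest_bit(added))
    return "".join(bits)


if __name__ == "__main__":
    series = simulate_c_series("01001110")
    print(series[:2])               # first two values match c1 and c2 2- in the text
    print(decode_c_series(series))
```

With the same assumed offset, the forward model places all-0 b ci i– fragments for i = 45 to 50 within less than 0.5 m/z of each other (monoisotopic values only, ignoring isotopic envelopes), in line with the crowding described in the text.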

Finally, consistent with a chain constructed with two repeating units of different mass, internal fragments (annotated with italicized grey m/z values in Figure 4) were detected as multiple peaks depending on their co-monomeric composition (Supplementary Table S3). As they no longer contain any of the original chain ends, internal fragments cannot be used for sequencing purposes. However, as pieces of the polymer backbone, they can still be used to support the co-monomeric sequence revealed by primary product ions. For example, while internal fragments containing a single type of monomer were observed with at most two 0 b units ([(0 b )2 – 2H]2– at m/z 504.3) in Figure 4, they were detected with up to three 1 b monomers ([(1 b )3 – 3H]3– at m/z 518.3), consistent with structural motifs in the 01001110 sequence.
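The consistency check mentioned above can be illustrated with a short Python sketch. It assumes, as in the two examples quoted from Figure 4, that an internal fragment of n monomers carries n charges, so that its m/z is the average of its deprotonated monomer masses. The function names `longest_runs` and `internal_fragment_mz` are hypothetical.

```python
from itertools import groupby

# Deprotonated monomer masses (Da) quoted in the text.
MONOMER_MASS = {"0": 504.3, "1": 518.3}


def longest_runs(bits):
    """Longest stretch of consecutive identical monomers, per monomer type."""
    runs = {"0": 0, "1": 0}
    for bit, group in groupby(bits):
        runs[bit] = max(runs[bit], len(list(group)))
    return runs


def internal_fragment_mz(segment):
    """m/z of an internal fragment, assuming one charge per monomer."""
    total = sum(MONOMER_MASS[bit] for bit in segment)
    return round(total / len(segment), 1)


if __name__ == "__main__":
    print(longest_runs("01001110"))      # at most two 0b and three 1b in a row
    print(internal_fragment_mz("00"))    # single-monomer-type fragment of 0b
    print(internal_fragment_mz("111"))   # single-monomer-type fragment of 1b
```

Under this assumption, fragments built from a single monomer type all fall at the monomer mass itself, whatever their length, whereas mixed compositions give intermediate m/z values, which is why internal fragments appear as multiple peaks.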

Conclusion

The structure of sequence-controlled synthetic polymers can be manipulated to achieve the simplest dissociation pattern upon their activation. As illustrated here with poly(triazole amide)s and poly(alkoxyamine phosphodiester)s, optimal structures were obtained by considering parameters such as the nature of functional groups, their relative location in monomeric units, and the relative dissociation energy of chemical bonds along the polymer skeleton. These structural modifications made it possible to avoid redundant MS/MS information, such as multiple product ion series or numerous peaks assignable to the same fragment at different charge states, and also gave rise to CID patterns from which sequence reconstruction was greatly simplified. It should also be emphasized that preparation of the targeted structures only required minimal changes to the initial synthesis protocols. Tandem mass spectrometry hence appears as a key technique not only for the sequencing but also for the design of digitally encoded macromolecules. This new paradigm represents a real turning point in the design of sequence-controlled polymers aimed at storing information that should be efficiently recovered by MS/MS.