Appendix 1: Generation of expected peak pattern
The evaluation of sets of expected peak patterns (Eq. 7) and of the values of the scoring function (Eq. 12) are the core elements of the ASCAN approach. In each ASCAN iteration, these two quantities are derived from the set of already known resonance frequencies (Eq. 1). In the treatment so far, the three frequency dimensions in Eqs. 7 and 8 were selected with the sole criterion that two of the three frequencies of an expected peak must be restrained by previous resonance assignments; they did not necessarily have to correspond to the frequency dimensions of the NMR experiment used. In the following, however, the notation is more specific, with the three frequency coordinates \( \Updelta \omega _{1} , \) \( \Updelta \omega _{2} \) and \( \Updelta \omega _{3} \) of a cross peak in a 3D spectrum corresponding, in this order, to the indirect proton frequency, the heavy atom frequency, and the direct proton frequency. For the actual computation of E(u) and \( F(\Upomega _{{u_{k} }} ) \) with Eqs. 7 and 12, respectively, we distinguish three situations, depending on the extent of the previous assignments and the chemical structure of the amino acid side chain to be assigned.
Firstly, if the resonance frequency, \( \Upomega _{h(u)} , \) of the 13C or 15N atom, \( h\left( u \right) \in A, \) that is covalently bound to an unassigned 1H atom, u, with \( u \in P\backslash A, \) is known, then the set of assigned atom pairs, D(u), used to determine the expected peak pattern for an unassigned 1H atom, u, is derived from all so far assigned 1H atoms, \( p_{i} \in A, \) that satisfy Eq. 6. Since \( h\left( u \right) \in A, \) the set of atom pairs D(u) can then be generated either with Eqs. 4 or 5. Consequently, the set of expected peaks for u is composed of two subsets,
$$ E(u) = E^{1} (u) \cup E^{2} (u) , $$
(15)
which represent, respectively, the situations where either the frequency of the heavy atom bound to the previously assigned hydrogen atom, \( h(p_{i} ), \) or of the heavy atom bound to the unassigned hydrogen atom, \( h(u), \) is known:
$$ E^{1} (u) \equiv \left\{ {\overrightarrow {{e^{1} (u)_{i} }} \equiv \left( {\Upomega _{u} ,\Upomega _{{h(p_{i} )}} ,\Upomega _{{p_{i} }} } \right)|(h(p_{i} ),p_{i} ) \in D(u);i = 1 \ldots M^{1} } \right\} $$
(16)
$$ E^{2} (u) \equiv \left\{ {\overrightarrow {{e^{2} (u)_{i} }} \equiv \left( {\Upomega _{{p_{i} }} ,\Upomega _{h(u)} ,\Upomega _{u} } \right)|(h(u),p_{i} ) \in D(u);i = 1 \ldots M^{2} } \right\} $$
(17)
In Eqs. 16 and 17, \( M^{1} \) and \( M^{2} \) denote the number of peaks in the two subsets (see also Eq. 7). The scoring function, F (Eq. 12), yields a scoring value for each potential resonance frequency, \( \Upomega _{{u_{k} }} \in R(u), \) which is calculated as follows:
$$ F(\Upomega _{{u_{k} }} ) = \frac{1}{{\left| {M^{1} } \right|}}\sum\limits_{i = 1}^{{M^{1} }} {C(\overrightarrow {{e^{1} (u)_{i} }} } ,\Updelta \vec{\omega }) + \frac{1}{{\left| {M^{2} } \right|}}\sum\limits_{i = 1}^{{M^{2} }} {C(\overrightarrow {{e^{2} (u)_{i} }} } ,\Updelta \vec{\omega }) $$
(18)
In Eq. 18 the sums run over all the elements of the two subsets of expected peaks, \( \overrightarrow {{e^{1} (u)_{i} }} \in E^{1} (u) \) and \( \overrightarrow {{e^{2} (u)_{i} }} \in E^{2} (u). \) In addition to the three acceptance criteria formulated with Eqs. 13 and 14, each potential resonance frequency, \( \Upomega _{{u_{k} }} \in R(u), \) must result from at least one confirmed match between observed peaks and expected peaks from the subset \( E^{2} (u); \) otherwise the value of the scoring function \( F(\Upomega _{{u_{k} }} ) \) is reset to zero. This additional requirement ensures that at least one observed peak used to identify the new resonance frequency of a previously unassigned hydrogen atom, u, is compatible with the known resonance frequency of its covalently bound heavy atom, \( \Upomega _{h(u)} . \)
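The first situation can be illustrated with a minimal Python sketch of Eqs. 16 to 18. Several elements here are assumptions and are not taken from the text: the match function C of Eq. 12 is replaced by a simple binary test for an observed peak within the tolerance window, the two subsets of D(u) are passed in as plain lists, and all function names, chemical shifts and tolerances are hypothetical.

```python
from typing import List, Tuple

# A 3D peak: (indirect 1H, heavy atom, direct 1H) frequencies in ppm.
Peak = Tuple[float, float, float]


def peak_matches(expected: Peak, observed: List[Peak], tol: Peak) -> bool:
    """Assumed stand-in for C: True if some observed peak lies within
    the tolerance window of the expected peak in all three dimensions."""
    return any(
        all(abs(e - o) < t for e, o, t in zip(expected, obs, tol))
        for obs in observed
    )


def expected_peaks(
    omega_u: float,
    omega_hu: float,
    d1: List[Tuple[float, float]],  # (Omega_h(p_i), Omega_p_i) pairs for E^1(u)
    d2: List[float],                # Omega_p_i values for E^2(u)
) -> Tuple[List[Peak], List[Peak]]:
    """Build the two subsets of expected peaks (Eqs. 16 and 17)."""
    e1 = [(omega_u, w_h, w_p) for w_h, w_p in d1]
    e2 = [(w_p, omega_hu, omega_u) for w_p in d2]
    return e1, e2


def score(
    omega_u: float,
    omega_hu: float,
    d1: List[Tuple[float, float]],
    d2: List[float],
    observed: List[Peak],
    tol: Peak,
) -> float:
    """Score of a candidate frequency omega_u in the spirit of Eq. 18.
    The score is reset to zero when no expected peak of E^2(u) is matched."""
    e1, e2 = expected_peaks(omega_u, omega_hu, d1, d2)
    if not e1 or not e2:
        return 0.0
    hits1 = sum(peak_matches(e, observed, tol) for e in e1)
    hits2 = sum(peak_matches(e, observed, tol) for e in e2)
    if hits2 == 0:
        return 0.0
    return hits1 / len(e1) + hits2 / len(e2)
```

With two assigned protons, a candidate that matches one peak in each subset receives a score of 0.5 + 0.5, whereas the same candidate without any match in the second subset is rejected regardless of its matches in the first.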
Secondly, if the resonance assignments for a 1H atom, u, and its covalently bound heavy atom, h(u), are both unknown, with \( u,h(u) \in P\backslash A, \) then a two-step procedure is employed, whereby initially a set of potential resonance frequencies, R(u), for the unassigned proton, u, is obtained, and subsequently a set of potential resonance assignments for h(u), \( R(\Upomega _{{u_{k} }} ;h(u)), \) is identified, with \( \Upomega _{{u_{k} }} \in R(u). \) In the initial step, a procedure analogous to the one described above is applied, where the set of expected peaks involving the resonance frequency of the so far unassigned 1H atom, u, is defined by
$$ E(u) \equiv \left\{ {\overrightarrow {{e(u)_{i} }} \equiv \left( {\Upomega _{u} ,\Upomega _{{h(p_{i} )}} ,\Upomega _{{p_{i} }} } \right)|(h(p_{i} ),p_{i} ) \in D(u);i = 1 \ldots M} \right\} . $$
(19)
The mapping function, Q, between the sets of expected peaks, E(u), and the sets of observed peaks, O(u) ⊂ S, for a 1H atom, u, yields a set of potential resonance frequencies, R(u) (Eqs. 9 and 10), and for each potential resonance frequency, \( \Upomega _{{u_{k} }} \in R(u), \) a value of the scoring function, \( F(\Upomega _{{u_{k} }} ), \) is calculated with Eq. 12. Subsequently, a set of potential resonance assignments for the covalently bound heavy atom of the proton u, h(u), is determined, with \( \Upomega _{{u_{k} }} \in R(u). \) The set of expected peaks correlating with h(u) is then defined by
$$ E(\Upomega _{{u_{k} }} ;h(u)) \equiv \left\{ {\overrightarrow {{e(\Upomega _{{u_{k} }} ;h(u))_{j} }} \equiv \left( {\Upomega _{{p_{j} }} ,\Upomega _{h(u)} ,\Upomega _{{u_{k} }} } \right)|p_{j} \in A;j = 1 \ldots M^{{\Upomega _{{u_{k} }} }} } \right\} . $$
(20)
A set of observed peaks, \( O(\Upomega _{{u_{k} }} ;h(u)) \subset S, \) is extracted from the updated list of the local extrema identified by a two-dimensional grid spanned by the potential resonance frequencies of u, \( \Upomega _{{u_{k} }} \in R(u), \) and all the chemical shifts of the previously assigned hydrogen atoms, \( p_{j} \in A. \) The mapping between the set of expected peaks, \( E(\Upomega _{{u_{k} }} ;h(u)), \) and the set of observed peaks, \( O(\Upomega _{{u_{k} }} ;h(u)), \) with Eq. 9 then yields a set of potential resonance frequencies, \( R(\Upomega _{{u_{k} }} ;h(u)) \) (Eq. 10). For each of these potential resonance frequencies, \( \Upomega _{{h(u)_{j} }} \in R(\Upomega _{{u_{k} }} ;h(u)), \) a value for the scoring function \( F(\Upomega _{{h(u)_{j} }} ) \) is then calculated with Eq. 12. At the end of the two-stage procedure for this combined hydrogen and heavy atom assignment, a set of potential resonance frequency pairs,
$$ \Upomega (u,h(u)) \equiv \left\{ {(\Upomega _{{u_{k} }} ,\Upomega _{{h(u)_{j} }} )|\Upomega _{{u_{k} }} \in R(u),\Upomega _{{h(u)_{j} }} \in R(\Upomega _{{u_{k} }} ;h(u))} \right\} , $$
(21)
is obtained for the unassigned 1H atom, u, and its covalently bound heavy atom, h(u). For each potential pair of resonance frequencies, \( \left( {\Upomega _{{u_{k} }} ,\Upomega _{{h(u)_{j} }} } \right) \in \Upomega (u,h(u)), \) two scoring values, \( F(\Upomega _{{u_{k} }} ) \) and \( F(\Upomega _{{h(u)_{j} }} ), \) are calculated with Eq. 12, and the atom pair u, h(u) is added to the list of assigned atoms if the acceptance criteria of Eqs. 13 and 14 are met simultaneously for both potential resonance frequencies of the atom pair.
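The bookkeeping of this two-step procedure can be sketched in Python as follows. This is only an illustration under stated assumptions: the candidate sets and scores are taken as already computed, the acceptance criteria of Eqs. 13 and 14 are represented by a caller-supplied predicate `accept` that is hypothetical, and the function names are not part of ASCAN.

```python
from typing import Callable, Dict, List, Tuple

Pair = Tuple[float, float]  # (Omega_u_k, Omega_h(u)_j) in ppm


def candidate_pairs(r_u: List[float], r_hu: Dict[float, List[float]]) -> List[Pair]:
    """Form the set of frequency pairs of Eq. 21: each proton candidate is
    combined only with the heavy-atom candidates found for that proton."""
    return [(w_u, w_h) for w_u in r_u for w_h in r_hu.get(w_u, [])]


def accept_pairs(
    pairs: List[Pair],
    score_u: Dict[float, float],    # F for each proton candidate
    score_hu: Dict[Pair, float],    # F for each heavy-atom candidate
    accept: Callable[[float], bool],  # assumed stand-in for Eqs. 13 and 14
) -> List[Pair]:
    """Keep a pair only if both of its scores pass the acceptance test."""
    return [
        (w_u, w_h)
        for (w_u, w_h) in pairs
        if accept(score_u[w_u]) and accept(score_hu[(w_u, w_h)])
    ]
```

A proton candidate for which no heavy-atom candidate is found contributes no pair, so it cannot be accepted in this situation even if its own score is high.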
Thirdly, for 13C–1H fragments in aromatic rings, with \( u,h(u) \in P\backslash A, \) the resonance assignments for the two atoms in the 13C–1H moiety can be obtained in a single step if the nearest-neighbor 13C–1H group in the aromatic ring has previously been assigned.
The set of assigned atom pairs used to derive the expected peak pattern for the so far unassigned aromatic 13C–1H moiety then consists of a single element, which is composed of the two assigned atoms Hδ and Cδ of the same aromatic ring:
$$ D(u,h(u)) \equiv \left\{ {(C^{\delta } ,H^{\delta } )} \right\} $$
(22)
The set of expected peaks for the unassigned aromatic 1H atom, \( u \in P\backslash A, \) and its covalently bound heavy atom, \( h(u) \in P\backslash A, \) is then defined by
$$ E(u,h(u)) \equiv \left\{ {\left( {\Upomega _{{(C^{\delta } ,H^{\delta } )_{i} }} ,\Upomega _{h(u)} ,\Upomega _{u} } \right)|i = 1, \ldots ,M} \right\}, $$
(23)
where the frequencies \( \Upomega _{{(C^{\delta } ,H^{\delta } )_{i} }} \) are determined using the local extrema that match the frequency coordinates given by Eq. 24 within a tolerance window \( \Updelta \vec{\omega } = (\Updelta \omega ^{p} ,\Updelta \omega ^{h} ,\Updelta \omega ^{p} ) \) (see Table 1):
$$ \vec{\omega }(C^{\delta } ,H^{\delta } ) = \left( {(\omega _{{(C^{\delta } ,H^{\delta } )_{i} }} ,\omega _{2,i} ,\omega _{3,i} )|\left| {\omega _{2,i} - \Upomega _{{C^{\delta } }} } \right| < \Updelta \omega ^{h} \wedge \left| {\omega _{3,i} - \Upomega _{{H^{\delta } }} } \right| < \Updelta \omega ^{p} ;i = 1, \ldots ,M} \right) \subset S $$
(24)
The mapping function, Q, between the set of expected peaks in Eq. 23, E(u, h(u)), and the set of observed peaks, O(u, h(u)) (Eq. 8), then yields a set of potential resonance frequencies, R(u, h(u)) (Eq. 10), and for each pair of potential resonance frequencies, \( \left( {\Upomega _{{u_{k} }} ,\Upomega _{{h(u)_{k} }} } \right) \in R(u,h(u)), \) a single value for the scoring function \( F(\Upomega _{{u_{k} }} ,\Upomega _{{h(u)_{k} }} ) \) is evaluated (Eq. 12). Finally, the aromatic ring 13C–1H moiety with \( \left( {\Upomega _{{u_{k} }} ,\Upomega _{{h(u)_{k} }} } \right) \in R(u,h(u)) \) is added to the list of assigned atoms if the acceptance criteria of Eqs. 13 and 14 are met.
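The selection of local extrema in Eq. 24 amounts to a filter on the heavy-atom and direct proton dimensions. A minimal Python sketch is given here, assuming that the local extrema are available as a list of frequency triples in ppm; the function name, the shifts and the tolerance values are hypothetical and are not the entries of Table 1.

```python
from typing import List, Tuple

# A local extremum: (omega_1, omega_2, omega_3) in ppm, i.e.
# (indirect 1H, heavy atom, direct 1H).
Extremum = Tuple[float, float, float]


def select_extrema(
    extrema: List[Extremum],
    omega_c_delta: float,  # assigned shift of C-delta
    omega_h_delta: float,  # assigned shift of H-delta
    tol_h: float,          # heavy-atom tolerance
    tol_p: float,          # proton tolerance
) -> List[Extremum]:
    """Keep the extrema whose heavy-atom and direct proton coordinates
    match the assigned C-delta/H-delta pair within the tolerances (Eq. 24)."""
    return [
        pk
        for pk in extrema
        if abs(pk[1] - omega_c_delta) < tol_h
        and abs(pk[2] - omega_h_delta) < tol_p
    ]
```

The first coordinate of each retained extremum is left unrestrained, since it is the dimension in which the frequency of the unassigned neighboring proton is searched.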
Appendix 2: Use of ASCAN with supplementary input of TOCSY data
In addition to accepting an input of 3D 13C- or 15N-resolved [1H,1H]-NOESY data, ASCAN is also laid out to operate on 3D heteronuclear-resolved [1H,1H]-TOCSY data sets. The correlation of an unassigned atom, \( u \in P\backslash A, \) to the set of assigned atom pairs, D(u), is then based on the TOCSY magnetization transfer pathways. Sets of assigned atom pairs, D(u), arising from scalar (“through-bond”) coupling of an unassigned hydrogen atom, \( u \in P\backslash A, \) with assigned hydrogen atoms, \( p_{i} \in A, \) are derived from the covalent structure based on the fact that only pairs of hydrogen atoms, u and \( p_{i} , \) which are separated by a given number, n, of covalent bonds, \( n_{{{\text{up}}_{i} }}^{\text{cov}} , \) give rise to a scalar coupling in the same spin system. All correlated pairs of hydrogen atoms determined with Eq. 25,
$$ 2 \le n_{{up_{i} }}^{\text{cov}} \le 3, $$
(25)
are used to generate the elements of D(u) (Eq. 3) with Eqs. 4 and/or 5. Overall, the treatment of [1H,1H]-NOESY and [1H,1H]-TOCSY data by ASCAN differs only in the considerations given to the different coherence transfer pathways in the two experiments, as reflected in Eqs. 6 and 25. In practice, TOCSY data sets can be a useful supplement to the NOESY input data, but they should be used only in conjunction with NOESY data.
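The bond-count criterion of Eq. 25 can be sketched in Python as a breadth-first search on the covalent structure. This is an illustrative sketch under assumptions: the covalent structure is represented as a list of bonded atom-name pairs, and the function names and the side-chain fragment in the usage note are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, List, Tuple


def bond_distances(bonds: Iterable[Tuple[str, str]], start: str) -> Dict[str, int]:
    """Number of covalent bonds from `start` to every reachable atom,
    obtained by breadth-first search on the bond graph."""
    adjacency: Dict[str, set] = {}
    for a, b in bonds:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for neighbor in adjacency.get(atom, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[atom] + 1
                queue.append(neighbor)
    return dist


def tocsy_partners(
    bonds: Iterable[Tuple[str, str]],
    u: str,
    assigned_protons: Iterable[str],
    n_min: int = 2,
    n_max: int = 3,
) -> List[str]:
    """Assigned protons separated from u by n_min to n_max bonds (Eq. 25)."""
    dist = bond_distances(bonds, u)
    # Unreachable atoms get -1 and therefore never satisfy the bounds.
    return sorted(
        p for p in assigned_protons if n_min <= dist.get(p, -1) <= n_max
    )
```

For a hypothetical fragment with bonds N–HN, N–CA, CA–HA, CA–CB, CB–HB2, CB–HB3, CB–CG and CG–HG2, the unassigned proton HB2 is correlated with HB3 (two bonds), HA and HG2 (three bonds each), but not with HN (four bonds).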