Towards autonomous analysis of chemical exchange saturation transfer experiments using deep neural networks

Macromolecules often exchange between functional states on timescales that can be accessed with NMR spectroscopy and many NMR tools have been developed to characterise the kinetics and thermodynamics of the exchange processes, as well as the structure of the conformers that are involved. However, analysis of the NMR data that report on exchanging macromolecules often hinges on complex least-squares fitting procedures as well as human experience and intuition, which, in some cases, limits the widespread use of the methods. The applications of deep neural networks (DNNs) and artificial intelligence have increased significantly in the sciences, and recently, specifically, within the field of biomolecular NMR, where DNNs are now available for tasks such as the reconstruction of sparsely sampled spectra, peak picking, and virtual decoupling. Here we present a DNN for the analysis of chemical exchange saturation transfer (CEST) data reporting on two- or three-site chemical exchange involving sparse state lifetimes of between approximately 3–60 ms, the range most frequently observed via experiment. The work presented here focuses on the 1H CEST class of methods that are further complicated, in relation to applications to other nuclei, by anti-phase features. The developed DNNs accurately predict the chemical shifts of nuclei in the exchanging species directly from anti-phase 1HN CEST profiles, along with an uncertainty associated with the predictions. The performance of the DNN was quantitatively assessed using both synthetic and experimental anti-phase CEST profiles. The assessments show that the DNN accurately determines chemical shifts and their associated uncertainties. The DNNs developed here do not contain any parameters for the end-user to adjust and the method therefore allows for autonomous analysis of complex NMR data that report on conformational exchange. Supplementary Information The online version contains supplementary material available at 10.1007/s10858-022-00395-z.


Introduction
Many functional aspects of a macromolecule can be understood from its time-averaged three-dimensional structure. However, often the functionality of these molecules depends on their ability to exchange between different conformational states. Thus, quantifying the interconversion between these states is an important first step towards understanding how these biomolecules work (Yang et al. 2003;Karplus and Kuriyan 2005;Boehr et al. 2006;Henzler-Wildman and Kern 2007;Faust et al. 2020;Xie et al. 2020;Wurm et al. 2021). When conformational exchange is present, there is often one major populated state, the ground state, and a set of transiently low-populated states that, despite their low populations and short lifetimes, often play crucial roles for function. Several NMR techniques are now available to characterise reaction dynamics and transiently populated states at atomic resolution, including, chemical exchange 1 3 saturation transfer (CEST) (Ward et al. 2000;Zhou and Zijl 2006;Vallurupalli et al. 2012), dark-state exchange saturation transfer (DEST) (Bertini et al. 1999;Hansen and Led 2006;Fawzi et al. 2011), Carr-Purcell- Meiboom-Gill (CPMG) (Meiboom and Gill 1958;Loria et al. 1999;Tollinger et al. 2001) relaxation dispersion, and relaxation in the rotating frame (R 1ρ , R 2ρ ) (Palmer and Massi 2006;Hansen et al. 2009;Chao and Byrd 2016). CEST-based methods, which report on conformational exchange involving sparse states with lifetimes ranging from approximately 3-60 ms, have expanded tremendously over the last decade and have provided invaluable insights into the function of macromolecules (Vallurupalli et al. 2017). However, although several tools are available for the analysis of NMR data reporting on conformational exchange, challenges do exist, particularly when the exchange deviates from a simple two-state model (Neudecker et al. 2006). For 1 H CEST methods reporting on the exchange of amide-protons (Yuwen et al. 2017a) and methyl-protons (Yuwen et al. 2017b) analyses are further complicated by anti-phase features caused by the requirement to eliminate 1 H-1 H cross-relaxation effects, leading to broad lineshapes, with resolution significantly more limited than for 'typical' CEST profiles comprised of absorptivelike dips.
Deep learning and deep neural networks (DNNs) have led to huge advances in many fields of science, including computer vision and natural language processing, and the methodology is now a crucial component of many everyday technologies (LeCun et al. 2015). In supervised deep learning, DNNs are trained to map an input to a desired output, and once trained, these networks can perform analyses autonomously. Deep learning is particularly successful at extracting features in complex data ). It has been used for several years within the field of clinical magnetic resonance imaging (MRI) and some of the tools have already been approved by the FDA (Chaudhari et al. 2021) for image enhancement and classification. Within biomolecular NMR there has been a surge in applications of DNNs over the last couple of years, and networks are now available for the reconstruction of sparsely sampled spectra (Hansen 2019;Luo et al. 2020;Qu et al. 2020;, peak picking (Klukowski et al. 2018), estimating initial fitting parameters (Beckwith et al. 2021), and virtual decoupling .
A key hurdle with many machine learning applications is that training robust models requires a large amount of curated training data. The in-depth understanding of the theory behind biomolecular NMR and the ability to simulate even complex NMR experiments means that the required amount of realistic training data can be generated synthetically. Importantly, it has now become clear that DNNs trained on fully synthetic data show robust performance on experimental data (Hansen 2019;, which allows for sophisticated DNNs to be developed for the transformation and analysis of NMR spectra. Overall, there is enormous potential for the development of deep learning approaches for the general analysis of NMR data and in particular for experiments reporting on conformational exchange. Below we have designed and trained DNNs to extract chemical shifts from the notably complex amide-proton anti-phase CEST experiment. The DNNs were trained solely on synthetically generated CEST profiles and are able to extract accurate chemical shifts of exchanging species as well as their uncertainties, thereby demonstrating that NMR data reporting on conformational exchange can be analysed autonomously using deep neural networks. Figure S2 shows the architecture for the DNN used to transform time-domain anti-phase CEST profiles into timedomain in-phase CEST profiles, DNN TR . This architecture is built from two modules, a module akin to a block in the FID-Net architecture ) and a modified LSTM module (Hansen 2019). The reason for this choice was that the main objective for the DNN is to 'decouple' anti-phase CEST profiles, which we have recently shown can be accomplished by the FID-Net architecture . The python code for generating the model architecture in Tensorflow/Keras is provided in Supporting Material and can be downloaded from GitHub. The input to the DNN consists of two vectors of size 2 × 65 = 130. The first vector, cest AP (t) = c 0 holds the zero-filled real Fourier transform (real and imaginary components) of the antiphase CEST profile and the second vector holds the time-points associated with the first vector, t 0 . The output of the network is the in-phase CEST profile, sampled at 128 offsets. The network contained 3,782,423 trainable parameters.

Deep neural network architectures
The second DNN, DNN CS , used to determine chemical shifts and their confidences was built using a densely connected convolutional neural network architecture (Huang et al. 2016), Fig. S4. The input for the network is the output from the first transformation described above, that is, frequency domain data describing the in-phase CEST profile, cest IP (ω), a vector of 128 real points. In its current form, the network detects a maximum of three chemical shifts as well as their confidences and the output of the network is therefore a 3 × 2 tensor, whose elements comprise three chemical shifts and their confidence values. Overall, the network has 1,591,526 trainable parameters. The python code for generating the model in Tensorflow/Keras is provided in Supporting Material and can also be downloaded from GitHub.

Training the deep neural networks
The first DNN, DNN TR , was trained on 15 × 10 6 anti-phase CEST profiles over 1500 epochs, where the range of training data is detailed in Table 1. An epoch refers to a single cycle of training of the neural network with training data. The training data was generated on-the-fly using code written in python and using functions from the Tensorflow and numpy libraries. To obtain smooth simulated CEST profiles, similar to those generated by experiment, previous simulations have used a distribution of B 1 fields or other dephasing methods (Vallurupalli et al. 2012). Here the dephasing was achieved by only retaining the eigenvectors of the Liouvillian corresponding to real eigenvalues in the propagator. Thus, if L is the matrix describing the Liouvillian, under which the spin-system evolves during the CEST period, then the eigenvalues and eigenvectors of L are initially found: L Λ = Λ D, where Λ is a matrix of eigenvectors and D is a diagonal matrix of eigenvalues. The submatrix of D that holds the real eigenvalues is denoted D re and the matrix holding the eigenvectors corresponding to the real eigenvalues is denoted Λ re . Propagation of the spin-system is carried out with the propagator, Λ re exp(− T ex D re ) Λ −1 re . As an example, for a simple Liouvillian, L, represented by a 3 × 3 matrix in the basis set of the three product operators, I x , I y , and I z there is typically only one real eigenvalue. After an eigendecomposition of L, the matrix holding the eigenvectors, Λ, and the diagonal matrix holding the eigenvalues, D, are 3 × 3 matrices. The submatrix Λ re has dimensions 3 × 1, D re , is a 1 × 1 matrix, and Λ −1 re is a 1 × 3 matrix. Thus, Λ re D re Λ re −1 produces a 3 × 3 matrix that is the projection of the original Liouvillian onto the space spanned by the real eigensystem and Λ re exp(− T ex D re ) Λ re −1 is the propagator corresponding only to the real eigensystem. For the code written with the Tensorflow library functions, where sizes of matrices should remain constant, the dephasing is achieved by multiplying any eigenvalue that has an imaginary part larger than 10 -3 by 10 9 , which means that evolutions caused by non-real eigenvalues are eliminated within nanoseconds.
The anti-phase CEST profiles were then obtained by propagating the Liouvillian over the first INEPT and the CEST element in the anti-phase 1 H N pulse sequence. For each anti-phase CEST profile an in-phase CEST profile was also generated by setting 1 J HN = 0 Hz and integrating the Liouvillian over the CEST element (Vallurupalli et al. 2012). The stochastic ADAM (Kingma and Ba 2014) optimiser was employed with standard parameters and an adaptive learning rate calculated as 0.0004 × L freq + L uncer 3∕4 (final learning rate of 10 -6 ). A batch size of 256 was used throughout the training and random gaussian noise was added with a standard deviation of 0.01 of the maximum value of each anti-phase CEST profile. After training the DNN TR network the DNN CS network was trained. The input data for training the DNN CS network was obtained from output of the trained DNN TR network. Random gaussian noise with a standard deviation between 0.001 and 0.04 of the maximum value of each anti-phase CEST profile was added to anti-phase CEST profiles before these were transformed with the DNN TR network. A total of 1.5 × 10 7 CEST profiles were used for training, which was done over 110 epochs, with a batch Absolute value of a normal distribution: μ = 1.0 s −1 , σ = 2.0 s −1 R 2,H (E 1 ) − R 2,H (G) and R 2,H (E 2 ) − R 2,H (G) Normal distribution: μ = 0.0 s −1 , σ = 2.0 s −1 Chemical shifts ω G , ω E1 , and ω E2 Uniform distribution over the full sweep width Chemical exchange Probability of three-site exchange 25% k ex,GE , k ex,GE1 , k ex,GE2 10-300 s −1 p E , p E1 , p E2 0.01-0.15 size of 128. The stochastic ADAM (Kingma and Ba 2014) optimiser with standard parameters and a learning rate of 3.3 × 10 -4 was used.
Initial training was carried out using a desktop computer (Intel Core I7-6900 K, 3.2 GHz, 64 GB RAM), equipped with an NVIDIA GeForce GTX 1080 TI GPU graphics card and subsequent training carried out using the CAMP cluster (NVIDIA Tesla V100 GPU). Although the training of the two DNNs has benefitted from access to nodes with GPUs, using the trained DNNs to transform new (experimental) data does not require high-end computational nodes or GPUs. As an example, the full set of ca. 140 1 H anti-phase CEST profiles from L99A T4 Lysozyme, Fig. 5, can be transformed with both DNN TR and DNN CS in less than 2 min on a standard laptop using only the CPU (Intel i7-6700 CPU).

Experimental amide-proton CEST data
A 1.5 mM U-[ 15 N, 2 H] L99A T4L sample produced as described previously (Bouvignies et al. 2011) and dissolved in 50 mM sodium phosphate, 25 mM NaCl, 2 mM EDTA, 2 mM NaN 3 , pH 5.5, 90%H 2 O/10%D 2 O was used to record the anti-phase 1 H N CEST experiments. L99A T4L anti-phase 1 H N CEST experiments were performed as described previously (Yuwen et al. 2017a). Briefly, the experiments were measured on a 800 MHz Bruker spectrometer equipped with an x, y, z-gradient cryogenically cooled probe. 1 H N -CEST measurements were performed with a B 1 field of 30.5 Hz at 282 K using a CEST delay of T ex = 400 ms. A range of 1 H offsets on a regular grid from 6.5 to 9.5 ppm was used, with step sizes of 30 Hz. An additional reference 2D dataset was obtained by setting the B 1 offset to − 12 kHz. A 1.35 mM sample of [U-15 N, 2 H; Ileδ 1 -13 CHD 2 ; Leu, Val-13 CHD 2 / 13 CHD 2 ; Met-13 CHD 2 ] G48A Fyn SH3 domain was prepared as described previously (Yuwen et al. 2017a). The sample was dissolved in 50 mM sodium phosphate, 0.2 mM EDTA, 0.05% NaN 3 , pH 7.0, 90% H 2 O/10% D 2 O. 1 H N CEST experiments were measured for the G48A Fyn SH3 domain using a 600 MHz Bruker spectrometer at 285 K (x, y, z-gradient cryogenically cooled probe). The 1 H N CEST datasets were recorded as described previously (Yuwen et al. 2017a); specifically, a pair of datasets was recorded using B 1 fields of 26.7 Hz and 42.0 Hz. A CEST delay of T ex = 400 ms was used and B 1 offsets between 5.5 and 10.5 ppm with step sizes of 25 Hz (B 1 = 26.7 Hz) or 40 Hz (B 1 = 42 Hz) were recorded. In addition, a 2D reference dataset was obtained with a B 1 offset of − 12 kHz that is equivalent to setting B 1 = 0 Hz.

Results and discussion
Chemical exchange saturation transfer profiles are normally visualised and analysed as, I(ω offset )/I 0 , where I(ω offset ) is the intensity observed for a given site when a weak radio-frequency pulse (B 1 ) is applied at a frequency of ω offset , and I 0 is the corresponding intensity with no B 1 pulse applied. A feature of standard CEST profiles is that they resemble inverted one-dimensional NMR spectra, where the 'dips' are centered at the chemical shifts of the exchanging species. Thus, the related CEST profile, max(I/I 0 ) − I/I 0 , resembles a simple NMR spectrum and its real Fourier transform therefore resembles an FID. Analysis of the CEST profiles with DNNs shown below first involved transformation of the data into the time domain, through a real Fourier transform, Fig. 1A and B. It should be noted that for a real Fourier transform, or equivalently a discrete Fourier transform of pure real data (N data points), the output is Hermitian-symmetric and approximately half [N/2 − 1 for even N and (N − 1)/2 for odd N] of the points are therefore redundant, see Supporting Material and Fig. S1.
To show the strength of the developed DNNs for the analysis of CEST data, we consider the amide-proton anti-phase CEST (Yuwen et al. 2017a), whose profiles are complicated relative to those generated by other CEST experiments since the 'dips' are anti-phase in nature (i.e., multiplet components from the scalar coupling between one-bond 1 H-X spins are of opposite phase). These CEST profiles are challenging to analyse primarily because the chemical shifts may not be easily accessible directly from the profiles. To facilitate the analysis of amide-proton CEST profiles the overall process is divided into two tasks, each with their own optimal DNN. The first DNN, DNN TR , transforms each anti-phase CEST profile into a 'classical' profile, where the doublet nature of the dips are eliminated, thereby improving resolution, and also upsamples the profile to a fixed number of points in the CEST dimension. The second DNN, DNN CS , then determines the 1 H chemical shifts for each of the exchanging species and an associated confidence in the shift values.

A deep neural network for the transformation of amide-proton CEST profiles
It was recently shown how each of the hidden layers of a simple DNN can be mapped to specific mathematical transformations (Amey et al. 2021). Such an approach is naturally highly attractive in order to design DNNs for new challenges and to understand their strengths and weaknesses. However, with the large size of recent networks developed to analyse and transform NMR data, our focus here is on employing architectures that have been shown recently to work well for related tasks. We have previously developed DNNs using the FID-Net architecture  to decouple and analyse NMR spectra ) by using FIDs as input. Since amide-proton anti-phase CEST profiles resemble anti-phase one-dimensional NMR spectra, our rationale was that a DNN similar to FID-Net can be trained to transform anti-phase CEST profiles into 'decoupled' standard CEST profiles. Thus, the DNN TR architecture used was built of two modules, a module akin to a block in the FID-Net architecture (Karunanithy and Hansen 2021) and a modified long short-term memory (LSTM) module (Hansen 2019). The architecture is described in detail in Supporting Material, Fig. S2, where the python code for generating the model in Tensorflow/Keras (Chollet 2015;Abadi et al. 2016) is also provided. The theory for spin-evolution during CEST experiments is well-established (Helgstrand et al. 2000;Hansen et al. 2008;Vallurupalli et al. 2012), and synthetic training data can therefore easily be generated by propagating the Liouvillian over the desired element.
The first DNN, referred to as DNN TR , was trained to transform an input amide-proton anti-phase CEST profile to the hypothetical CEST profile of an isolated 1 H spin, with 1 J HN = 0 Hz, Fig. 1. Thus, DNN TR decouples the anti-phase amide proton CEST profile and upsamples it to 128 points. The upsampling to a constant size, in this case 128 real points, makes the prediction of chemical shifts with a second DNN feasible, since DNNs are typically trained with a constant size of the input and output data (see below). A maximum of three exchanging states was assumed and only the forked three-site exchange model was used to generate the data, that is, E 1 ⇌ G ⇌ E 2 , where E 1 and E 2 are sparsely populated states. For 75% of the training data the population of E 2 was set to zero. Because of the strong correlation between CEST data reporting on different three-site exchange models, for example, E 1 ⇌ G ⇌ E 2 versus G ⇌ E 1 ⇌ E 2 , it is anticipated that DNN TR will robustly transform anti-phase CEST profiles derived from any three-site exchange process. Briefly, DNN TR was trained on 15 × 10 6 CEST profiles, where the range of training data is indicated in Table 1. The loss function was calculated from the mean-squared-error between the transformed in-phase CEST profile and the target function, see Fig. 1D. The network was trained to The trained DNN TR network was evaluated separately on synthetic data for two-and three-site exchanging systems. Figure 2 shows the evaluation on 100,000 randomly generated CEST profiles for two-( Fig. 2A) and three-site (Fig. 2B) exchanging systems. Figure S3 shows the performance of the DNN transformation as a function of the strength of the weak field, B 1 , the population of the sparse state E, p E , the overall exchange rate, k ex (k ex = k GE + k EG , for two-site interconversion) and the number of sampled offsets. The transformation of profiles from anti-phase to in-phase by the DNN TR network is robust and there is only limited variation in the performance with different parameters used to generate the CEST profiles. Of particular interest is that the transformation is only minimally affected by the number of points sampled in the input profile, Fig. S3D, suggesting that the upsampling is robust.
Having evaluated the DNN TR network on synthetic data it is important to assess how the DNN performs on experimental anti-phase 1 H N CEST profiles. Figure 3 shows two examples, where 1 H N anti-phase CEST profiles for the L99A mutant of T4 lysozyme recorded at 18.8 T have been transformed to in-phase CEST profiles (with the scalar coupling removed). This representation immediately allows estimation of the chemical shifts of 1 H nuclei of the exchanging states, which can be used as initial parameters for a least-squares analysis. However, these experimental CEST profiles are associated with uncertainty and since the ground truth (exact value) is not known a detailed evaluation of the performance is not directly possible.

Determining 1 H chemical shifts in exchanging states using a deep neural network
With the in-phase CEST profiles available it becomes substantially easier to estimate the chemical shifts of the exchanging species. DNNs are particularly adept at locating specific features in data, for example, localising particular elements in an image. Thus, it is expected that a DNN could be trained to determine the position of peaks in one-dimensional NMR spectra and, consequently, trained to determine the chemical shifts of the exchanging species from in-phase CEST profiles or the related profiles, max(I/I 0 ) − I/I 0 . The densely connected convolutional neural network architecture (Huang et al. 2016), which was originally developed for object recognition tasks, was adapted here, Fig. S4, to determine the chemical shifts from CEST profiles. Moreover, our goal was not only to determine the chemical shifts of the interconverting conformers, but to also train the DNN to estimate the uncertainties with which it determined these shifts, thereby providing an output similar to a traditional least-squares fitting procedure.
The output from a DNN is typically a fixed length and a decision about the maximum number of exchanging states therefore has to be made before training the network. Since the time for training the DNN increases rapidly when increasing the maximum number of exchanging states, we chose for this application to only focus on CEST profiles reporting on three or less states, which covers most of the CEST-based studies reported to date. For a maximum of three exchanging states the output from the DNN CS network is a 3 × 2 matrix whose elements are three chemical shifts, f ω,pred , and their corresponding confidences, c pred . When the input CEST profile derives from a two-site exchanging system, the DNN should report one confidence approaching zero and when the input CEST profile is only reporting on one state, two of the confidences should tend to zero.
To facilitate an end-to-end analysis, that is chemical shifts and their uncertainties obtained directly from the experimental anti-phase CEST profiles, the network to determine chemical shifts was trained on outputs from DNN TR , i.e. in-phase CEST profiles generated from anti-phase profiles. Having the second DNN, referred to as DNN CS , determine both chemical shifts and their confidences requires special attention to the loss function used for training. Naturally, the DNN CS network should be trained to optimise the confidence and thus obtain as accurate peak positions as possible, however, it should also be penalised, when the predicted confidence does not match the accuracy of the predicted chemical shifts. A variety of DNN architectures and loss functions have previously been designed to provide measures of the uncertainty with which DNNs make their predictions and transformations, also for the predictions of chemical shifts (Jonas and Kuhn 2019). As detailed below, we have adopted a strategy, where the loss function bears resemblance with the cost function in a least-squares fitting procedure.
The last layer of DNN CS has sigmoidal activation, Fig.  S4, which means that the output values, three values reporting on chemical shifts and three confidences, are between 0 and 1. The predicted chemical shifts in the range (0, 1), referred to as f ω,pred , are easily converted into the range of offsets obtained in the CEST dimension of the original data using a linear mapping. For example, if the CEST profile is recorded with points between 6.6 ppm and 10.0 ppm, then the linear mapping will be δ ← 3.4 ppm × f ω,pred + 6.6 ppm. Moreover, a predicted uncertainty, σ pred , was calculated from the predicted confidence as σ pred = k (1/c pred − 1), where k is a constant and σ pred structured such that it can take values between 0 and infinity. In order to make the predicted uncertainties match actual uncertainties of the prediction, the first part of the loss function was defined in a manner similar to a standard χ 2 , that is: where the sum is over the three states. The constant k was initially set to 1 during training, and subsequently set to � max � 1 H offsets � − min � 1 H offsets ��√ L freq to rescale L freq to have an expectation value of 1 and so that σ pred reports on the expected uncertainty. The purpose of the loss function in Eq. (1) is to make the predicted chemical shifts approach their true values. However, if L freq was the only loss function used during training, then training of DNN CS would simply lead to very low confidences (high uncertainties), which would minimise the function in Eq. (1). A second loss function was therefore added during training: where, 1 = {1,1,1} for three-state exchange input and 1 = {1,1,0} in the case of two-state exchange, thereby allowing large uncertainties, σ pred , when a state is not present in the input. The loss function in Eq. (2) serves to force DNN CS to predict high confidences (low uncertainties) where, and only where, the input profiles report on a real state. Briefly, the DNN CS network was trained on 1.5 × 10 7 randomly generated CEST profiles, with a final value of L freq = 7.3 × 10 -5 , and L uncer = 2.8 × 10 -4 . For the synthetic CEST data analysed below, the range of 1 H offsets was 3.4 ppm and therefore k = 0.029 ppm. Full details of the network architecture and the training are provided in the Methods and Supporting Information sections.
It is anticipated that with minimal additional training, the DNN CS network will be able to accurately analyse common 'in-phase' CEST profiles such as those often obtained for 1 3 15 N and 13 C, since these CEST profiles strongly resemble the IP-CEST profiles, Fig. 3. However, it should be stressed that the current DNN CS network has only been fully assessed with 1 H AP-CEST profiles that have been transformed with DNN TR .

End-to-end one-shot analysis of amide proton CEST
The two DNNs, DNN TR and DNN CS , described above can be applied sequentially to provide an end-to-end one-shot analysis of anti-phase CEST profiles: The overall performance of this sequential DNN was first evaluated using synthetically generated data. Specifically, (i) 100,000 anti-phase CEST profiles were generated for a variety of two-site chemical exchange processes and a further 100,000 profiles for three-site exchange. The range of B 1 offsets used was 3.4 ppm for all profiles, and all other input parameters are given in Table 1. (ii) Random gaussian noise with a standard deviation of 0.01 of the maximum value of each anti-phase CEST profile was added to the input anti-phase CEST spectrum. (iii) The DNN TR network was first used to transform all the CEST profiles from anti-phase to in-phase. (iv) The second network, DNN CS , was used to determine the chemical shifts of the exchanging states and their associated uncertainties. Figure 4 shows a summary of the quantitative assessment of the 100,000 CEST profiles corresponding to a two-state chemical exchange process. From Fig. 4 it is clear that the sequential DNN is able to accurately predict the chemical shifts of exchanging states from anti-phase CEST profiles. From the chemical shift predictions made on the 100,000 random CEST profiles the difference between a predicted chemical shift, δ pred , and a true chemical shift, δ true , was calculated, which gives an estimate of the performance and the confidence levels of the DNN as a function of c pred and σ pred . Importantly, as shown in Fig. 4C and D, the DNN has also successfully been trained to predict the uncertainty associated with the predicted chemical shifts. Specifically, for c pred ≥ 0.4, the predicted uncertainty, σ pred , agrees well with the 68.3% confidence level estimated from the analysis of the 100,000 profiles. For c pred < 0.4, σ pred is no longer an accurate measure of the uncertainty. Not surprisingly, the ground state chemical shifts, Fig. 4E, are generally predicted with a higher accuracy than the chemical shifts of the low-populated state, Fig. 4F, where lower confidences are obtained for small chemical shift differences between the two states, see Fig S5. The corresponding assessment carried out on 100,000 synthetic anti-phase 1 H N CEST profiles reporting on a three-site chemical exchange process, E 1 ⇌ G ⇌ E 2 , is shown in Supporting Material, Fig. S6. Figure S7 shows the summary of evaluations where random gaussian noise with a standard deviation of 0.01, 0.02, 0.04 of the maximum value of each anti-phase CEST profile was added to the input anti-phase CEST spectrum. The performance of the sequential DNN shown above strictly only holds for the ranges of data that were used for training and for the quantitative assessments, Table 1. However, as shown below, the performance of the DNN is rather robust and if the parameters of the CEST profile to be analysed deviate only slightly from the training parameters one would still expect the analysis to be valid. The ranges of parameters shown in Table 1 cover those obtained in most of CEST-based studies to date and it is therefore expected that most experimental anti-phase CEST profiles can be accurately analysed using the DNNs. Fig. 4 Quantitative assessment using 100,000 synthetic anti-phase 1 H N CEST profiles reporting on a two-site chemical exchange and analysed using the sequential DNN to determine the chemical shifts of nuclei from the exchanging states. A, C and E assessment of the ground-state predictions, B, D and F assessment of the predictions of the low-populated state. A and B show differences between predicted (δ pred ) and true (δ true ) chemical shift values, versus c pred for the 100,000 analysed CEST profiles (red dots). The full-drawn line corresponds to the average and the dashed lines correspond to the standard confidence levels, 68.3%, 95.4%, and 99.7%, respectively. C and D show 2D histograms of the points in (A) and (B); that is, a 2D histogram of the differences between δ pred and δ true versus the predicted confidence, c pred . The histogram was calculated with a resolution of 0.05 along c pred and 0.005 ppm along δ true − δ pred . The blue dashed lines show the predicted uncertainty, σ pred = 0.029(1/c pred − 1), which for c pred > 0.4 agrees well with the confidence levels obtained from the analysis of the 100,000 profiles. E and F shows the distributions of uncertainties obtained from the assessment

Assessment of the sequential DNN to analyse experimental CEST profiles
Experimental anti-phase 1 H CEST profiles for the L99A mutant of T4 lysozyme were analysed using the sequential and stacked DNN to gain insight into its performance on experimental data. As a validation of the performance of the fully stacked DNN two analyses were performed: in the first all of the 86 B 1 offsets were used to predict chemical shifts, while in the second, half of the offsets (every second point) were removed. Figure 5A shows the example of Gly12, where the predicted chemical shifts and uncertainties using half of the B 1 offsets agrees well with the values obtained using the full dataset. Generally, this holds for all sites, Fig. 5B and the RMSDs obtained are in line with those expected from the predicted uncertainties, σ pred . The differences in chemical shifts based on analyses of the full and half datasets, for all profiles, as a function of the confidence level are highlighted in Fig. 5C. Finally, it should be noted that the DNN TR network was only trained on profiles with 50-128 input points. The fact that the stacked DNN is able to accurately predict the chemical shifts from profiles with less data (43 points) than those used for training points to the robustness of the DNN.
To further assess the performance of the stacked DNNs in determining the chemical shifts of the exchanging states, anti-phase CEST profiles were obtained for the G48A mutant of the SH3 domain from Fyn (Yuwen et al. 2017a). At a static magnetic field of 14.1 T, two sets of data were obtained with B 1 fields of 26.7 Hz and 42 Hz. Figure 6A shows that the chemical shifts predicted using the stacked DNNs, independently, on the two different datasets agree well (RMSD of 7 ppb), and Fig. 6B highlights the difference in shifts based on the separate analyses of the two full datasets. Subsequently, the two experimental datasets were analysed simultaneously using a standard least-squares analysis (Yuwen et al. 2017a) with the software package ChemEx (https:// github. com/ gbouv ignies/ chemex) and the results were compared with the predictions made by the DNN, Fig. 6C. Again, the agreement between the chemical shifts predicted by the DNN and those obtained by least-squares fitting agree well, with an RMSD of 7 ppb.
Uncertainties obtained from the covariance matrix in a least-squares analysis of CEST profiles are typically around 1 ppb, which is 6 times smaller than the uncertainties obtained from the DNN, indicating that the stacked DNNs have not fully reached the level of accuracy obtained by least-squares fitting. Still, the predictions obtained from the analysis with the stacked DNNs are of an accuracy where they can be used for downstream analyses and are well beyond the level of accuracy by which these shifts can be predicted from a high-resolution structure (Han et al. 2011). Alternatively, the DNN-predicted chemical shifts can serve as excellent starting parameters for Predicting the chemical shifts of exchanging states of L99A T4 Lysozyme. The DNN for the transformation of anti-phase to inphase profiles, DNN TR , upsampled the recorded data to 128 points and DNN CS determined the chemical shifts. Two full analyses were performed: One on the original 86 points recorded and another analysis on half of the data. A Analysis of the anti-phase profile for Gly12 1 H N emphasizes the robustness by which the sequential DNN determines chemical shifts and their predicted uncertainties. B Consistency plot showing excellent agreement between the chemical shifts determined from the full dataset (x-axis) and half of the data (y-axis).
Only data for which c pred > 0.4 are shown. C Differences between the predicted chemical shifts from the full dataset (δ pred [86 pts]) and half of the data (δ pred [43 pts]) versus the minimum of the confidence, min(c pred ) = min(c pred [86 pts], c pred [43 pts]). All data are included a subsequent least-squares analysis. It is also possible that larger or alternative DNN architectures along with longer training periods could improve the performance of the DNN predictions.

Conclusions
A deep neural network was developed and trained to determine amide proton chemical shifts of exchanging states from anti-phase 1 H N CEST profiles. The approach first leads to the conversion of anti-phase to in-phase 1 H N CEST profiles, whereafter the chemical shifts are predicted along with their uncertainties. Compared with other analysis tools, the DNN does not require any additional training and there are no user adjustable parameters, which makes the analysis autonomous and suitable for automated processing pipelines. Thus far, the DNN only predicts chemical shifts. If additional parameters are sought, such as exchange rates and populations, the output shift values from the DNN can then serve as excellent starting points for a least-squares fitting procedure. The methodology and DNNs presented here add to the growing applications of deep learning and artificial intelligence for the analysis of NMR data, and provide an example of the autonomous analysis of complex NMR data reporting on macromolecular dynamics and chemical exchange.
included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/.