1 Introduction

According to the Worldometer [1], in the Fall and Winter of 2020–2021 (from Nov. 17, 2020, until Feb. 20, 2021), the total deaths in the United States from COVID increased by 250,000. In the Fall and Winter of 2021–2022 (from Aug. 21, 2021, until Mar. 3, 2022), the increase was about 1/3 larger, 330,000. Then, from Aug. 21, 2022, until Mar. 12, 2023, the increase dropped abruptly by about 3/4, from 330,000 to only 80,000. The abrupt decrease was not caused only by the amazing vaccine, which had been widely available since Spring 2021 and had already limited the 2021–2022 increase. The rapid decrease after summer 2022 suggests a possible phase transition in the molecular structure of COVID Spikes. Here we show a phase-transition-based explanation for this abrupt decrease, based on earlier papers that analyzed the 1200 amino acid COVID Spike sequences from 2003 until BA.5, 2022 [2,3,4,5,6,7].

2 Methods

The thermodynamic methods used in this paper are drawn from a wide range of disciplines. While these have been discussed in earlier articles [2,3,4,5,6,7], a summary of them is available on arXiv [8]. The summary involves a minimum of technical background. While phase transitions are a familiar part of daily life (boiling or freezing water), the methods of statistical mechanics involving phase transitions are well known mainly to chemists and physicists. Note that the new path was identified in Ref. [7] using no new parameters in phase transition theory. Identification was natural, in the context of two distinct phases (domain synchronization, from Cov03 to Cov19 to Omicron, and a second phase, with stronger binding of the Receptor Binding Domain later).

The hydrophobic landscape of a protein is revealed by the hydropathic profile ψ(aa,W) when the window of length W is optimized, as summarized in Ref. [8]. Given a protein hydropathic profile ψ(aa,W), one notices immediately that it has two kinds of extrema: hydrophobic maxima and hydrophilic minima. It is natural to suppose that the maxima act as pivots or pinning points for the functionally significant conformational motion, while the minima act as hinge points. The hydropathic profile trends are closely associated with the function and evolution of proteins.
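The sliding-window construction described above can be illustrated with a minimal sketch. It assumes the Kyte-Doolittle hydropathy scale as a stand-in; this is not the scale of Ref. [12] used in this paper, so the numbers it produces are not comparable to the edge values quoted in the Results. The function names `hydropathic_profile` and `local_extrema` are hypothetical and chosen only for this illustration.

```python
from typing import List, Tuple

# Kyte-Doolittle hydropathy values, used here ONLY as a stand-in scale.
# Assumption: the paper's own scale (Ref. [12]) differs from this one.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathic_profile(seq: str, window: int) -> List[float]:
    """Mean hydropathy over each full window of `window` residues.

    Entry i covers residues i .. i + window - 1 (0-based), so it is
    centered on residue i + window // 2.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    values = [KYTE_DOOLITTLE[aa] for aa in seq.upper()]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def local_extrema(profile: List[float]) -> Tuple[List[int], List[int]]:
    """Indices of strict interior maxima (hydrophobic) and minima (hydrophilic)."""
    maxima, minima = [], []
    for i in range(1, len(profile) - 1):
        if profile[i] > profile[i - 1] and profile[i] > profile[i + 1]:
            maxima.append(i)
        elif profile[i] < profile[i - 1] and profile[i] < profile[i + 1]:
            minima.append(i)
    return maxima, minima


if __name__ == "__main__":
    # Toy sequence alternating hydrophilic (R) and hydrophobic (I) blocks.
    toy = "RRRIIIRRRIIIRRR"
    profile = hydropathic_profile(toy, window=3)
    print(local_extrema(profile))
```

For a real Spike sequence one would pass the full amino acid string and a window such as W = 39. Plateaus (equal neighboring values) are not reported by this strict comparison, which is a simplification.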

3 Results

The Spike (S) protein of SARS-CoV-2, which plays a key role in the receptor recognition and cell membrane fusion process, is long and slender, and immersed in water. It is composed of two subunits, S1 and S2. The S1 subunit contains a Receptor Binding Domain (RBD) that recognizes and binds to the host receptor angiotensin-converting enzyme 2, while the S2 subunit mediates viral cell membrane fusion. Spike dynamics were explained [4,5,6,7] in terms of domain motions synchronized through leveling of hydrophilic and/or hydrophobic extrema in sliding hydropathic windows. The key extrema are in or near the RBD. These extrema were shown to correlate well with COVID evolution only when interactions of amino acids with water were averaged over sliding windows of lengths W ~ 35–39 amino acids. Thus, the common practice of focusing on individual mutations (W = 1 or 3, including nearest neighbors) has not succeeded in explaining COVID evolution. The RBD (437–508) is ~ 70 amino acids wide, so it is not surprising that the effects of domain synchronization are best resolved with W ~ 35–39.

There was a large increase in extrema leveling from 2003 to 2019, resulting in greater contagiousness, in accordance with natural selection. This long-term trend was abruptly reversed with the new 2022 variant BA.5, described predictively as a “New Twist” in Aug. 2022 [7]. Specifically, the largest changes in 2022 were caused by a few mutations that increased two hydrophobic extrema (numbered 7 and 8, centered at 369 ± 6 for W ± 10, and 491 ± 6 for W ± 10) and one hydrophilic extremum (number 1, centered at 448 ± 5 for W ± 10) in the ~ 70 amino acid RBD (W = 39, as in [5]; see Fig. 1, copied here for the reader’s convenience). These changes were largest in edge 8, which increased from 169.4 (Omicron) to 171.9 (BA.2) to 172.4 (BA.5). This trend was continued in early 2023 strains labeled XBB, with one mutation, S486P [9]. This single mutation increased edge 8 to 172.9, in excellent agreement with the trend of edge 8 identified earlier. Finally, the single mutation in late 2023, F456L of EG.5, deepens the hydrophilic edge 1 from 138.5 in Omicron to 138.0 (see Table 1 of [7]). Note that the two new mutations are less than 7 sites away from their respective centers, and thus not only lie well within the W = 39 sliding window; they would have been near the domain centers for any W as small as 15.

Fig. 1 The hydropathic wave profiles of Omicron and BA.5 are very similar, and the most important changes are in edges 7, 8, and 1

While this double success is gratifying, how likely is it to be only accidental? A W = 15 window associated with extremum 8 is the fraction ~ 15/1200 = 1/80 of the Spike sequence. In addition, the mutation must increase the window average of about 172, which occurs with a probability of about 0.4. Thus, the odds against this success happening accidentally are more than 100 to 1. The second success is equally unlikely, so the odds against the two combined successes are more than 10,000 to 1. Moreover, Prolines stiffen proteins mechanically, because Pro is the only amino acid which is connected twice to the peptide backbone [7]. The fraction of Prolines in all proteins is around 6%, and in Spikes only 4.4%. However, in window 8 of BA.5, it is already 10%. Adding one more Pro in XBB brings this fraction to 12%, nearly three times that of the entire Spike.
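The odds estimate above can be checked with a few lines of exact arithmetic. This is only a sketch of the reasoning as we read it: it treats the 0.4 figure as a probability, and it assumes that the two successes are independent and equally likely, which the text implies but does not state formally. The variable and function names are hypothetical.

```python
from fractions import Fraction

SPIKE_LENGTH = 1200          # approximate Spike length used in the text
WINDOW = 15                  # smallest window that still contains the mutation
P_INCREASE = Fraction(2, 5)  # chance (0.4) that a mutation raises the window average

# Chance that a random site falls inside the window around the extremum.
p_location = Fraction(WINDOW, SPIKE_LENGTH)

# Chance of one accidental success: right place AND right direction.
p_single = p_location * P_INCREASE

# Assumption: the second success is independent and equally unlikely.
p_both = p_single ** 2


def odds_against(p: Fraction) -> Fraction:
    """Odds against an event of probability p, expressed as (1 - p) / p."""
    return (1 - p) / p


if __name__ == "__main__":
    print("location fraction:", p_location)
    print("single success:", p_single, "odds against:", odds_against(p_single))
    print("both successes:", p_both, "odds against:", odds_against(p_both))
```

Under these assumptions the single-success probability is 1/200 and the combined probability is 1/40,000, consistent with the quoted lower bounds of 100 to 1 and 10,000 to 1.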

It was suggested in Ref. [7] that increases in a few extrema in the RBD could increase contagiousness by increasing viral attachment in the upper respiratory tract in S1. Moreover, cell fusion through S2 involves an intermediate state in which the binding of S1 to the substrate is weakened [10]. This is less likely to occur in BA.5 than in BA.2 [7]. The stiffening effect of the mutation S486P in XBB strains clearly strengthens binding to the RBD, but what of the flexing effect in EG.5 with F456L? Here we see in Fig. 1 that the deeply hydrophilic edge 1 separates the two hydrophobic edges 7 and 8. Thus, edge 1 can act as a hinge speeding cooperative pinning by edges 7 and 8 during the transition state of the Spike to tissue. Combined hydrophobic and hydrophilic mechanical stabilization of binding in the upper respiratory tract may therefore be the most natural explanation for the abrupt decrease in deaths, as BA.5 spread rapidly in the United States in Fall 2022 [11].

4 Discussion

The successes of hydropathic methods (against odds of > 10,000 to 1) in describing the XBB and EG.5 single mutations may be surprising. This is less so when one realizes that these successes emerged from general theories regarding the tendency of self-organized networks to evolve toward criticality and thus approach phase transitions [8]. Protein topologies are compactly described by their amino acid sequences, > 10^7 of which are conveniently available online at UniProt. At first glance, these sequences look like mysterious coded messages. The general theory represents what we believe to be the most spectacular example of code breaking [8]. The input to the theory of large-scale second-order phase transitions is based on a hydropathicity scale developed from studying the W-dependent amino acid contributions to the curvatures of > 5000 protein segments [12]. The successes described here in connecting this structural data to pandemic evolution have occurred through combined international [12] and interdisciplinary efforts.

What does the future hold? It seems unlikely that natural selection will repeat leveling of extrema, as occurred before (2003–2022). The XBB and EG.5 mutations strengthened edges 8 and 1; perhaps the next mutation will strengthen edge 7. There has been further mutational tightening of binding to the RBD and further steep decreases in the US death rate, to 60,000/year (2022–2023) and only 15,000/year in 2023–2024, compared to 15,000/year from flu.

The abrupt end to the COVID pandemic is reminiscent of the abrupt end to the 1918 flu pandemic (1917–1919) [13]. However, while there is some sequence data on the 1918 and later H1N1 flus, it is not easy to identify simple patterns in their evolution.

An alternative study focused on a few other mutations near the S1/S2 cleavage site ~ 685 [14], which might have explained the abrupt decrease in the death rate from Omicron to BA.5. However, the quantitative features identified here are associated with single mutational improvements in RBD binding at sites 456 and 486, far from the S1/S2 cleavage at 685.

Broad perspectives on the problems involved in analyzing protein evolution are discussed in several recent articles [15, 16]. Phase transition theory has been widely discussed [17] since the pioneering work of Bak [18]. However, because of interdisciplinary barriers, applications to proteins have been few. No one expects that the successes reported here can be explained in a couple of extra paragraphs, but in several hours, a reader of Ref. [8] will see how general these methods are, and how widely they could be applied to problems of general interest in protein dynamics.