Linus Pauling’s intuition

In 1938, the theoretical physicist Pascual Jordan published his ideas about quantum-mechanical stabilizing interactions between identical or nearly identical molecules, or parts of molecules, that influenced biological processes [1]. According to Jordan, identical molecules or parts of molecules tended to stick together, and in this he found an explanation for biological replication, in which organisms yield exact copies of themselves. This puzzled the physicist-turned-biologist Max Delbrück, who called Linus Pauling’s attention to the paper. Pauling drafted a note and invited Delbrück to be his co-author. They found Jordan’s suggestion nonsensical, and their brief communication eventually became a fundamental contribution to modern structural science [2]. They wrote: “Attractive forces between molecules vary inversely with a power of the distance, and maximum stability of a complex is achieved by bringing the molecules as close together as possible, in such a way that positively charged groups are brought near to negatively charged groups, electric dipoles are brought into suitable mutual orientations, etc. The minimum distances of approach of atoms are determined by their repulsive potentials, which may be expressed in terms of van der Waals radii; in order to achieve maximum stability, the two molecules must have complementary surfaces, like die and coin, and also complementary distribution of active groups.” In 1948, Pauling stressed the same principles in a lecture that referred specifically to molecular replication [3]. He spoke as if anticipating the mechanism of DNA function, although the structure of DNA would not be discovered until five years later.

DNA: from dullness to information-carrier

For decades in the twentieth century, scientists searched for the substance of heredity. The notion that it should be found among proteins persisted for a long time. The prevailing view about nucleic acids was that their tetranucleotide structure, as hypothesized by Phoebus Levene in 1909, precluded them from acting as transmitters of genetic information (see, e.g., [4]). On the other hand, Oswald Avery and his two associates published a careful study in 1944, in which they identified DNA as the “transforming principle” [5]. The scientific community as a whole may not have been ready to accept Avery et al.’s discovery, but there was at least one scientist, Erwin Chargaff (1905–2002, Fig. 1), on whom it had a profound effect. He was an internationally renowned biochemist at the College of Physicians and Surgeons, Columbia University (see, e.g., [6]). He had a “life-long fascination with the appearances of life, with its immense diversity, its majestic uniformity” [7]. Having read Avery et al.’s paper, Chargaff made a bold decision. He cleared his desk of all his ongoing projects and embarked on a systematic analysis of DNA in the cells of a diverse set of living organisms. Luckily, he could use the recently invented technique of paper chromatography, which made the analysis of minute amounts of material possible.

Fig. 1 Erwin Chargaff, 1994, at his Manhattan home (photograph by Istvan Hargittai)

Chargaff found that the DNAs of different organisms were different. This observation brought down the tetranucleotide hypothesis. His findings pointed to the possibility that DNA molecules were the carriers of the biological information that distinguished one organism from another. That different organisms contain DNAs of different composition was a milestone discovery. It was also rather straightforward: the data pointed to it unambiguously. Furthermore, the data from different organs of the same organism were consistent: one DNA composition was characteristic of the entire organism.

Discovery of base equivalence

Chargaff’s next discovery was preceded by a great deal of additional analysis, and its conclusion was far from obvious. He measured the amounts of the individual bases in each DNA and compared the contents. A pattern seemed to be emerging, however hesitantly, from the raw data. The total amount of the purine bases and the total amount of the pyrimidine bases, in terms of mole quantities, appeared to be the same. What was even more extraordinary was that the ratios of adenine to thymine and of guanine to cytosine, again in mole quantities, were close to one. However, the ratios scattered considerably about 1, in some cases by as much as about thirty per cent. It took a great deal of contemplation to decide that there was indeed a pattern and a great deal of courage to come out with it publicly. In his 1950 review, Chargaff wrote: “It is noteworthy—whether this is more than accidental, cannot yet be said—that in all desoxypentose nucleic acids examined thus far the molar ratios of total purines to total pyrimidines, and also of adenine to thymine and of guanine to cytosine, were not far from 1” [8]. Three decades later, he noted: “I felt a great reluctance to accept such regularities, since it had been impressed on me that our search for harmony, for an easily perceived and pleasing harmony, could only serve to distort or gloss over the intricacies of nature” [9]. Yet Chargaff could not help but notice that “there emerged—like Botticelli’s Venus on the shell, though not quite as flawless—the regularities that I then used to call the complementary relationships and that are now known as base pairing” [9].
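The three ratios Chargaff examined are simple to state as arithmetic. The following is a minimal sketch, assuming hypothetical mole fractions for a single DNA sample; the numbers and the function name `base_ratios` are invented for illustration and are not Chargaff's measurements.

```python
# Hypothetical mole fractions of the four bases in one DNA sample.
# These values are invented for illustration; they are not Chargaff's data.
sample = {"A": 0.29, "G": 0.21, "C": 0.20, "T": 0.30}


def base_ratios(moles):
    """Return the three molar ratios that Chargaff compared."""
    purines = moles["A"] + moles["G"]      # adenine + guanine
    pyrimidines = moles["C"] + moles["T"]  # cytosine + thymine
    return {
        "purines/pyrimidines": purines / pyrimidines,
        "A/T": moles["A"] / moles["T"],
        "G/C": moles["G"] / moles["C"],
    }


ratios = base_ratios(sample)
for name, value in ratios.items():
    # Each ratio is expected to lie near 1, with some scatter.
    print(f"{name}: {value:.2f}")
```

With scatter of the size described above, deciding whether such ratios are "not far from 1" is a matter of judgment rather than of computation, which is where the contemplation and the courage came in.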

In hindsight, Chargaff’s establishing of base pairing, or, as he initially called it, base equivalence, strikes us as an almost trivial conclusion. It has become such common knowledge that mentioning it no longer requires reference to Chargaff’s discovery. Once he accepted his own discovery, he realized that it was also a brilliant case of complementarity. We add that it was also a manifestation of what Pauling had intuited less than a decade before. In an essay about symmetry, in 1989, four decades after the original discovery, Chargaff returned to the issue of complementarity: “In meditating about the processes of life, one encounters another phenomenon, perhaps equally important as, but less obvious than, that of symmetry. It is what I have often referred to as complementarity (emphasis in the original). When I discovered it in DNA, I spoke of base complementarity. It is now generally called base pairing in reference to the double-helical structure model of DNA. … When scale models of the nitrogenous constituents occurring in DNA in equal molecular quantities were compared, they appeared to complement each other, producing structures of equal size. I could not help thinking of that ancient symbol of complementarity, the design used by the Chinese to depict the interaction of the Yin and the Yang, the dual forces governing the universe” [10].

Generalization versus minute deviations

In his two-minute speech at the Nobel award ceremony, Eugene P. Wigner quoted the teaching of his mentor, Michael Polanyi, about the scientific method: “science begins when a body of phenomena is available which shows some coherence and regularities, that science consists in assimilating these regularities in a natural way” [11]. This teaching is valid if augmented with a caveat about the “proper level,” discussed below in connection with Chargaff’s reservations. Otherwise, one might gloss over important differences for the sake of pronouncing general trends.

In his 1950 Experientia paper, Chargaff writes: “Generalizations in science are both necessary and hazardous; they carry a semblance of finality which conceals their essentially provisional character; they drive forward, as they retard; they add, but they also take away” [12]. Elsewhere in the same paper he expresses his reservations in yet stronger terms: “There is nothing more dangerous in the natural sciences than to look for harmony, order, regularity, before the proper level is reached (emphasis by us) … The disgust for the amorphous, the ostensibly anomalous—an interesting problem in the psychology of science—has produced many theories that shrank gradually to hypotheses and then vanished” [13]. Chargaff notes specifically in relation to nucleic acids that “minute changes in the nucleic acid, e.g., the disappearance of one guanine molecule out of a hundred, could produce far-reaching changes in the geometry of the conjugated nucleoprotein; and it is not impossible that rearrangements of this type are among the causes of the occurrence of mutations” [14].

Chargaff’s considerations embraced the concept of sequence at a time when not only was sequence impossible to determine, but it was not yet generally accepted that nucleic acids were indeed the substance of heredity. He understood that the identity of the large biological molecules, in this case the nucleic acids, could not yet be determined. Nonetheless, as if letting his imagination race ahead, he drew conclusions under the assumption that the DNA molecules formed an essential part of the process of heredity. In that case, the sequence of the nitrogenous components, and not only their proportion, would have significance; this was a most prescient supposition. An enormous number of nucleic acid molecules can be identical in composition and yet differ in sequence, and the number of possible sequences for a given composition is improbably huge. This was a powerful example of the dangers of ignoring differences beyond a certain level: differences that might not show up at one level of representation (the composition) become vitally important at the next (the sequence). I am drawing this inference, I hope without overextending it, from Chargaff’s discussion, although he did not connect it explicitly to the issue of generalization versus minute differences. Yet another inference is that examining his writings even today provides much fresh food for thought.
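The gap between composition and sequence can be made concrete with a counting argument. The sketch below is an illustration added here, not a calculation from Chargaff's papers: it counts the distinct linear sequences that share one base composition, using the multinomial coefficient n!/(a! g! c! t!). The function name `count_sequences` and the example compositions are hypothetical.

```python
from math import factorial


def count_sequences(composition):
    """Count distinct linear sequences with the given base counts.

    Uses the multinomial coefficient n! / (n_A! * n_G! * n_C! * n_T!),
    where n is the total number of bases.
    """
    total = sum(composition.values())
    count = factorial(total)
    for n in composition.values():
        count //= factorial(n)  # exact integer division at every step
    return count


# A hypothetical chain of only 20 bases, five of each kind.
short_chain = {"A": 5, "G": 5, "C": 5, "T": 5}
print(count_sequences(short_chain))

# A hypothetical chain of 100 bases, 25 of each kind: same proportions,
# but the number of possible sequences has dozens of digits.
longer_chain = {"A": 25, "G": 25, "C": 25, "T": 25}
print(len(str(count_sequences(longer_chain))), "digits")
```

Both chains satisfy the base equivalences exactly, so an analysis of composition alone cannot tell apart any of the sequences counted here.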