1 Introduction

Reactions to the death of Murray Gell-Mann, on 24 May 2019, suggest a need to give proper credit to those involved in the development of the quark model, to describe how it appeared to one of those involved,Footnote 1 and to try to lay to rest a number of myths.

In the July/August 2019 CERN Courier, the late Lars Brink rightly described Murray, who was the undisputed leader of theoretical particle physics through the 1950s and 1960s, as ‘one of the great geniuses of the twentieth century’ [20]. Reports of his death all mentioned his work on quarks, which some highlighted. Physics World, for example, headed its tribute [6] ‘Quark Pioneer Murray Gell-Mann dies’, adding in a subheading that he made ‘a number of breakthroughs including predicting the existence of quarks’, wrongly giving him sole credit for developments that were in fact discouraged by his insistence, initially for good reasons, that quarks are purely ‘mathematical entities’.

Gell-Mann created the conditions that made the discovery of quarks possible. But, according to the published record, neither he nor George Zweig, who is usually considered the co-discoverer, and deserves far more credit and recognition than he usually gets, was the first to submit a paper on the entities that Murray dubbed quarks. That honour goes to André Petermann [80], who just pipped him [44] and Zweig [95], [96], [97] at the post, but—although he was a CERN staff member—was not mentioned in the CERN Courier.

Gell-Mann—whose paper on quarks was ‘stimulated’ by Robert Serber—advocated abstracting algebraic relations from the quark model, comparing this [45] to ‘a method sometimes employed in French cuisine: a piece of pheasant meat is cooked between two slices of veal which are then thrown away’. In contrast, Serber treated quarks as real particles, and Petermann and—to a far greater extent—Zweig used what Gell-Mann was soon derisively calling the ‘naïve’ or ‘concrete’ quark model to derive results that went far beyond what could be inferred from the associated ‘current algebra’ as it became known.

figure a

Robert Serber—realised that Gell-Mann and Ne’eman’s SU(3) classification of strongly interacting particles can be explained by assuming that they are composed of three constituents, and ‘stimulated’ Gell-Mann to think further about this idea. His unpublished calculation of the magnetic moments of protons and neutrons was the first use of the ‘concrete quark model’. Courtesy AIP Emilio Segrè Visual Archives, Physics Today Collection

figure b

Murray Gell-Mann—coined the name quarks, and was the grandfather, but not the sole father, of the quark model. He insisted that quarks are mathematical entities, but later claimed that by this he meant that they are permanently imprisoned inside the observed particles, as is now believed to be the case. Courtesy University of Chicago Photographic Archive, [apf06342], Special Collections Research Center, University of Chicago Library

figure c

George Zweig—father of the ‘Concrete Quark Model’. Photo courtesy G Zweig

figure d

André Petermann—in a paper (in French) submitted 5 days before Gell-Mann’s, derived mass formulae from a constituent model, noting that the constituents would have non-integral charges. Photo, taken in Manchester in 1954, the year after he and Stueckelberg published the first paper on the renormalization group, courtesy CERN

figure e

Richard (Dick) Dalitz—my supervisor—showed that the quark model accommodated all newly discovered particles. Portrait taken in May 1961 by Gian-Carlo Wick at Brookhaven National Laboratory. Courtesy AIP Emilio Segrè Visual Archives

figure f

James (Jim) Bjorken, generally known as bj—the only person to predict that deep inelastic electron scattering from nucleons would be like that from point-like particles, and who developed the physical description that became known as the parton model, before and independently of Feynman. Photo, taken in the late 1970s, courtesy SLAC National Accelerator Laboratory

2 Prehistory and context

The prehistory of the quark model begins in 1949 with Fermi and Yang’s model [35] of the pi meson as a bound state of a nucleon and an antinucleon. They argued that the probability that all the particles that were being discovered were ‘really elementary’, as then assumed, ‘becomes less and less as their number increases’. Sakata [84] suggested extending the model to include the strange lambda baryon as a constituent of strange K mesons, but this proposal ran into trouble as more and more strange and non-strange particles were discovered. Order was imposed on the growing zoo of particles by Gell-Mann and Nishijima’s introduction of the hypercharge label, and Gell-Mann and Ne’eman’s use of the mathematical group SU(3)Footnote 2 to classify the known baryons and mesons as members of families of eight related particles. These steps are analogous to the realisation that chemical elements should be labelled by their atomic numbers and Mendeleev’s invention of the periodic table. The SU(3) scheme made a number of successful predictions, and was generally accepted following the discovery in 1964 of the spin-3/2 Ω baryon, with the mass predicted by Gell-Mann, who had postulated its existence as the missing member of a tenfold SU(3) family.
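For orientation, the hypercharge label enters through the Gell-Mann–Nishijima relation (a standard formula, quoted here for readers unfamiliar with it rather than taken from any of the papers cited) between a hadron’s charge \(Q\), the third component of its isospin \(I_3\) and its hypercharge \(Y\):
\[ Q = I_3 + \tfrac{1}{2}Y . \]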

No physical particles were initially assigned to the fundamental threefold (triplet) SU(3) family which we now know houses the three lightest quarks. Rajasekaran has reported that when Gell-Mann was lecturing on the ‘eightfold way’, as he called his classification, at a summer school in Bangalore in 1961, Dick Dalitz—who was also speaking at the school—repeatedly asked him why he was ignoring the triplets, but Gell-Mann evaded the question [82]. Two years elapsed before the idea of using the triplet was taken up, and over ten more years then passed before the idea that hadrons are made of quarks and gluons (that hold the quarks together) became generally accepted.Footnote 3 This was because the very idea that hadrons have fundamental constituents had fallen out of favour.

As the number of observed hadrons proliferated in the 1950s and early 1960s, it came to be thought that none enjoys a special status as ‘elementary’, but rather they are made of each other, in a ‘bootstrap model’, in which—it was hoped—their properties would be determined by self-consistency. In 1961, Geoffrey Chew, the leader of this ‘nuclear democracy’ movement, stated [23] that he believed the ‘conventional association of fields with strongly interacting particles to be empty’, and that with respect to strong interactions field theory was not only ‘sterile’ but ‘like an old soldier, is destined not to die but just to fade away’. There was no place for aristocratic quarks in this philosophy, to which many—perhaps most—particle physicists then subscribed.

3 The birth of quarks

At the end of his paper on quarks [44] Murray Gell-Mann wrote ‘These ideas were developed during a visit to Columbia University in March 1963; the author would like to thank Professor Robert Serber for stimulating them’. Gell-Mann and Serber have given differing accounts of what happened.

In his memoirs [87], Serber writes that a couple of weeks earlier, in order to prepare themselves for the colloquium that Gell-Mann was scheduled to deliver, he and his colleagues asked Gian-Carlo Wick to give a talk about the irreducible representations of the SU(3) symmetry group. The next day, it occurred to Serber that he could reproduce Gell-Mann and Ne’eman’s SU(3) families ‘by a low-brow method by considering a particle now called a quark that could exist in three states…The suggestion was immediate: the baryons and mesons were not themselves elementary particles but were made of quarks—the baryons of three quarks, the mesons of quark and anti-quark’. On the day of the talk:

‘Before Murray's colloquium, I took him to lunch at Columbia's Faculty Club and explained this idea to him. He asked what the charges of my particles were, which was something I hadn’t looked at. He got out a pencil and on a paper napkin figured it out in a couple of minutes. The charges would be + 2/3 and −1/3 proton charges – an appalling result. During the colloquium Murray mentioned the idea and it was discussed at coffee afterwards…. Bacqui [sic] Beg … says he recalls that … Murray had said that the existence of such a particle will be a strange quirk of nature, and quirk was jokingly transformed into quark’.

Gell-Mann, on the other hand, recalled many years later [48] that:

‘On a visit to Columbia, I was asked by Bob Serber why I didn’t postulate a triplet of what we would now call SU(3) of flavor, making use of my relation 3 × 3 × 3 = 1+ 8 +8 + 10 to explain baryon octets, decimets, and singlets. I explained to him that I had tried it. I showed him on a napkin (at the Columbia Faculty Club, I believe) that the electric charges would come out + 2/3, - 1/3, - 1/3 for the fundamental objects’.

These accounts can only be reconciled if Gell-Mann’s question to Serber about charges was rhetorical. There seems no reason to doubt that Serber took it to be a question to which Gell-Mann did not already know the answer (although it would hardly have taken him a couple of minutes to figure out). Some years after the event, Gell-Mann told Zweig that ‘Serber hadn’t told him anything he did not already know’ [59], in which case it was generous of him to acknowledge Serber for stimulating his ideas. It would be surprising if Murray had not already thought of using the triplet representation (about which Dalitz quizzed him in 1961), but given his strong support for the bootstrap philosophy and the idea of nuclear democracy, he would presumably have quickly dismissed it.

In any case, Serber’s memoirs continue:

‘A day or two later, it occurred to me that while the quarks’ fractional charges were strange, the magnetic moments would not be. The magnetic moments depend on the ratio of charge to mass. In the nucleon the quark would have an effective mass one-third of the nucleon mass, so the one thirds would cancel out in the ratio and the quark would have integral nuclear magnetic moments. A simple calculation gave the result that the proton would have three nuclear magnetic moments and the neutron would have minus two, values quite close to the observed ones. That convinced me of the correctness of the quark theory. At that point, I should have published; but I never got round to it. Bacqui Beg suggested to me that the reason was that the idea seemed so obvious to me that I thought it must be familiar to the experts in the field. However, it was news to Murray, and sometime later he told Marvin Goldberger that he had never thought of it.’

This is the first recorded use of the quark model as more than a mnemonic or source from which to abstract algebraic relations. While his results were ‘quite close to the observed ones’, the fact that (without reference to masses) the model predicts that the ratio of the magnetic moments is −1.5, in good agreement with the observed value of −1.46, is perhaps more impressive (this result was later derived from SU(6) symmetry by Beg et al. [8], and then by Becchi and Morpurgo [7] using the quark model, without the need for SU(6)).
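For the record, the arithmetic behind both numbers is elementary (a minimal sketch in modern notation, assuming additive quark moments and the standard ground-state spin-flavour wave functions): the quark-model expressions are
\[ \mu_p = \tfrac{1}{3}\,(4\mu_u - \mu_d), \qquad \mu_n = \tfrac{1}{3}\,(4\mu_d - \mu_u). \]
If the moments are proportional to the quark charges divided by a common mass, then \(\mu_u = -2\mu_d\) and \(\mu_p/\mu_n = -3/2\) whatever that mass is; if, in addition, the effective quark mass is taken to be one third of the nucleon mass, \(\mu_u = 2\mu_N\) and \(\mu_d = -\mu_N\) in nuclear magnetons, giving Serber’s values \(\mu_p = 3\mu_N\) and \(\mu_n = -2\mu_N\).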

Gell-Mann’s Physics Letter on quarks [44] was received on 4 January 1964. Paul Frampton has informed me that when he was visiting Caltech some years later, Helen Tuck, Murray’s Personal Assistant, told him that Murray first submitted the paper to Physical Review Letters (which would have been his normal practice) but it was rejected—which made him furious. When it was submitted to Physics Letters, Jacques Prentki at CERN was the editor. Torleif Ericson, another CERN physicist, has told me (private communication, December 2022) that one day he observed a crowd spilling out of Prentki’s office into the corridor. They were discussing what Jacques should do with Murray’s paper, which had got the thumbs down from referees. Torleif recalls that finally Jacques, ‘in his inimitable English accent’, said ‘Murray is a grown up with a reputation to lose. Everybody knows that with 1/3 charges you can get this. If he wants to make a fool of himself, I will let him. So I accept his article.’ Having known Prentki, I am not surprised that he had already considered using the fundamental representation and knew that it would require non-integral charges, but I doubt more than a handful of others had thought about it.

In his Physics Letter, Gell-Mann proposed that algebraic relations should be abstracted from a ‘formal field theory model’ of quarks and used as a constraint on ‘bootstrap’ models. He only considered the possibility that quarks might be real particles in the concluding paragraph, in which he wrote that ‘It is fun to speculate about the way quarks would behave if they were physical particles of finite mass (instead of purely mathematical entities as they would be in the limit of infinite mass)’, pointed out that one would be stable, and concluded that ‘A search for stable quarks…at the highest energy accelerator would help to reassure us of the non-existence of real quarks’.

André Petermann’s paper [80] ‘Properties of Strangeness and a Mass Formula for Vector Mesons’, written in French, was received by Nuclear Physics on 30 December 1963, but it was not published until March 1965 and went almost unnoticed for fifty-five years. It is based on the idea that hadrons are all composed of three constituents and that, adopting Gell-Mann’s nomenclature, the strange quark is heavier than the non-strange quarks. He used this idea to interpret the Gell-Mann–Okubo SU(3)-based relation [42, 72] between baryon masses, went on to derive a relation between the masses of the vector mesons, and then used the quark mass difference he inferred from baryon masses to calculate mass differences between vector mesons. Towards the end of the paper, he wrote of the constituents he proposed: ‘if one wants to keep charge conservation, which is highly desirable, the particles must then have non-integral charges. This is unpleasant, but cannot, after all, be excluded on physical grounds’. This is the quark model.
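To illustrate the kind of reasoning involved (a schematic sketch of the additive constituent picture, not Petermann’s own formulae, and ignoring spin-dependent forces): if the strange constituent is heavier than the non-strange ones by \(\Delta = m_s - m_{u,d}\), then replacing one non-strange constituent by a strange one raises a hadron’s mass by roughly \(\Delta\), so that, for example,
\[ m_{K^*} - m_\rho \;\approx\; m_\Lambda - m_N \;\approx\; \Delta , \]
which is how a mass difference inferred from the baryons can be carried over to the vector mesons.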

His formula relating the masses of the φ, K* and ρ mesons had actually been derived earlier by Okubo [73] on the basis of SU(3) symmetry alone, but Petermann (followed closely by Zweig) was the first to publish an interpretation of mass formulae in terms of constituents and relate meson mass differences to baryon mass differences. It might be suspected that publication of his paper was delayed by referees, but there is no evidence of resubmission or of any revision. Another possible explanation is that he had not returned the proofs, which is plausible in the opinions of those who knew him, and is known to be the reason for a seven-year delay between submission and publication of another of his papers (A De Rujula, private communication).

Petermann (or Peterman as he spelled his name in most of his other publications), who was a recluse, did not follow up his paper, and it seems he only drew attention to it once, when asking Alvaro De Rújula to be his scientific executor.Footnote 4 It was only referenced once, in a 1975 paper on a quark search at the CERN Intersecting Storage Rings, before Alvaro referred to it in 2004 [30], publicised it at the end of a talk by Zweig in the CERN Auditorium in September 2012, and in 2014 published a note about it [31]. This is not surprising: the paper was in French, the title did not reflect its contents, and Zweig’s much more comprehensive treatment had rendered it redundant by the time of its delayed publication.

George Zweig proposed quarks (or aces as he called them), independently of Gell-Mann and of Petermann, while visiting CERN. His work is reported in a 26-page preprint [95] dated 17 January 1964, which was replaced by an 80-page version dated 24 February 1964 [96], and followed up in lectures at Erice [97]. Zweig has written fascinating accounts of his work and its reception [98,99,100,101] by Feynman, Gell-Mann and others, and (as described below) explained why his preprints were never published.Footnote 5

While at CERN, Zweig was supported by grants which paid an overhead to CERN, and provided $1,200 for publication costs. He wanted to publish his work in the Physical Review, but was prevented by the Head of the Theory Division, Leon van Hove (who later, as joint Director General of CERN, played a key role in promoting construction of the proton anti-proton collider). Van Hove told him that outputs from the CERN Theory Division had to be published in European journals, and instructed the theory secretariat not to type any of his papers (this was a real problem for Zweig who could not type and did not have a typewriter, although the late and widely lamented Tanya Fabergé, who for decades ran the Theory Division Secretariat, disobeyed instructions and typed his second preprint). He was scheduled to give a seminar, titled ‘Dealer’s choice: Aces are Wild’, but van Hove took down the announcement and told him ‘You are not allowed to speak at CERN’. To this day, Zweig does not know why he encountered such animus, but suspects it stemmed from his insistence on publishing in the Physical Review, which went counter to van Hove’s efforts to promote European physics.

Zweig was led to discover quarks by the data: in accounts of his work he insists that it was a discovery not an invention. The cornucopia of impressive results in his preprints includes the famous Zweig rule, which provided an explanation for the very surprising fact that φ mesons decay much more frequently into two K mesons than into a ρ and a pi meson. In the quark model, the latter decay involves the annihilation of the constituents of the φ whereas in the former they are rearranged, which Zweig argued would be favoured dynamically. His papers not only laid the foundations of the use of the quark model to describe the properties of hadrons, but foresaw [96] that ‘high momentum transfer experiments may be necessary to detect aces’—which the SLAC deep inelastic scattering experiments later did.

4 Reactions and objections

When Zweig returned from CERN to Caltech, where he had been Feynman’s student, he told Feynman and Gell-Mann about his work. He recalls [101] that Feynman disliked the Zweig rule and espoused the bootstrap view that in the correct theory of strong interactions it would not be possible to say which particles are elementary, while Gell-Mann’s reaction was ‘Oh, the concrete quark model. That’s for blockheads’. Zweig, who is today based at MIT, started to make a transition to neurobiology in 1969: had he remained in the field, he might have received more of the recognition he deserves.

Those who worked on ‘concrete quarks’ in the 1960s thought that they had not been observed because they are very heavy (5 GeV or more), which led to many quark searches, as advocated—in different spirits—by both Gell-Mann and Zweig. I remember Viki Weisskopf, who was then the Director General of CERN, citing the search for quarks as a reason for building the CERN Intersecting Storage Rings in a lecture in Oxford in 1964. In a model with very heavy but tightly bound quarks, the binding energies and hence quark wave functions of (e.g.) a pi meson and the much heavier—but almost equally tightly bound—K meson would be very similar. This resolved the puzzle of how, as required by SU(3) symmetry, particles with such different masses could have otherwise very similar properties. Furthermore, Morpurgo pointed out that, although very tightly bound, quarks could move non-relativistically inside light mesons and baryons, as assumed by quark modellers [69]. This removed one objection that had been raised, but the question of how tightly bound quarks could behave effectively as free particles was not really answered until the advent of QCD.

A far more serious objection was that since quarks must have half-integral spin then, according to the fundamental ‘spin-statistics’ theorem, their wave functions should be anti-symmetric under the exchange of all their labels, or in the usual jargon: they should be fermions, not bosons. The three quarks that form the ground states of the baryons are symmetric under the interchange of their flavour (quark-type) and spin labels, and the theorem therefore required their space wave functions to be anti-symmetric. Anti-symmetry implies spatial variations that lead to large internal kinetic energies, and would normally only be expected for excited states. While not ruled out in principle, an anti-symmetric ground-state space wave function would require a bizarre form for the inter-quark force.

In his preprints Zweig treated quarks as bosons, without comment. He has told me that he was aware of the problem, but assumed that—as the model otherwise worked so well—an explanation would eventually be found. The problem was not mentioned by Petermann or Serber, or by Gell-Mann, although he was aware of it,Footnote 6 as others must have been; he told me many years later that it was his main objection to concrete quarks, as he has written [48].

In fact, Greenberg [50] soon pointed out that the ground state space wave function could be symmetric if quarks obey ‘para-statistics of order three’ rather than Fermi statistics. This suggestion, which relied on a very unfamiliar and seemingly abstract idea, was too radical for most people, and neither it nor Han and Nambu’s related but different model [55], in which three quarks are also replaced by nine, with integral charges, found much favour. Greenberg’s proposal is equivalent to endowing quarks with a new three-valued internal label in which the wave functions of baryons are anti-symmetric, allowing the spatial ground state wave functions to be symmetric.Footnote 7 This label is now called colour and this proposal is known to be correct. However, sceptics were not convinced at the time, and for many years few theorists took the quark model seriously.
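In modern language (my summary, not Greenberg’s notation), the ground-state baryon wave function is taken to be of the form
\[ \Psi \;\propto\; \epsilon_{abc}\, q^a q^b q^c \;\times\; \psi_{\text{space}} \times \chi_{\text{spin}} \times \phi_{\text{flavour}}, \]
where \(a, b, c\) are the three colour labels: the totally anti-symmetric colour factor \(\epsilon_{abc}\) carries the anti-symmetry required by the spin-statistics theorem, so the space, spin and flavour factors can together be symmetric, as the observed spectrum requires.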

5 Hadron spectroscopy

One person who did take concrete quarks seriously was Dick Dalitz [3], who became the leading proponent of the ‘naïve’ quark model, which he first discussed in his influential 1965 Les Houches summer school lectures [27]. In these lectures he gave the first detailed description of the expected spectrum of baryons in which one of the quarks carries one or two units of orbital angular momentum. This spectrum successfully accommodates the particles that were then known and the many others discovered later.Footnote 8

While it was not possible to ignore what Dalitz in September 1965 called the ‘parallelism … between the data and the simple quark model’ [28], it seemed possible, as he conceded, that it ‘might simply reflect the existence of general relationships which would also hold in a more sophisticated and complicated theory of elementary particle stuff’. This hope was strongly encouraged by the discovery, in 1964, that some of the results of the quark model, including for example the ratio between the magnetic moments of the proton and neutron, could be derived by assuming that the underlying laws are unchanged under the simultaneous exchange of the labels that characterise particles’ spins and their SU(3) labels, which—it was proposed—should be combined in the symmetry group SU(6) [54, 85]. This kick-started a well-publicised race to find a ‘final’ theory that incorporated SU(6) in a larger symmetry group that respects Einstein’s theory of relativity. A symmetry group called U-twiddle-twelve was one of the candidates that was proposed, which led Gell-Mann to describe the whole enterprise as ‘twiddle-twaddle’. And so it proved when it was shown that no symmetry can combine internal and space–time degrees of freedom in a non-trivial wayFootnote 9 (except, as was discovered many years later, supersymmetries, which relate bosons to fermions). As Zweig [102] recently said of SU(6), which cannot be a real symmetry, and SU(3), which is exact in the limit of equal light quark masses if electromagnetic interactions are ignored, ‘they must be outputs, not inputs, of a field theory for aces with different masses and spin dependent forces’.

It was by no means obvious that the quark model would be able to accommodate the plethora of new particles then being discovered. The ‘discovery’, in a missing mass experiment at CERN, that the A2 meson is actually two states could have killed the model, which only has a place for one. The group first reported a narrow dip in the centre of the A2 peak in 1965, and in 1967, following improvements in the apparatus, claimed that it had a statistical significance of six standard deviations. Several other experiments observed a split, albeit with much smaller significance (≤ 3 σ). Eventually the effect died, although it still had some life in it as late as 1970 [86], and the quark model survived this and other potential set-backs. Its successes were much appreciated by experimentalists, but theorists’ reactions were at best lukewarm if not antagonistic.Footnote 10

In his introductory talk at the 1966 Berkeley Conference on High-Energy Physics, Gell-Mann had this to say [46] of ‘three hypothetical and probably fictitious quarks’:

‘It’s hard to see how deeply bound states of such heavy real quarks could look like \(\overline{\mathrm{q}}\mathrm{q}\), say, rather than a terrible mixture of \(\overline{\mathrm{q}}\mathrm{q}\,\overline{\mathrm{q}}\mathrm{q}\,\overline{\mathrm{q}}\mathrm{q}\) and so on …the idea that mesons and baryons are made primarily of quarks is difficult to believe, since we know that in the sense of dispersion theory, they are mostly, if not entirely, made up of each other. The probability that a meson consists of a real quark and an anti-quark pair rather than two mesons or a baryon and an antibaryon must be quite small. Thus it seems that whether or not real quarks exist, the \(\mathrm{q}\) and \(\overline{\mathrm{q}}\) we have been talking about are mathematical entities.’ Gell-Mann had clearly not accepted quarks as fundamental degrees of freedom in a field theory that describes all hadrons.

Later in the meeting he walked out when Dalitz, as rapporteur on Strong Interactions and Symmetries, based most of his talk [29] on the quark model with anti-symmetric ground state space wave functions for baryons (which he abandoned the next year). Dalitz noted that we faced ‘an unfamiliar situation, but one which has much qualitative correspondence with the experimental data… The [hadrons] should be regarded as rather analogous to molecules whose constituent atoms are quarks. Such a … model appears especially unfamiliar in terms of the conventional ideas of field theory today … if it works well, then it will be the task of field theory to show how such a model can arise from … some field theory.’

After his talk, Maglić (who had ‘discovered’ the split A2) asked Dalitz ‘Does this model rely on the existence of physical quarks?’. He replied ‘… if there do not exist real particles, this model has no interest’. Whether or not confined quarks are ‘real particles’ can be debated, but a field theory (QCD) was eventually discovered that provides a basis for the naïve quark model.

6 Deep inelastic electron scattering

At the same conference, in the discussion after his talk on Electromagnetic Interactions, Sid Drell (the Deputy Director of SLAC) said [32] that he would ‘very much like to see inelastic electron or muon cross sections measuredFootnote 11…Also there are some sum rules, asymptotic statements derived by Bjorken and others, as to how these inelastic cross sections behave in energy, … which can be checked experimentally’. This wish was fulfilled in September 1968 when Jerry Friedman, speaking on behalf of the Friedman-Kendall-Taylor group, presented the first results of the classic SLAC-MIT ‘deep inelastic’ electron scattering experiments at the International Conference on High-Energy Physics in Vienna (Friedman has given an account of the work of this group [38]). He reported that the scattering cross sections were much larger than generally expected, and that—to first approximation—the dimensionless ‘structure functions’ that characterise the cross-section depend only on the dimensionless variable \(\nu/q^2\) (which is defined below). The group realised that their results were suggestive of scattering from point-like objects, but decided by a vote that, against his wishes, Jerry should not say so (Friedman, private communication 2019). Wolfgang (‘Pief’) Panofsky, the Director of SLAC, was not aware of the vote and, speaking as rapporteur [74], said that ‘theoretical speculations are focused on the possibility that these data might give evidence on the behaviour of point-like, charged structures within the nucleon … The apparent success of the parametrization of the cross sections in the variable \(\nu/q^2\) in addition to the large cross section itself is at least indicative that point-like interactions are becoming involved.’

The SLAC results surprised almost everyone, but there were many further twists and turns before the interpretation suggested by Panofsky was generally accepted and a consensus finally emerged that, together with complementary measurements of neutrino scattering at CERN, the experiments proved the existence of quarks. One person who was not surprised was James (Jim) Bjorken—universally known as bj. Already in 1966 he had inferred, from a sum rule that he ‘derived’ (the reason for the quotes will be explained later) for electron scattering from polarised targets [12], [13], that inelastic scattering must be ‘comparable to scattering off point-like charges’. Bj went on to show, using the same techniques, that at high energy E the total cross section for electron–positron annihilation to hadrons should vary as \(1/E^2\), like that for annihilation into a muon and an anti-muon, and that the total cross section for neutrino scattering on protons and neutrons would be proportional to E.

Then in his 1967 Varenna summer school lectures [14] and in his talk at the 1967 SLAC conference Bjorken [15] discussed the sum rule derived by Adler (from Gell-Mann’s algebra of currents) for the difference between neutrino and anti-neutrino scattering on protons and neutrons, noting that ‘This result would also be true were the nucleon a point-like object, because the derivation is a general derivation. Therefore the difference of these two cross sections is a point-like cross section, and it is big.’ He then provided the following physical picture:

‘We assume that the nucleon is built of some kind of point-like constituents which could be seen if you could really look at it instantaneously in time ... If we go to very large energy and large \(q^2\)... we can expect that the scattering will be incoherent from these point-like constituents. Suppose ... these point-like constituents had isospin one-half ... what the sum rule says is simply [N ↑] − [N ↓] = 1 for any configuration of constituents in the proton.Footnote 12 This gives a very simple-minded picture of this process which may look a little better if you really look at it, say, in the center-of-mass of the lepton and the incoming photon. In this frame the proton is ... contracted into a very thin pancake and the lepton scatters essentially instantaneously in time from it in the high energy limit. Furthermore the proper motion of any of the constituents inside the hadron is slowed down by time dilation. Provided one doesn’t observe too carefully the final energy of the lepton to avoid trouble with the uncertainty principle, this process looks qualitatively like a good measurement of the instantaneous distribution of matter or charge inside the nucleon’

This is the physical picture that underlies the parton model, although the name was provided later by Feynman who developed it independently, as a basis for understanding proton-proton scattering[36].

The SLAC experiments measured the total cross section for scattering ‘virtual’ photons from a proton or neutron. If the target is not polarised, the cross section can be expressed in terms of two dimensionless ‘structure’ functions. They depend on the energy and momentum that the virtual photon transfers from the incoming electron to the target, which can be combined to form the relativistic four-component momentum vector \(q\). The structure functions depend on the virtual photon’s mass squared (\(q^2\)) and a dimensionless variable \(x = q^2/2\nu\), where \(\nu\) is the four-dimensional scalar product \(q \cdot p\), \(p\) being the four-momentum of the target; in the target’s rest frame \(q \cdot p\) is equal to the energy transfer times the mass of the target. Viewed in a frame of reference in which the target proton or neutron is moving rapidly, \(x\) can be interpreted as the fraction of its momentum carried by the quark that is struck by the virtual photon, as pointed out by Feynman.
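For orientation, here is a minimal sketch of the kinematics in the laboratory frame (my notation, not taken from the SLAC papers), neglecting the electron mass and using the sign convention in which \(q^2\) is positive for the space-like photons exchanged in scattering: if \(E\) and \(E^{\prime}\) are the incoming and outgoing electron energies, \(\theta\) the scattering angle and \(M\) the nucleon mass, then
\[ q^2 = 4 E E^{\prime} \sin^2(\theta/2), \qquad \nu = q \cdot p = M\,(E - E^{\prime}), \qquad x = \frac{q^2}{2\nu} = \frac{q^2}{2 M (E - E^{\prime})} . \]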

In 1968 (published 1969), Bjorken ‘derived’ the result that as \(q^2 \to \infty\) these structure functions become non-vanishing functions of \(x\) alone [16]. This result, known as Bjorken scaling, would necessarily be true if no mass scales played a role at large \(q^2\), as implied by the original ‘naïve’ parton model. ‘Derived’ is in quotes because the methodsFootnote 13 that bj and others employed were soon found to be invalid in perturbation theory [2, 58]. In field theories, the scale (\(\mu\)) at which the coupling constant is defined plays a role, and scaling is violated by powers of \(\log(q^2/\mu^2)\), although scaling and other results derived by the methods used by bj were later shown to survive to leading order in QCD.
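In modern notation (my summary), with the two structure functions conventionally denoted \(F_1\) and \(F_2\), the scaling statement is
\[ F_{1,2}(x, q^2) \;\longrightarrow\; F_{1,2}(x) \qquad \text{as } q^2 \to \infty \text{ with } x \text{ fixed}, \]
whereas naive perturbation theory in an interacting field theory generates corrections that grow like powers of \(\log(q^2/\mu^2)\) order by order.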

At the end of his paper on scaling, bj wrote that ‘a more physical interpretation of what is going on is, without question, needed’! Having had partons without explicit scaling, he then had scaling without partons: while surprising with hindsight, this reflected the theoretical uncertainty that then prevailed. The data were soon found to exhibit approximate scaling, as Friedman and Panofsky reported in their Vienna conference talks in September 1968.

Meanwhile, in the words taken from a slide made by Marty Breidenbach [19], who as an MIT graduate student participated in the SLAC deep inelastic experiments:

  • Many of us did not understand bj’s current algebra motivation for scaling

  • Feynman visited SLAC in August 1968. He had been working on hadron-hadron interactions with point like constituents called partons. We showed him the early data on the weak q2 dependence and scaling—and he (after a night in a local dive bar) explained the data with his parton model.

  • In an infinite momentum frame, the point like partons were slowed, and the virtual photon is simply absorbed by one parton without interactions with the other partons—the impulse approximation.

  • This was a wonderful, understandable model for us.

It seems that bj’s 1967 use of the impulse approximation had not been taken on board by his experimental colleagues, and that it was Feynman who provided the connection between the physical picture they both developed and Bjorken scaling, and the parton interpretation of the scaling variable x as the fraction of the parent nucleon’s momentum that is carried by the parton that is struck by the virtual photon. In his own later account Bjorken [17], referring to Feynman’s visit to SLAC, wrote that the period 1966–71 was divided into ‘BF (Before Feynman) and AF (After Feynman)’, adding disarmingly that ‘the way he (Feynman) described the infinite-momentum constituent picture so familiar now was somewhat foreign, and seemingly naive. Retrospectively, there was nothing naive about it. I was hampered by my own flawed version of the constituent viewpoint, where for half of the argument I would use infinite-momentum thinking, and for the other half retreat to the proton rest frame’.

In 1969, Bjorken and Paschos [10, 11] constructed the first explicit quark-parton model. They wrote that ‘the important feature of this model, as developed by Feynman, is its use of the infinite-momentum frame of reference’ and thanked Feynman for discussions, with no reference to bj’s 1967 papers in which he already employed the infinite-momentum frame! In the same year Callan and Gross [21] showed, using formal methods, which later turned out to be correct to leading order in QCD, that the ratio of deep inelastic cross sections for scattering of longitudinal and transverse virtual photons is zero in models in which currents are built of spin 1/2 objects, such as quarks, and infinity in models in which they are built of bosonic fields. The first measurements, published in September 1970, found a ratio of 0.2 ± 0.2, which was encouraging for supporters of the quark model.
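In modern notation (again my summary, not the original formulation), the Callan–Gross result for spin-1/2 constituents can be written as
\[ R_{L/T} \;\equiv\; \frac{\sigma_L}{\sigma_T} \;\to\; 0 \qquad \Longleftrightarrow \qquad F_2(x) = 2x\,F_1(x), \]
so the measured value \(0.2 \pm 0.2\) quoted above was consistent with the struck constituents having spin 1/2.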

7 Neutrino as well as electron scattering

Very soon after the presentation of SLAC’s first deep inelastic electron data, Don Perkins and his colleagues realised that the neutrino data they had obtained in much less precise and lower-energy heavy-liquid bubble chamber experiments at CERN in 1963–67 were consistent with the point-like behaviour observed at SLAC. Don reported this at a conference on Weak Interactions at CERN in 1969 [75], and he and Myatt [70] subsequently published a more detailed analysis of the data.Footnote 14 The more precise results obtained with the much larger Gargamelle bubble chamber, which came into operation in 1972, are discussed later.

Meanwhile, early in 1969, David Gross gave a talk at CERN on theoretical approaches to deep inelastic scattering. I was then working with John Bell, of inequality fame, on nuclear effects in neutrino scattering on nuclei, and was interested in neutrinos. To test my understanding of David’s talk, I applied the parton model to deep inelastic neutrino scattering and asked him if my results were right. It turned out nobody had done this (although I believe bj was doing it). Using formal methods, found later to be valid to lowest order in QCD, we found a sum rule that measures the difference between the number of quarks and anti-quarks in the nucleon (i.e. three times its baryon number) [52], which clearly provides a critical test for the quark model. We were aware that the methods we used fail in perturbation theory, but blithely remarked that there was no reason to believe that field theory is relevant, because—since the SLAC experiments seemed to exhibit Bjorken scaling—it is contradicted by experiment!
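In its modern form (my paraphrase of [52], not a quotation) the sum rule reads
\[ \int_0^1 F_3(x)\,\mathrm{d}x \;=\; 3 , \]
where \(F_3\) is the parity-violating structure function, suitably averaged over neutrino and anti-neutrino beams and over proton and neutron targets: the integral counts the number of quarks minus the number of anti-quarks in the nucleon, up to QCD corrections that are calculable in perturbation theory.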

In 1970 it was found at SLAC that the structure functions of neutrons and protons differ, in a way that is consistent with the quark model. This ruled out some alternatives, but others were more resilient. For example, I had found [64] that according to the quark-parton model the ratio of electron and neutrino structure functions is 5/18 if the contributions of strange quark-anti-quark pairs are neglected. This looks like, and is, a good way to test the model and measure quark charges, but it was also predicted by ‘generalised vector dominance’ models of deep inelastic scattering, and follows from assuming that the ratio of scattering of isospin 0 and isospin 1 virtual photons is 1/9, a result that was known to be approximately true for real photons. In the same paper, I pointed out that the area under the proton structure function would be 1/3 in a model with just three quarks and greater than 2/9 if a uniform sea of quark-anti-quark pairs was added (as in the Bjorken-Paschos model). This was hard to reconcile with the value 0.18 measured at SLAC, but (I wrote) could ‘easily be reduced by adding a background of neutral constituents (which could be responsible for binding quarks)’. Not long after, I ‘derived’ a sum rule [65, 66] that provided a way to measure the fraction of a nucleon’s momentum that is carried by such neutral gluons, by combining electron and neutrino scattering data, which (according to the limited neutrino data available in 1971) turned out to be greater than or equal to 0.52 ± 0.38.
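The 5/18 is just the mean squared quark charge (a sketch in parton-model notation, for an isoscalar target and neglecting strange quarks): electron scattering weights each flavour by its squared charge, while charged-current neutrino scattering weights the flavours uniformly, so
\[ \frac{F_2^{eN}}{F_2^{\nu N}} \;=\; \frac{1}{2}\big(e_u^2 + e_d^2\big) \;=\; \frac{1}{2}\Big(\frac{4}{9} + \frac{1}{9}\Big) \;=\; \frac{5}{18} . \]
Similarly, since \(\int_0^1 F_2^{\nu N}(x)\,\mathrm{d}x\) measures the total fraction of the nucleon’s momentum carried by quarks and anti-quarks, the fraction carried by neutral gluons is, in this picture, \(1 - \int_0^1 F_2^{\nu N}(x)\,\mathrm{d}x\).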

8 Quarks come out of the closet

Use of the quark model was by then gradually becoming ‘politically correct’. Its foundations were strengthened when it was found in 1969 that, thanks to the so-called chiral anomaly [1, 9], the rate at which the π0 meson decays into two photons can be calculated exactly and that the quark model gives the right result provided quarks have three colours. It was encouraging that Feynman, arriving late at the party, came out in support of the model as a co-author of a 1971 paper [37] that inter alia used it to derive (actually re-derive [25, 26]) results related to photo-production.
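For completeness, the standard form of the anomaly prediction (a modern statement, not taken from the original papers) is
\[ \Gamma(\pi^0 \to \gamma\gamma) \;=\; \Big[N_c\,(e_u^2 - e_d^2)\Big]^2 \frac{\alpha^2\, m_\pi^3}{64 \pi^3 f_\pi^2}, \]
with \(f_\pi \approx 92\) MeV; since \(e_u^2 - e_d^2 = 1/3\), the rate scales as \((N_c/3)^2\), and \(N_c = 3\) gives roughly 7.7 eV, in agreement with experiment, whereas a single colour would give a rate nine times too small.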

Nevertheless, many theorists remained sceptical. Gell-Mann’s later repeated claims (made particularly strongly in 1997 [48]) that when he had referred to quarks as ‘mathematical entities’ he meant that they were confined are hard to reconcile with his statements at this time (and earlier—see quotations collected by Zweig [98,99,100,101]). In a talk at the 1971 Coral Gables Conference on work done with Fritzsch [47], he stated that the results that had been ‘derived’ for deep inelastic scattering ‘… are easy to accept if we draw our intuition from certain [quark] field theories with naïve manipulation of operators. However, detailed calculations using the renormalized perturbation theory expansions in renormalizable field theories do not reveal any of these sorts of behavior… If we accept the conclusions, therefore, we should probably not think in terms of perturbation expansion, but conclude, so to speak, that Nature reads books on free field theory as far as the Bjorken limit is concerned.’

While the final phrase provides a possible escape route, quarks cannot be real but confined (by what?) in a free field theory. Gell-Mann was not yet ready to entertain the idea that gluons are also needed, as he said explicitly when he spokeFootnote 15 about his work with Fritzsch at SLAC in early 1971.

9 Advent of QCD

By 1972 Gell-Mann’s position had changed. He and Fritzsch [39] discussed abstracting enough information about colour singlet operators from the quark-vector gluon model to describe all the degrees of freedom that are present. They went on to say ‘Now the interesting question has been raised lately whether we should regard the gluons as well as the quarks as being non-singlets with respect to colour (ref: J Wess, private communication to B Zumino). For example, they could form a colour octet of neutral vector fields obeying the Yang-Mills equations’, which is of course the case in QCD.

In fact, already in 1966, Nambu had suggested [71] that the additional SU(3) symmetry in the Han-Nambu three-triplet model should be coupled to eight non-Abelian gauge fields. In the same year, Greenberg and Zwanziger pointed out [51] that in non-relativistic three-triplet and three para-quark models, it would be natural for the lowest lying baryons to contain exactly three triplets. This work was largely forgotten when the three-triplet model and references to para-quarks fell out of fashion, but it was followed up in 1973 by Lipkin who noted [60] that ‘interactions of the type produced by the exchange of gauge vector bosons classified in an octet of the SU(3) group, sometimes called color’ would promote coloured states to higher energy than colour singlets in a non-relativistic model. Finally, in 1973, Fritzsch, Gell-Mann and Leutwyler wrote their celebrated paper [40] on QCD, which—together with the discovery that such theories are asymptotically free [81], [53]—completed the theoretical foundations of the quark model.

In QCD the powers of \(\log(q^2/\mu^2)\) that violate Bjorken scaling in all but the lowest order of perturbation theory combine to turn the strong coupling ‘constant’ into a function of \(q^2\), which vanishes like \(1/\log(q^2/\mu^2)\) at large \(q^2\). This gives rise to (calculable) scaling violations, which were first observed two years later at SLAC. The physical picture (due to Ken Wilson) is that the resolution with which the virtual photon probes the structure of the nucleon increases with \(q^2\), and what looked like a quark at one scale may be found to consist of a quark plus a gluon, or a quark plus a gluon and a quark-anti-quark pair, etc., as the resolution increases. This shifts the observed quark momentum spectrum to lower \(x\) as \(q^2\) increases and more constituents come into play. At low \(q^2\), the strong coupling ‘constant’ becomes very (perhaps infinitely) large, and it can be convincingly argued (if not rigorously proved) that only ‘colour singlets’—in the case of baryons, states that are anti-symmetric in the colour variable—can exist as free particles, while quarks should be forever confined.
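The familiar one-loop expression (a standard result, quoted here for orientation) is
\[ \alpha_s(q^2) \;=\; \frac{12\pi}{(33 - 2 n_f)\,\ln(q^2/\Lambda^2)} , \]
where \(n_f\) is the number of quark flavours and \(\Lambda\) is the QCD scale parameter, so the coupling falls off logarithmically at large \(q^2\) and grows without bound as \(q^2\) approaches \(\Lambda^2\) from above.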

In parallel, experimental support for the quark-parton picture was growing. In particular, in 1972, preliminary neutrino data from Gargamelle provided ‘an astonishing verification of the Gell-Mann/Zweig quark model of hadrons’, in the words used by Perkins in a review talk at the International Conference on High Energy Physics [77]. By the time of the 1973 Hawaii Conference, at which I was one of four lecturers together with Dick Feynman, Don Perkins and Douglas Morrison, anti-neutrino data had become available from Gargamelle. Together with the neutrino data and the SLAC electron data, they provided good evidence that protons and neutrons are composed of point-like spin-1/2 particles with third-integral charges and baryon number one third (i.e. quarks) and neutral gluons [Perkins 1973].

But the ‘discovery’, later shown to be erroneous, of the so-called high-y anomaly in one of the National Accelerator Laboratory (now known as Fermilab) neutrino experiments, which suggested that the quark picture might be wrong or that new constituents would have to be invoked, cast a shadow. It turned into an ominous dark cloud in 1973 when measurements of electron–positron annihilation into hadrons at the Cambridge Electron Accelerator [61] at an energy of 4 GeV found a cross section that was much larger than expected. A further measurement [91] at 5 GeV, together with data at other energies with much smaller errors from the SPEAR collider at SLAC (presented for the first time by Burt Richter [Richter, 1964] at the 1975 International Conference on High-Energy Physics), showed that, rather than falling like \(E^{-2}\) as predicted by Bjorken in 1966, the cross section was constant. This major puzzle attracted huge attention and stimulated a lot of speculation, as discussed by Richter and Ellis [33, 83].

The cloud went away when the J/Ψ particle, discovered at Brookhaven (J) and by SPEAR (Ψ) in November 1974, was correctly interpreted as a bound state of a charm and anti-charm quark. In early 1975 a Brookhaven bubble chamber experiment reported the observation of one event [22] which ‘with the caveat associated with one event’, was ‘strongly indicative of charmed baryon production’, and in 1976 several charmed mesons were found at SPEAR [49, 79]. The ‘constant’ electron–positron annihilation cross-section was understood to be nothing more than a pause in the fall with energy as the threshold for charm particle production was crossed, and the annihilation rate changed from that expected with three flavours of quarks, each with three colours, to that expected with four flavours. From that point onwards, the quark model was finally almost universally accepted.
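The counting behind that last sentence is simple (my illustration of the standard parton-model result): in the quark-parton picture
\[ R \;\equiv\; \frac{\sigma(e^+e^- \to \text{hadrons})}{\sigma(e^+e^- \to \mu^+\mu^-)} \;=\; N_c \sum_q e_q^2 , \]
which for three colours gives \(3(\tfrac{4}{9} + \tfrac{1}{9} + \tfrac{1}{9}) = 2\) below the charm threshold and \(3(\tfrac{4}{9} + \tfrac{1}{9} + \tfrac{1}{9} + \tfrac{4}{9}) = \tfrac{10}{3}\) above it, so the cross section settles onto a higher copy of the expected \(1/E^2\) fall-off once charm production is fully open.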

10 Concluding remarks

In the following years QCD also became accepted, as evidence for the expected scaling violations accumulated, the three-jet (quark + anti-quark + gluon) events anticipated [34] in electron–positron annihilation were found in the TASSO experiment [92] at DESY (Söding has reviewed the history of this discovery [89]), providing direct evidence for the existence of gluons, and ‘perturbative QCD’ was successfully applied to many high-energy processes. In parallel, more quarks were discovered, and—ironically—it was realised that the SU(3) symmetry that started it all simply reflects the fact that three quarks are much lighter than the others and has no deep significance.

I have written this history to try to ensure that the role of Petermann is not overlooked, that Zweig and Bjorken get the recognition they deserve, and to attempt to clarify the role of Serber. I also wanted to recall the contemporary atmosphere which led most theorists to reject the quark model in the 1960s, despite the fact that it explained much of the data. What the British call the Whig view of history, which presents the past as an inevitable progression towards a more enlightened future, may be a useful pedagogical device in scientific textbooks, but the history of physics involves misunderstandings and confusions. In science, the judgement of experiment ensures that greater enlightenment eventually emerges, although—as in the case of the quark model—it can be a slow process.

As for Gell-Mann, he had very good reasons for initially insisting that quarks are mathematical entities, although in retrospect it is surprising that he stuck to his guns for so long. In any case, although his opposition to the naïve quark model was unsettling for its proponents, I doubt that it held back its acceptance significantly. If not the sole father, he was certainly the grandfather of the quark model, and as a proposer of QCD was involved in the final theoretical chapter. His many other achievements were stupendous: adapting Shakespeare, he bestrode mid-twentieth century theoretical particle physics like a colossus.