A 750 GeV Diphoton Signal from a Very Light Pseudoscalar in the NMSSM

The excess of events in the diphoton final state near 750 GeV observed by ATLAS and CMS can be explained within the NMSSM near the R-symmetry limit. Both scalars beyond the Standard Model Higgs boson have masses near 750 GeV, mix strongly, and share sizeable production cross sections in association with b-quarks as well as branching fractions into a pair of very light pseudoscalars. Pseudoscalars with a mass of ~ 210 MeV decay into collimated diphotons, whereas pseudoscalars with a mass of ~ 500-550 MeV can decay either into collimated diphotons or into three pi^0 resulting in collimated photon jets. Various such scenarios are discussed; the dominant constraints on the latter scenario originate from bounds on radiative Upsilon decays, but they allow for a signal cross section up to 6.7 fb times the acceptance for collimated multiphotons to pass as a single photon.


Introduction
In December 2015 the ATLAS and CMS collaborations have reported excesses in the search for resonances decaying into pairs of photons for diphoton invariant masses around 750 GeV [1,2]. In ATLAS, excesses appeared in the two M γγ bins 710-750 GeV (14 events vs. 6.3 expected) and 750-790 GeV (9 events vs. 5.0 expected), with a local significance of 3.9 σ (assuming a large width of ∼ 45 GeV; 3.6 σ in the narrow width approximation). In CMS, excesses appear in the M γγ bin 750-770 GeV for photons in the EBEB category (5 events vs. 1.9 expected) and EBEE category (6 events vs. 3.5 expected), but less in the bin 730-750 GeV (4 events vs. 2.1 expected for photons in the EBEB category, 1 event vs. 4.0 expected for photons in the EBEE category, considered as less sensitive). The local significance of the excesses is 2.6 σ for CMS in the narrow width approximation.
The global significances of the signals of O(2 − 3 σ) are not overwhelming and compatible with statistical fluctuations. Still, the fact that the region of invariant diphoton masses is very similar for ATLAS and CMS has stirred quite some excitement resulting in a huge number of possible explanations. (The number of proposed models exceeds the number of observed signal events.) Fits to the combined data should, in principle, also consider the informations from diphoton searches at 8 TeV [3,4] where a mild excess was observed by CMS. However, the extrapolation of signal cross sections from 8 to 13 TeV depends on the assumed production mechanism [5][6][7][8][9]. Assuming the production of a resonance around 750 GeV by gluon fusion (ggF), combined fits to the signal cross sections at 13 TeV are in the range 2-10 fb [5][6][7]9], with slightly better fits and a larger signal cross section assuming a larger width of [30][31][32][33][34][35][36][37][38][39][40][41][42][43][44][45]6,9].
It is notoriously difficult to construct a consistent model for such a resonance "X": Its production channel in proton proton collisions is typically assumed to be ggF through loops of colored particles. If these are the quarks of the Standard Model (SM), X would decay into them leaving little branching fraction for X-decays into γγ, which has to be generated by loop diagrams as well.
Accordingly simple two Higgs doublet (or MSSM) extensions of the Standard Model, which could contain a resonance X near 750 GeV [10][11][12][13][14][15][16][17][18][19][20][21][22][23], require additional scalars or vector-like fermions whose loops generate the coupling of X to gluons and/or γγ (unless R-parity is broken [24,25]). Large Yukawa couplings are required for a sufficiently large cross section, which risk to generate new hierarchy problems/Landau singularities (unless compositeness is invoked). Also in the Nextto-Minimal supersymmetric extension of the Standard Model (NMSSM) it has been argued [26][27][28] that additional vector-like quark superfields have to be introduced. In [29] a two-step decay cascade involving the two pseudoscalars of the NMSSM with masses of about 750 GeV and 850 GeV has been proposed which requires, however, to tune the corresponding mixing angle close to 0.
A different approach towards an explanation of the diphoton events is to consider that a single photon in the detector can represent a collimated bunch of photons (typically two of them) which originate from a single very light state, for instance a light pseudoscalar A [5,[30][31][32][33][34][35][36][37]. Then the observed processes correspond to an initial resonance X decaying into a pair A A, where M A must be well below 1 GeV for the resulting photons to be sufficiently collimated (see below). This scenario opens the possibility to explain the diphoton events in different models which can accomodate resonances X and a light pseudoscalar A. In this paper we show that the simple Z 3 -invariant NMSSM belongs to this class of models. (This has also been observed in [38]. ) In the NMSSM (see [39,40] for reviews), two CP-even Higgs states beyond the Standard Modellike Higgs (subsequently denoted as H SM ) can play the role of a resonance X. In terms of weak eigenstates, a singlet-like state S can have a large coupling to a pair of mostly singlet-like pseudoscalars A 1 , originating from a cubic singlet self coupling κ in the superpotential (see below). How-ever, a coupling to quarks or gluons inside protons has to be induced by a mixing of S with one of the two SU(2) doublet-like Higgs states. If this state is H SM , the mixing reduces the couplings of H SM to SM particles (notably W ± and Z) and is severely constrained [9] by the measured signal rates by ATLAS and CMS [41]. An alternative is that S mixes strongly with the other "MSSM"like CP-even state H. Then the physical eigenstates -preferably both of them with masses near 750 GeV -can profit from an enhancement of the couplings of H to b-quarks by tan β, leading to sufficiently large signal cross sections into the A 1 A 1 (and hence diphoton) final state via associated production with b-quarks. Given the diphoton mass resolution of the detectors and the slightly preferred large width of the excess it is clear that two (narrow) CP-even states near 750 GeV, mixtures of H and S, can also provide a good fit to the data. (A similar scenario has been discussed in [42].) For one of the benchmark points presented below (BP1) the signal originates, however, from one CP-even state only, the other one being significantly heavier.
A light pseudoscalar can appear in the NMSSM in the form of a pseudo-Goldstone boson (PGB). A priori two global symmetries can lead to such PGBs: First, a Peccei-Quinn symmetry emerges in the limit κ → 0 [40,[43][44][45]. However, κ = 0 is required for the couplings of the heavy Higgs states to A 1 A 1 . Second, the scalar potential of the NMSSM is invariant under an R-symmetry [40,[46][47][48] if the soft supersymmetry breaking trilinear couplings A λ and A κ vanish, leading to a PGB due to its spontaneous breakdown by the phenomenologically required vacuum expectation values. We find indeed, that the interesting part of the parameter space of the NMSSM corresponds to small values of A λ and A κ . However, since the R-symmetry is broken by radiative corrections to the scalar potential involving the necessarily non-vanishing gaugino masses and trilinear couplings A t and A b , it helps only partially to explain a very light pseudoscalar A 1 . Still, it represents a "go-theorem" showing that a standard supersymmetric extension of the SM -without additional vector-like quarks and/or leptons -could explain the observed diphoton excess.
Different assumptions on the mass of A 1 can be made. For one set of scenarios we assume M A 1 ∼ 210 MeV, just below the 2 µ threshold. These scenarios lead to visibly displaced vertices from the A 1 → γγ decays. For a large value of the NMSSM trilinear coupling κ ∼ 1.65, the signal can originate from a single Higgs state near 750 GeV. For smaller values of κ, the signal can originate from two Higgs states with masses near 750 GeV. For another set of scenarios we assume M A 1 ∼ 510 − 550 MeV, not far from the η mass. For M A 1 near 550 MeV, A 1 mixes with the η meson and inherits its decays into γγ and 3π 0 ; the latter lead to photon-jets. The average separation in rapidity of the diphotons and the two leading photons from 3π 0 will be studied. For M A 1 near 510 MeV, constraints from searches for radiative Υ(1S) decays into γ + η by CLEO [49] are alleviated, but estimates of the A 1 decay widths are more uncertain. But in both cases the A 1 life time is short enough avoiding macroscopically displaced vertices, and two Higgs states near 750 GeV can generate a signal.
In the next section we describe with the help of analytic approximations to the mass matrices (including only the dominant radiative corrections) which region in the parameter space of the NMSSM can generate the diphoton events. In section 3 we discuss various constraints from low energy physics on light pseudoscalars, and discuss separately the different scenarios. Benchmark points are presented with the help of the public Fortran code NMSSMTools [50,51]. For the different A 1 masses we study the average separation in rapidity of the diphotons and the two leading photons from 3π 0 , which allows to estimate the corresponding acceptances. In the final section 4 we summarize and discuss possible alternative signatures, which could help to distinguish different scenarios if the excess survives the next runs of the LHC.

Parameter regions with diphoton-like events at 750 GeV in the NMSSM
We consider the CP-conserving Z 3 -invariant NMSSM. The superpotential of the Higgs sector reads in terms of hatted superfields Once the real component of the singlet superfieldŜ develops a vacuum expectation value (vev) s, the first term in W Higgs generates an effective µ term The soft SUSY-breaking terms consist of mass terms for the gaugino, Higgs and sfermion fields as well as trilinear interactions between the sfermion and the Higgs fields, including the singlet field The tree level scalar potential can be found in [40], from which the 3 × 3 mass matrices in the CP-even and CP-odd sectors can be obtained. Once the soft Higgs masses are expressed in terms of M Z , tan β and s using the minimization equations of the potential, the mass matrices depend on the six parameters λ, κ, tan β = v u v d , µ, A λ and A κ .
Initially, the CP-even mass matrix M 2 S is obtained in the basis of the real components (H d,r , H u,r , S r ) of the complex scalars (H d , H u , S) after expanding around the vevs v d , v u and s. It is convenient, however, to rotate M 2 S by an angle β in the doublet sector sector into M 2 S in the basis H SM , H , S r : where v 2 = 2M 2 Z /(g 2 1 + g 2 2 ) ∼ (174 GeV) 2 and is the mass squared of the MSSM-like CP-odd state A M SSM . ∆ rad denotes the dominant radiative corrections due to top/stop loops, where m 2 ST = m Q m T and X t = A t − µ/ tan β. As discussed in the introduction, we intend to describe the diphoton signal at ∼ 750 GeV by a mixture of the two states H and S r . Then, both diagonal matrix elements M 2 S,22 and M 2

S,33
should have values close to (750 GeV) 2 . Furthermore we will be interested in the R-symmetry limit A λ , A κ → 0. This implies the relations (for tan 2 β 1) and The matrix element inducing H − S r mixing is given by Next we turn to the CP-odd sector. The 3×3 CP-odd mass matrix contains always a Goldstone boson which will be eaten by the Z boson. The remaining CP-odd states are a singlet A S , and the "MSSM"-like SU(2)-doublet A M SSM . In the basis (A M SSM , A S ), in the R-symmetry limit A λ , A κ → 0, the CP-odd mass matrix is given by . (2.14) Obviously M 2 A has a vanishing eigenvalue M A 1 = 0, and is diagonalised by an angle α with (for tan 2 β 1) An important quantity will be the (reduced) coupling X d of A 1 to down quarks and leptons, which is obtained through the mixing of A S with A M SSM . Since the reduced coupling of the MSSM-like state A M SSM is given by tan β, one obtains Radiative corrections to the tree level potential and hence to the CP-odd mass matrix include terms proportional to the electroweak gaugino masses M 1 and M 2 , and terms proportional to the soft SUSY breaking trilinear couplings A t and A b . These corrections break the R-symmetry present for A λ , A κ → 0, which is expected since A λ , A κ = 0 is not invariant under scale transformations. Hence, depending on the scale where A λ , A κ = 0 is assumed to hold, A 1 is a pseudo-Goldstone boson with a mass of typically a few GeV. For A κ small, but = 0 one can obtain M A 1 ∼ 210 MeV or M A 1 ∼ 510 − 550 MeV as it will be assumed in the next section.
Finally we note that, for the parameter region considered below, the dominant contribution to the coupling of A 1 to scalars originates from the quartic coupling ∼ κ 2 |S| 4 → 2κ 2 S 2 r A 2 1 . After shifting S r by its vev s one obtains Next we observe that eqs. (2.11) and (2.16) allow to express κ in terms of X d : From (2.11) one finds 750 GeV ∼ 2κs = 2κ where (2.16) was used in the last step. Inserting v ∼ 174 GeV one obtains In the next section, for the scenarios with M A 1 ∼ 510 − 550 MeV, we will obtain upper bounds on X d from upper bounds for the BR(Υ(1S) → γη) from CLEO [49]. These will thus imply upper bounds on κ according to (2.19). On the other hand a large signal rate, generated by a mixture of the states H and S decaying into A 1 A 1 , requires g SA 1 A 1 to be as large as possible. Accordingly X d and κ should saturate corresponding upper bounds. If the 750 GeV signal is generated by a superposition of signals of two nearby physical states formed by the H −S r system, their mass splitting should not be too large, preferably of O(20 GeV). Then the matrix element M 2 S,23 given in (2.12) should be as small as possible. With κ already determined, this implies µ as small as possible, preferably close to the lower bound ∼ 100 GeV from the LEP lower bound on higgsino-like charginos. Then (2.16) requires that λ is relatively small. (Simultaneously, this avoids a strong push-down effect on the mass of the SM-like Higgs boson from H SM − S r mixing, which is induced by the matrix element M 2 S,13 given in (2.13).) Finally the condition (2.10) on M 2 A fixes tan β ≈ 15. The remaining NMSSM parameters in (2.5) are A λ and A κ . Both R-symmetry breaking parameters have an impact on the mass of the pseudo-Goldstone boson A 1 . We find that one can chose small values of A λ and A κ such that M A 1 assumes the desired value; due to radiative corrections to the scalar potential the precise value of A κ depends on the other R-symmetry breaking parameters Herewith all NMSSM parameters are nearly uniquely determined.

Viable scenarios with a light NMSSM pseudoscalar
As discussed in the introduction we will study scenarios with different values of the mass of a light pseudoscalar, denoted subsequently by M A 1 . Constraints on such a light NMSSM pseudoscalar with a mass below ∼ 1 GeV have been discussed previously in [52][53][54][55][56][57][58][59]. Strong constraints originate from the mediation of FCNCs. Assuming minimal flavour violation, flavour violating couplings of A 1 still originate from SUSY loops involving stops, sbottoms and charginos and depend on the corresponding masses and trilinear couplings like A t . These contribute notably to B-physics observables like B s → µ + µ − , ∆M d and ∆M s . We have implemented the computation of these and many more B-physics observables and some K-physics observables in the code NMSSMTools [50,51] following the update in [59] and checked that, for the scenario presented here, the constraints are satisfied due to the mostly singlet-like nature of A 1 and the relatively heavy SUSY spectrum.
For M A 1 near 210 MeV, additional strong constraints originate from rare flavour changing processes K ± → π ± e + e − . (In [53] it has been argued that the corresponding constraints exclude scenarios with M A 1 < ∼ 210 MeV, where the branching fraction of A 1 into e + e − is sizeable.) We have verified the assertion in [38] that, for suitable choices of soft SUSY breaking parameters, the coupling C A responsible for these processes (see [52]) can be arbitrarily small 1 . Light pseudoscalars have been searched for in radiative Υ(1S) decays by CLEO in [60]; these are also verified by NMSSMTools 4.9.0 and satisfied by the benchmark points given below.
Due to the mostly singlet-like nature of A 1 , its contributions to the muon anomalous magnetic moment are negligibly small. However, for tan β ∼ 15 and assuming relatively light slepton masses of 300 GeV, the scenarios below can reduce the discrepancy between the measured value and the Standard Model to an acceptable 2 σ level.
Further constraints stem from possible A 1 production in Z and H SM decays. The relevance of bounds on light pseudoscalars (or axion-like particles) from searches for Z → γγ at LEP (where a photon can correspond to a bunch of collimated photons) has been investigated in [58]. These bounds constrain the loop-induced coupling g ZAγ . This coupling is also constrained by the upper bound on BR(Z → ηγ) < 5.1 × 10 −5 [61]. We have checked that in our cases this coupling is about four orders of magnitude below the bounds derived from [58,61]. Searches for H SM → A 1 A 1 → 4γ have been undertaken by ATLAS using 4.9 fb −1 of integrated luminosity at 7 TeV c.m. energy in [62] for M A 1 < 400 MeV. One can assume that the corresponding upper bound on BR(H SM → A 1 A 1 ) × BR(A 1 → γγ) 2 < ∼ 6.6 × 10 −3 applies to our scenario as well, which leads to BR( should be respected. In our scenarios we require BR(H SM → A 1 A 1 ) < ∼ 5 × 10 −4 , hence these constraints are well satisfied. Notably this small branching fraction has no impact on the measured signal rates of H SM into the other Standard Model channels, which agree well with the Standard Model predictions.
Additional constraints depending on M A 1 will be discussed in the corresponding subsections below.

M A 1 near 210 MeV
For a light A 1 , too light for hadronic final states (M A 1 below 3M π ), the possible decays are into µ + µ − , e + e − and the loop induced decay into γγ. The couplings of A 1 to Standard Model fermions are obtained via mixing with A M SSM as discussed in eqs. (2.14) and (2.15) in the previous section, and lead to a reduced coupling of A 1 to leptons ∼ X d ∼ κ, see (2.19). These couplings determine also the partial width into γγ. For a sizeable branching fraction into γγ, the decay into µ + µ − must be kinematically forbidden. On the other hand, for M A 1 < 200 MeV the remaining decays into e + e − and γγ lead generically to a too small total width implying, for a boosted A 1 with an energy of about 375 GeV, a decay length larger than the size of the detectors (unless A 1 mixes strongly with π 0 as discussed in [38]). However, for M A 1 very close to 2m µ , the loop contribution of muons to the width Γ(A 1 → γγ) reaches a maximum. It is given by (neglecting all other contributions; see, e.g., [63]) . We find that, for M A 1 near or slightly above 210 MeV, the partial width Γ(A 1 → γγ) dominated by the muon contribution is large enough to dominate the width Γ(A 1 → e + e − ) leading to a BR(A 1 → γγ) ∼ 74%.
The total A 1 width depends then essentially on its reduced coupling to muons X d related to κ via (2.19). First we consider a scenario with a total width of ∼ 1.7 × 10 −13 GeV, leading to a decay length of A 1 for an energy of 375 GeV of about 2 m. Given that the distance of the EM calorimeter cells to the interaction point is larger than 1.3 m for the ATLAS and CMS detectors (depending on the angle η), one can estimate that somewhat more than 60% of all pseudoscalars decay before the EM calorimeter cells.
This scenario requires κ > ∼ 1.65, in which case κ runs into a Landau singularity at about 400 TeV where the NMSSM would require a UV completion (e.g. GMSB). Then a single Higgs state near 750 GeV is able to generate a visible signal. (The second Higgs state is heavier near 1 TeV and has a significantly smaller production cross section. A scenario where a single Higgs state near 750 GeV is responsible for the signal and another Higgs state is far below 750 GeV is not possible: Then the lighter state would generate a larger signal, which is excluded.) For a large enough production cross section of the state near 750 GeV from its coupling to b-quarks it must have a dominant H (MSSM-like) component. Still, for a large enough branching fraction into A 1 A 1 , the H − S mixing angle (2.12) in the heavy scalar Higgs sector must not be too small and, notably, the coupling g SA 1 A 1 in (2.17) must be large. Both of these conditions are satisfied for κ ∼ 1.65, which is required if a single state should generate a visible signal.
Suitable values of λ, tan β and µ for the desired masses and mixings are given by a benchmark point BP1 in Table 1. (Since the mass of the second heavy Higgs state is near 1 TeV and not near 750 GeV, these values deviate somewhat from the ones obtained in the previous section. Radiative corrections of O(κ 2 /4π 2 ∼ 0.07) can require corresponding readjustments of these values.) Since tan β is ∼ 10, the NMSSM-specific uplift of the Standard Model like Higgs mass at low tan β is not available. Then the Standard Model like Higgs mass of ∼ 125 GeV requires large radiative corrections as in the MSSM.
As stated above and discussed in [38], the squark masses and A t can be chosen such that flavour violating couplings of A 1 are suppressed. In order to generate simultaneously large enough radiative corrections to the Standard Model like Higgs mass, both parameters have to be relatively large in the multi-TeV range. Possible numerical values are also indicated in Table 1. The remaining NMSSM specific parameters A λ and A κ are chosen small, such that the BR(H SM → A 1 A 1 ) (depending somewhat on A λ ) is below 5 × 10 −4 , and M A 1 sufficiently close to 2m µ such that the total width of A 1 is large enough, i.e. that its decay length l at 375 GeV is small enough: For the BP1 in Table 1 Couplings  This signal rate remains to be multiplied by an acceptance Acc(γ) for the diphotons to simulate a single photon in the detector. This issue will be discussed for all scenarios in section 3.3; for the time being the signal rates appear with a factor Acc(γ) in Table 2.
If we assume a slightly smaller value of M A 1 ∼ 210.5 MeV, Γ tot (A 1 ) decreases to ∼ 1.65 × 10 −13 GeV leading to l ∼ 2.2 m, reducing the percentage of decays before the EM calorimeters to ∼ 60% and hence the signal rate by ∼ 10%.
Scenarios with smaller values of κ are also possible. Then, however, the reduced coupling X d of A 1 to leptons is smaller (see (2.19)), and the total A 1 width decreases. Hence the decay length increases, and a smaller fraction of A 1 's decay before 2 m. This loss can be compensated for if two states H 2 and H 3 with large production cross sections and branching fractions into A 1 A 1 contribute to the signal.
The benchmark point BP2 is of this type, where we take κ = 0.75, nearly (but not quite) small enough for the absence of a Landau singularity below the GUT scale. For M A 1 ∼ 211.1 MeV the total A 1 width is ∼ 6.5 × 10 −14 GeV, leading to l ∼ 5.5 m. We estimate that then only ∼ 30% of all A 1 decays take place before the EM calorimeter cells. On the other hand, two Higgs states H 2 and H 3 with masses near 730 GeV and 762 GeV contribute to the signal. Both are strong mixtures of the pure MSSM-like and singlet-like states. For H 2 , the production cross section is σ ggF +bbH (H 2 ) ∼ 69.4 fb, and BR(H 2 → A 1 A 1 ) ∼ 0.66. For H 3 , the production cross section is σ ggF +bbH (H 3 ) ∼ 54.3 fb, and BR(H 2 → A 1 A 1 ) ∼ 0.53. Together with a BR(A 1 → γγ) ∼ 0.73 we obtain a signal rate of ∼ 3.7 fb times Acc(γ), as shown in Table 2  For the points BP1 and BP2, the signal cross section takes into account losses from A 1 decays beyond 2 m according to a factor 1 − e −2/l 2 with l in m. For the points BP3 and BP4 the BR(A 1 → γγ) includes the BR(A 1 → 3π 0 ). Acc(γ) denotes the acceptance for the di-or multiphotons from two pseudoscalars to simulate two single photons in the detector as discussed in section 3.3. M H1 allows for a theoretical error of ∼ 3 GeV. For BP4, Γ tot (A 1 ) and l(A 1 ) are the unreliable parton level results (at NLO).

M A 1 at 510-550 MeV
The partial widths of a light pseudoscalar in this mass range can be estimated employing two complementary approaches. To begin with one can ask what one would obtain within the parton model, extrapolated into the nonperturbative domain of QCD. First, for a reduced coupling of A 1 to leptons X d ∼ 0.1 as considered below, the partial width of A 1 into muons can still be computed reliably and is Γ( The loop induced partial width of A 1 into γγ is ∼ 4 × 10 −15 GeV and hence negligibly small. At NLO QCD the partial width of A 1 into strange quarks is about 5 × 10 −10 GeV and the loop induced width into gluons of the same order as the width into µ + µ − . These widths can only be rough estimates, however. An alternative approach is to consider the case M A 1 ≈ 550 MeV, where one can expect that A 1 mixes with the η meson with a mass of 547.85 MeV. (The possible rôle of η for the decays of a light pseudoscalar has been indicated earlier in [54] without quantitative statements, however.) Mixing with the π 0 meson of a lighter A 1 with M A 1 ≈ 135 MeV has been considered in [38], where Partial Conservation of Axial Currents (PCAC) or the sigma model for light mesons is employed in order to determine the off-diagonal element of the A 1 -meson mass matrix; the same formalism will be used here for A 1 − η mixing for M A 1 ≈ 550 MeV.
First we discuss this latter case, where the results can be considered as more reliable. Only subsequently we turn to the case M A 1 ≈ 510 MeV, motivated by the alleviation of constraints from radiative Υ(1S) decays in this mass range, see below. There, however, estimates of partial widths of A 1 are more speculative.
For M A 1 ≈ 550 MeV, the relevant mass matrix of the A 1 − η system reads in the basis (A 1 , η) For a small mixing angle θ, the eigenstate A 1 contains a small η component: The dominant η decays are [61] BR Next we require that the η-induced decays into γγ or 3π 0 of the eigenstate A 1 dominate its width into µ + µ − , since we ignore the unreliable widths of A 1 into strange quarks or gluons in this subsection. (Since the latter decays can also generate γγ or 3π 0 final states, this assumption is conservative.) This leads to In order to estimate the mixing matrix element δm 2 A 1 η above we use, following [38], PCAC. There one introduces the SU(3) axial flavour currents J µ A i where i denote the SU(3) generators. Assuming that η is a pure octet, J µ A 8 satisfies with f π ∼ 93 MeV. At the quark level one has where J µ A 0 is the (anomalous) U(1) A current whose divergence involves the η meson. Using these relations, one can re-write the coupling of A 1 to strange quarks in the Lagrangian (proportional to the corresponding Yukawa coupling X d m s /v) where we have dropped the terms ∼ ∂ µ J µ A 0 ∼ η . From (3.11) one can read off Then the request (3.8) becomes, again for X d ∼ 0.1 and using (3.5), This estimate can be refined by including mixing with the η meson, the anomalous U(1) A current J µ A 0 and the loop-induced coupling of A 1 to F µν F µν , where F µν is the QCD field strength 2 . The additional contribution to δm 2 A 1 η leads to a replacement of the right hand side of (3.13) by ∼ 1 MeV.
Assuming such a small A 1 − η mass difference, the decay length of A 1 is below a mm, and its branching fractions are the ones of η given in (3.7) above. For M A 1 > M K , constraints from rare K decays are no longer relevant. However, since A 1 has couplings to b-quarks ∼ X d , constraints from the search for the radiative decays Υ(1S) → γ η by CLEO in [49] apply (and are more relevant than searches for Υ(1S) → γA 1 → γµ + µ − ).
The BR(Υ(1S) → γA 1 ) can be obtained from the Wilczek formula [65,66] where BR(Υ(1S) → µ + µ − ) ∼ 2.48% and F is a correction factor ∼ 0.5. The upper bound of CLEO [49] on BR(Υ(1S) → γη) is 1.0 × 10 −6 at the 90% CL level, or 1.3 × 10 −6 at the 95% CL level. Applying this bound to the BR(Υ(1S) → γA 1 ), (3.14) gives X d < ∼ 0.11 (3.15) as used above. From (2.19) one finds that κ must then also be quite small, leading to relatively small branching fractions of the heavy Higgs states H 2 and H 3 into A 1 A 1 . Hence both of these states should contribute to the signal. Next we consider the A 1 decays induced by its mixing with η where η decays as in (3.7). The decays into π 0 π + π − give diphotons plus muons, but due to the escaping neutrinos this final state will not allow to reconstruct the masses of the original resonances near 750 GeV. In addition to the A 1 → γγ mode, the A 1 → 3π 0 mode leads to photon jets. The compatibility of such photon jets with a single photon signature in the detectors has been discussed in detail in [35]. In particular, due to the enhanced probability for photon conversions into e + + e − in the inner parts of the detectors, such scenarios can be distinguished from single photons (or even diphotons) once more events are available. Adding both modes, about 72% of all A 1 decays lead to di-or multi-photons. The resulting signal cross section remains to be multiplied by the acceptamce Acc(γ) for the di-or multi-photons to fake a single photon discussed in section 3.3.
The parameters, masses, branching fractions, production and signal cross sections of a corresponding benchmark point BP3 are shown in Tables 1 and 2. If M A 1 differs by a few tens of MeV from the η mass it becomes more difficult to estimate its decays; its mixing angle with the on-shell η meson using PCAC as above becomes tiny. Its Yukawa couplings to Standard Model fermions are obtained through mixing with the (heavy) MSSM-like pseudoscalar A M SSM . At the parton level and for tan β ∼ 10 − 15, the relative couplings squared and hence the corresponding widths of A 1 are dominantly into ss (≈ 87%), into gg via top quark loops (≈ 5%), and into µ + µ − (≈ 8%). The hadronic or γγ decays of A 1 can then be considered as being mediated by the CP-odd isospin and color singlet interpolating composite fields ss and F µν F µν . Both are known components of the η wave function in Fock space, and the most reasonable assumption is that their hadronisation (decays into physical hadrons and γγ) proceeds again with branching fractions similar to the ones of η.
The partial width for the sum of these decays of A 1 is less clear, however. It is relevant, since it competes with the width of A 1 into µ + µ − and determines consequently the branching fraction BR(A 1 → hadrons or γγ) via the above interpolating fields relative to the BR(A 1 → µ + µ − ). Since the widths for the above mentioned decays of η into γγ or pions are small (being electromagnetic or suppressed by isospin), one must assume that the widths for the decays of A 1 via the above interpolating fields are also smaller than estimated from the couplings squared at the parton level as at the beginning of this section. A quantitative statement is difficult, however, without a nonperturbative evaluation of the relevant matrix elements between physical states.
On the other hand, the sum of the couplings squared of A 1 to ss or gg and hence the sum of the partial widths of A 1 into ss or gg (for both of which η-like branching fractions are assumed) is considerably larger than into µ + µ − : At NLO one has BR(A 1 → ss or gg) ∼ 0.92. Hence, reducing the sum of the partial widths of A 1 into ss or gg by a factor 1/10 leaves us still with a dominant BR(A 1 → ss or gg) ∼ 0.55.
In the scenario where M A 1 differs by a few tens of MeV from the η mass we will make the assumption that the reduction of the width of the decays A 1 → γγ or hadrons is not too dramatic, i.e. the relevant branching fractions of A 1 can be parametrized as where the factor F A is not too small ( > ∼ 0.1). Let us have another look at the searches by CLEO which were performed separately for the η → 3π 0 , η → π 0 π + π − and η → γγ final states. The windows for the invariant masses were chosen differently for different final states, M 3π 0 > ∼ 475 MeV and M π 0 π + π − > ∼ 515 MeV. (M γγ is fitted to a double Gaussian function centered at M η .) No candidates were found in the (background free) 3π 0 and π 0 π + π − final states, but two events in π 0 π + π − with M π 0 π + π − ∼ 510 MeV just below the M π 0 π + π − window. Also a mild excess of events for M γγ ∼ 510 MeV is observed. These events are not numerous enough to allow for the claim of a signal, but we conclude that the π 0 π + π − and γγ final states do not lead to stronger upper limits on BR(Υ(1S) → γA 1 ) for M A 1 ∼ 510 MeV than the limit on BR(Υ(1S) → γη) from the remaining 3π 0 final state. After translating the 90% CL upper limit from the latter final state into a 95% CL upper limit, we find from only the 3π 0 final state where 0.33 is the BR(η → 3π 0 ). For A 1 , assuming η-like decays, this upper bound becomes Since smaller F A alleviates the constraint on κ, we should study its impact on the signal rate. First, the branching fractions of the heavy Higgs states induced by the coupling g SA 1 A 1 in (2.17) behave roughly like g 2 SA 1 A 1 ∼ κ 4 < ∼ 0.002 F −2 A , i.e. smaller F A allows for larger branching fractions. On the other hand, by assumption (see (3.16)) the branching fractions of A 1 into γγ or 3π 0 are proportional to F A . Hence the factors of F A cancel approximatively in the final signal rate. (Depending on the other parameters we found, however, that smaller F A can lead to a decrease of the signal rate if the heavy Higgs branching fractions increase somewhat less than indicated above.) For a rough estimate we have constructed a benchmark point BP4 with M A 1 ∼ 510 MeV. It has X d 0.206 and satisfies the CLEO constraints for F A < 0.87. Its parameters are given in Table 1. For the branching fractions of A 1 into γγ or 3π 0 we assume F A ∼ 0.87 which gives, using the corresponding branching fractions of η, BR(A 1 → γγ or 3π 0 ) ∼ 0.87 × (0.39 + 0. 33) 0.63. Together with the production cross sections and branching fractions of the Higgs states H 2 and H 3 in Table 2 we obtain finally a signal cross section of ∼ 6.7 fb times Acc(γ).

The spread in rapidity of multiphotons
The probability for a di-or multiphoton system to fake a single photon depends to a large extent on its angular spread. In the context of a decay of the Standard Model Higgs boson into light pseudoscalars (which decay into diphotons) this has been discussed in some detail in [69]. One has to consider the fineness in rapidity η of the strips of the first layer of the electromagnetic calorimeter (EM), ranging from 0.003 to 0.006 (depending on η) for ATLAS [70]. A relevant quantity for an event is the fraction of the total deposited energy in a single strip, and which fraction is deposited in the adjacent strips [69]. In the case of diphotons, the relevant criterium is then the distribution of ∆η between the two photons. In [69] it is argued (and used in [34]) that only for ∆η < ∼ 0.0015 prompt diphotons fake a single photon. (Including converted photons does not seem to modify this estimate [69].) This number remains to be confirmed by the experimental collaborations, however, and will depend on η and the transverse energy of the individual events in practice.
In the case of displaced vertices, as for our benchmark points BP1 and BP2, the situation is more involved as discussed in [34]: diphotons from pseudoscalars decaying between the original vertex and the EM are more collimated. Note that the signal cross sections given in Table 2 (without the factor Acc(γ)) take already into account the loss from pseudoscalars decaying inside or beyond the EM. The interplay between these losses and ∆η for diphotons reaching the EM has been studied in [34], but is beyond the scope of the present paper. We content ourselves with the fact that Acc(γ) for our benchmark points BP1 and BP2 is then definitively larger than Acc(γ) for pseudoscalars of a corresponding mass of ∼ 200 MeV which decay promptly. To this end we studied the distribution of ∆η for diphotons from promptly decaying pseudoscalars with the help of a simulation based on MadGraph/MadEvent [71] including Pythia 6.4 [72], where the pseudoscalars originate from a 750 GeV Higgs state. The resulting distribution is shown in red in Fig. 1.
We find that 70% of these diphotons satisfy ∆η < ∼ 0.0015 (in rough agreement with [34]), and correspondingly more due to the displaced vertices for our benchmark points BP1 and BP2. A rough estimate for Acc(γ) for the diphotons from both pseudoscalars to fake a single photon would then be ≈ 75%. We underline, however, that the actual number of signal events responsible for the observed excesses is small, and statistical fluctuations of quantities like ∆η can be correspondingly large.
Turning to the scenarios BP3 and BP4, displaced vertices are no longer relevant. However, here we expect about equal fractions of both diphoton and 6-photon final states, the latter from A 1 → 3π 0 . For M A 1 ∼ 500 MeV, the spread in ∆η for the diphotons is obviously larger, as shown in Fig. 1 in green. Here only ∼ 25% of all diphotons satisfy ∆η < ∼ 0.0015.
For the 6-photon final state we note that the angular spread between the photons should be  smaller, as the total invariant mass of the system must remain the same and less energy is available for momenta transverse to the principal axis. Since the relevant quantity in the EM calorimeters is the spread of the deposited energy, we concentrate here on ∆η between the two most energetic photons resulting from a single A 1 → 3π 0 decay. It is shown in blue in Fig. 1, and we find that in ∼ 75% of all cases these are closer than ∆η ∼ 0.0015. This number is suggestive, but it is not clear whether it coincides with the fraction of 6-photon final states faking a single photon; due to the more homogenous distribution of the deposited energy this fraction could even be larger. If we use it as it is, the Acc(γ) for the di-or multiphotons from both pseudoscalars to fake simultaneously a single photon becomes Acc(γ) ≈ 25%. Moreover many of 6 photons will convert, and the acceptance of such events remains to be studied by the experimental collaborations. Again, statistical fluctuations can be large as long as the number of signal events is as low as at present.

Summary and Conclusions
We have shown that the excess of events in the diphoton final state near 750 GeV observed by ATLAS and CMS can be explained within a fairly standard supersymmetric extension of the Standard Model, the NMSSM, without invoking new particles like additional vector-like quarks and/or leptons. The signal cross sections are not very large, but may be sufficient to explain the observed excesses.
The corresponding processes differ, however, from what has been proposed in most of the literature up to now: Except for the scenario BP1 (with κ ∼ 1.65), two resonances nearby in mass which share the properties of the additional CP-even scalars of the NMSSM are responsable for the signal cross section. Their components proportional to the MSSM-like scalar H lead to enhanced couplings to b-quarks implying sizeable production cross sections via associated production with bquarks, whereas their components proportional to the singlet-like scalar S lead to sizeable branching fractions into two pseudoscalars A 1 A 1 . These scenarios are not in tension with the upper limit from CMS on the diphoton cross section obtained at the run I at 8 TeV [4]; note that the bbH cross sections increase somewhat faster with the c.m. energy than ggF .
We note that constraints from other decay modes of the scalars with masses of about 750 GeV are satisfied: Upper bounds on signal cross sections into other final states -quark pairs, lepton pairs and electroweak gauge bosons -are discussed in [5,67,68]. These bounds are obeyed given the relatively large branching fractions into A 1 A 1 and finally into di-or multi-photon final states of the heavy scalars in our scenario, which do not require excessive production cross sections.
Four different scenarios have been discussed, which differ in the properties and masses of the light pseudoscalars A 1 and the heavy Higgs states: 1) For the benchmark points BP1 and BP2, the mass M A 1 ∼ 211 MeV is just below twice the muon mass. Then the branching fraction of A 1 into diphotons is large enough for a satisfactory signal rate. In the case of the BP1 with κ ∼ 1.65, a single heavy Higgs state (still a mixture of the MSSM-like and singlet-like states) with a width of ∼ 7 GeV is sufficient for a signal. In the case of the BP2 with a more modest value for κ ∼ 0.75, two nearby heavy Higgs states, both with a width of ∼ 6 GeV, are responsible for the signal. For both BP1 and BP2, the dominant constraints from low energy experiments originate from K decays involving loop-induced flavour changing vertices of A 1 ; it must be assumed that these are cancelled by suitable choices of the SUSY breaking parameters.
2) For the benchmark points BP3 and BP4 it is assumed that A 1 shares its branching fractions with the η meson. In the case of BP3 with M A 1 ∼ 549 MeV this is guaranteed by A 1 − η mixing, estimated with the help of the PCAC formalism. In the case of BP4 with M A 1 ∼ 510 MeV, estimates of the A 1 partial widths are on less solid ground. We assumed that the non-leptonic decays of A 1 proceed via ss or gg ∼ F µν F µν interpolating fields which, in turn, hadronise (decay) again similar to the η meson. We showed, however, that sizeable reductions of the corresponding partial widths with respect to the decays of A 1 into ss or gg by, e.g., a factor 1/10, would not invalidate this scenario. For both BP3 and BP4 two nearby heavy Higgs states with widths of ∼ 1.5 − 2 GeV are responsible for the signal. For this range of M A 1 the dominant constraints from low energy experiments originate from searches for radiative Υ decays into γ + η by CLEO. These lead to upper bounds on the coupling of A 1 to down-type quarks and leptons and, as we have shown, on κ.
Interestingly, the four scenarios have different features which allow to distinguish them experimentally also from more "conventional" models: 1) For BP2, BP3 and BP4 the signal originates from two resonances H 2 and H 3 close in mass, which can imitate a single wide resonance. Of course, small variations of the parameters allow to vary the masses of H 2 and H 3 , the total signal rate, and to reshuffle the individual signal rates of H 2 and H 3 . With more events (and depending on the actual mass difference) the two states could possibly be resolved. A particular feature of BP1 is that the single resonance near 750 GeV responsible for the signal has another large branching fraction of ∼ 25% into Z + A 1 . With A 1 imitating a photon, one obtains signals of the kind Z + γ similar to the ones expected if a 750 GeV resonance decays into γγ via fermionic loops.
2) The A 1 decays differ considerably for the benchmark points. For BP1 and BP2 the decay lengths of A 1 are macroscopic leading to measurable displaced vertices if A 1 decays inside the calorimeters. BP2 corresponds to an extreme case with a decay length of ∼ 5.5 m, but a very large signal cross section (before reducing it by the number of decays before the EM calorimeters). For both BP1 and BP2, A 1 decays into diphotons. However, due to the displaced vertices it will not be straightforward to distinguish them from single photons via the number of converted photons [35]. Moreover, A 1 has branching fractions of ∼ 25 − 30% into e + + e − leading to similar signatures as converted photons. Also the opening angle between the two photons gets reduced for displaced vertices increasing their acceptance as a single photon, see section 3.3. For BP3 and BP4 the A 1 decay lengths are short, but A 1 decays into diphotons or photon jets from 3π 0 . The latter should lead to a very large proportion to "converted photons"; additional potentially relevant observables like EM shower shapes have also been discussed in [35]. Our results for M A 1 ∼ 500 MeV for ∆η between diphotons or the two leading photons from jets from 3π 0 may be useful here. Finally, for BP3 and BP4 A 1 has branching fractions into muon pairs (of 6 − 8% at the parton level) which could be used for alternative signals, once more events are obtained.
Hence, if the excess of events continues, several observables can be used to verify/test/exclude the scenarios discussed here.
Finally we recall that the origin of the light pseudoscalar A 1 in the NMSSM is an approximate R-symmetry of the scalar potential, see the small values of A κ and A λ of the benchmark points. For BP1, BP2 and BP3, M A 1 has to coincide accidentially with specific values 2 m µ or m η . For the BP4 M A 1 is actually less constrained (unless one intends to fit the events near M A 1 ∼ 510 MeV observed by CLEO as we did), but in this scenario the A 1 decays are less understood theoretically. Although the approximate R-symmetry at the weak (or SUSY) scale is not preserved by radiative corrections, we content ourselves in the present paper with the mere fact that such a scenario would allow to explain the events. Work on an R-symmetric extension of the NMSSM explaining a light pseudo-Goldstone boson naturally is in progress.