1 Introduction

A detailed analysis of Higgs boson (H) decays into \(\tau \)-lepton pairs observed at the LHC [1,2,3] allows a direct probe of the charge conjugation and parity (\(\textit{CP}\)) properties of the Yukawa coupling of the Higgs boson to the \(\tau \)-lepton. The Standard Model (SM) of particle physics predicts the Higgs boson to be a \(\textit{CP}\)-even (scalar) particle. The presence of a \(\textit{CP}\)-odd (pseudoscalar) admixture has not yet been excluded, and any observed \(\textit{CP}\)-odd contribution to the \(H\tau \tau \) coupling properties would be a sign of physics beyond the SM.

Studies of the \(\textit{CP}\) properties of Higgs boson interactions with gauge bosons performed by the ATLAS and CMS experiments [4,5,6,7,8,9] have shown no deviation from the SM predictions. However, these measurements probe the bosonic couplings in which \(\textit{CP}\)-odd contributions enter only via higher-order operators that are suppressed by powers of 1/\(\Lambda ^2\) [10], where \(\Lambda \) is the scale of the new physics in an effective field theory. In contrast, a \(\textit{CP}\)-odd contribution to Yukawa couplings can be present at tree level. Recently, measurements of the \(\textit{CP}\) properties of the interaction between the Higgs boson and top quarks were performed by the ATLAS [11] and CMS [12] Collaborations, and excluded a pure \(\textit{CP}\)-odd structure for the top-quark Yukawa coupling at 3.9\(\sigma \) and 3.2\(\sigma \), respectively.

This paper presents a measurement of the \(\textit{CP}\) properties of the Higgs boson interaction with \(\tau \)-leptons. The measurement is based on \(\textit{CP}\)-sensitive angular observables defined using the visible \(\tau \)-lepton decay products. Ideas about how to probe a \(\textit{CP}\)-odd and \(\textit{CP}\)-even admixture in the \(\tau \)-lepton Yukawa coupling in \(H \rightarrow \tau \tau \)  decay were initially developed in the context of \(e^+e^-\) colliders [13,14,15,16,17]. Originally, hadronic decays of the \(\tau \)-leptons into \(\pi ^{\pm } \nu \) and \(\rho ^{\pm } \nu \) were used, and observables sensitive to the transverse spin correlations between the \(\tau \)-lepton decay products were constructed. These methods, extended to \(\ell ^{\pm } (=e^{\pm },\mu ^{\pm })\nu \nu \) and \(a_1^{\pm } \nu \) decays and re-evaluated in the context of pp collisions at the LHC [18,19,20,21,22,23], are adopted in this analysis. Recently, a similar study was also performed by the CMS Collaboration [24].

The general effective Yukawa interaction between the Higgs boson and \(\tau \)-leptons can be parameterised as in Refs. [21, 23]:

$$\begin{aligned} {\mathcal {L}}_{H\tau \tau } = - \frac{m_{\tau }}{\upsilon } \kappa _{\tau } (\cos \phi _{\tau }\bar{\tau }\tau + \sin \phi _{\tau }\bar{\tau }i\gamma _{5}\tau )H, \end{aligned}$$

where \(\upsilon = 246\) GeV is the vacuum expectation value of the Higgs field, \(\kappa _{\tau }\) is the reduced Yukawa coupling strength, and \(\phi _{\tau }\) (where \(\phi _{\tau } \in [-90^{\circ }, 90^{\circ }]\)) is the \(\textit{CP}\)-mixing angle that parameterises the relative contributions of the \(\textit{CP}\)-even and \(\textit{CP}\)-odd components to the \(H\tau \tau \) coupling. The SM \(\textit{CP}\)-even hypothesis is realised for \(\phi _{\tau } = 0^{\circ }\), while the pure \(\textit{CP}\)-odd scenario corresponds to \(\phi _{\tau } = \pm 90^{\circ }\). Other values of \(\phi _{\tau }\) represent admixtures of the two components and would indicate a \(\textit{CP}\)-violating scenario.

Fig. 1 Illustration of the \(\tau \)-lepton decay planes for constructing the \(\varphi ^{*}_{ CP }\) observable in (a) \(H\rightarrow \tau ^{+}\tau ^{-}\rightarrow \pi ^{+}\pi ^{-}+2\nu \) decay using the impact parameter method, (b) \(H\rightarrow \tau ^{+}\tau ^{-}\rightarrow \pi ^{+}\pi ^{0}\nu \pi ^{-}\pi ^{0}\nu \) using the \(\rho \)-decay plane method, and (c) \(H\rightarrow \tau ^{+}\tau ^{-}\rightarrow \pi ^{+}\pi ^{0}\nu \pi ^{-}\nu \) using the combined impact parameter and \(\rho \)-decay plane method. The decay planes are spanned by the spatial momentum vector of the charged decay particle of the \(\tau \)-lepton (\(\pi ^{\pm }\)) and either its impact parameter \({\textbf{n}}^{*\pm }\) or the spatial momentum vector of the neutral decay particle of the \(\tau \)-lepton (\(\pi ^{0}\))

The \(\textit{CP}\)-mixing angle \(\phi _{\tau }\) is encoded in the correlations between the transverse spin components of the \(\tau \)-leptons in the \(H \rightarrow \tau \tau \) decays, which are then reflected in the directions of the \(\tau \)-lepton decay products. The signed acoplanarity angle \(\varphi ^{*}_{ CP }\) between the \(\tau \) decay planes (described in Sect. 3 and illustrated in Fig. 1) is sensitive to the transverse spin correlations impacted by the \(\textit{CP}\)-mixing angle of the Yukawa coupling. Such correlations are usually calculated by contracting the polarimeter vectors of the decayed \(\tau \)-leptons and the spin density matrix of the \(\tau \)-lepton-pair spin state, \(R_{i,j}\), which depends on the \(\tau \)-lepton pair-production process [26,27,28]. In the case of Higgs boson decays, the spin density matrix \(R_{i,j}\) has only transverse components with respect to the \(\tau \)-lepton direction, and these are first-order trigonometric polynomials in the \(2 \phi _{\tau }\) angle. Per-event sensitivity to \(\textit{CP}\)-mixing depends on the \(\tau \)-lepton-pair decay modes and on the way in which the polarimeter vectors and decay planes are reconstructed from observable quantities. The \(\varphi ^{*}_{ CP }\) angle is directly related to \(\phi _{\tau }\) in the \(H \rightarrow \tau \tau \) differential decay rate and the relation has the form of a first-order trigonometric polynomial in \(\cos (\varphi ^{*}_{ CP }\,\, - 2 \phi _{\tau })\) at leading order [14, 21, 22]:

$$\begin{aligned} d\mathrm {\Gamma }_{H\rightarrow \tau ^{+} \tau ^{-}\,\,} \approx 1 - b(E_{+})b(E_{-})\frac{\pi ^2}{16}\cos (\varphi ^{*}_{ CP }\,\,-2\phi _{\tau }), \end{aligned}$$

where \(E_{\pm }\) are the energies of the charged decay particles in their respective \(\tau \)-lepton rest frames, and \(b(E_{\pm })\) are the spectral functions [29] describing the spin analysing power of a given decay mode. Different methods [15,16,17,18,19,20,21,22,23] have been developed in an attempt to approximately reconstruct \(\tau \)-lepton decay planes. The \(\varphi ^{*}_{ CP }\)  variable used in this analysis is constructed with various methods depending on the \(\tau \)-lepton decay modes, largely following the strategy presented in Ref. [23].

The analysis is performed using 139 fb\(^{-1}\) of \(\sqrt{s} = 13\) TeV proton–proton (pp) collision data recorded from 2015 to 2018 with the ATLAS detector. Two \(\tau \)-lepton-pair decay channels are considered in the analysis: the first with one leptonically (\(\tau _{\text {lep}}\)  ) and one hadronically decaying \(\tau \)-lepton (\(\tau _{\text {had}}\)  ), referred to as the \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)  channel, and the second with two hadronically decaying \(\tau \)-leptons, referred to as the \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  channel. The leptonic decay \(\tau ^{\pm }\rightarrow \ell ^{\pm }\nu \nu \) includes decays to either an electron or a muon. In the case of hadronic decay, the dominant \(\tau _{\text {had}}\)  decay modes are considered: single-pion decay \(\pi ^{\pm }\nu \), two-pion decay \(\pi ^{\pm }\pi ^{0}\nu \) with an intermediate \(\rho ^{\pm }\), and three-pion decay \(\pi ^{\pm } 2\pi ^{0}\nu \) and \(3 \pi ^{\pm }\nu \) with an intermediate \(a_1^{\pm }\). A small fraction of events with \(\tau \) decays to \(K^{\pm }\) mesons is also included in the analysis. The \(\tau \)-lepton decay modes used in the analysis are summarised in Table 1 with their branching fractions [30] and the notation used throughout this paper. The \(\tau _{\text {had}}\)  decay modes are labelled by YpXn in accord with the number of charged (Y) and neutral (X) pions among the decay products. The \(\tau \)-lepton-pair decay modes considered in this analysis account for 68% of all possible \(\tau \) pair decays.

Table 1 Notation for the dominant leptonic and hadronic \(\tau \)-lepton decay modes used and their branching fractions. The symbol ‘\(\ell ^{\pm }\)’ stands for \(e^{\pm }\) or \(\mu ^{\pm }\), and ‘\(h^{\pm }\)’ includes \(\pi ^{\pm }\) and \(K^{\pm }\). The parentheses show the hadronic decays involving \(\pi ^{\pm }\) and their corresponding branching fractions

This paper is structured as follows. In Sect. 2 the ATLAS detector is briefly described. The methodology and observables used in the analysis are discussed in Sect. 3. Section 4 gives a summary of the data and simulated event samples. Section 5 describes the object reconstruction and event selection, and defines the signal and control regions. Section 6 details the experimental and theoretical systematic uncertainties. The fit model and statistical analysis strategy are explained in Sect. 7. Section 8 presents the measurement results. Section 9 concludes the paper.

2 ATLAS detector

The ATLAS detector [31] at the LHC covers nearly the entire solid angle around the collision point. It consists of an inner tracking detector surrounded by a thin superconducting solenoid, electromagnetic and hadron calorimeters, and a muon spectrometer incorporating three large superconducting air-core toroidal magnets.

The inner-detector system (ID) is immersed in a 2 T axial magnetic field and provides charged-particle tracking in the range \(|\eta | < 2.5\). The high-granularity silicon pixel detector covers the vertex region and typically provides four measurements per track, the first hit normally being in the insertable B-layer [32, 33] installed before Run 2. It is followed by the silicon microstrip tracker, which usually provides eight measurements per track. These silicon detectors are complemented by the transition radiation tracker (TRT), which enables radially extended track reconstruction up to \(|\eta | = 2.0\). The TRT also provides electron identification information based on the fraction of hits (typically 30 in total) above a higher energy-deposit threshold corresponding to transition radiation.

The calorimeter system covers the pseudorapidity range \(|\eta | < 4.9\). Within the region \(|\eta |< 3.2\), electromagnetic calorimetry is provided by barrel and endcap high-granularity lead/liquid-argon (LAr) calorimeters, with an additional thin LAr presampler covering \(|\eta | < 1.8\) to correct for energy loss in material upstream of the calorimeters. Hadron calorimetry is provided by the steel/scintillator-tile calorimeter, segmented into three barrel structures within \(|\eta | < 1.7\), and two copper/LAr hadron endcap calorimeters. The solid angle coverage is completed with forward copper/LAr and tungsten/LAr calorimeter modules optimised for electromagnetic and hadronic energy measurements respectively.

The muon spectrometer comprises separate trigger and high-precision tracking chambers measuring the deflection of muons in a magnetic field generated by the superconducting air-core toroidal magnets. The field integral of the toroids ranges between 2.0 and 6.0 T m across most of the detector. Three sets of precision chambers cover the region \(|\eta | < 2.7\) with multiple layers of monitored drift tubes, complemented by cathode-strip chambers in the forward region, where the background is highest. The muon trigger system covers the range \(|\eta | < 2.4\) with resistive-plate chambers in the barrel and thin-gap chambers in the endcap regions.

Interesting events are selected by the first-level trigger system implemented in custom hardware, followed by selections made by algorithms implemented in software in the high-level trigger [34]. The first-level trigger accepts events from the 40 MHz bunch crossings at a rate below 100 kHz, which the high-level trigger reduces in order to record events to disk at about 1 kHz.

An extensive software suite [35] is used in data simulation, in the reconstruction and analysis of real and simulated data, in detector operations, and in the trigger and data acquisition systems of the experiment.

3 Analysis strategy

A \(\textit{CP}\)-sensitive observable \(\varphi ^{*}_{ CP }\)  is built with different methods depending on the \(\tau \)-lepton decay modes. In general, \(\varphi ^{*}_{ CP }\)  is the signed acoplanarity angle between the \(\tau \)-lepton decay planes. Each \(\tau \) decay plane is constructed from the spatial momentum vector of a charged decay particle and either its impact parameter (impact parameter method) or the spatial momentum vectors of other visible \(\tau \)-lepton decay particles (\(\rho \)-decay plane and \(a_1\) methods). All vectors are boosted to the zero-momentum frame (ZMF) of the visible \(\tau \)-lepton-pair decay particles. Figure 1a–c illustrate the methods used to construct the \(\varphi ^{*}_{ CP }\)  observable in \(H\rightarrow \tau ^{+}\tau ^{-}\rightarrow \pi ^{+}\pi ^{-}+2\nu \), \(H\rightarrow \tau ^{+}\tau ^{-}\rightarrow \pi ^{+}\pi ^{0}\nu \pi ^{-}\pi ^{0}\nu \) and \(H\rightarrow \tau ^{+}\tau ^{-}\rightarrow \pi ^{+}\pi ^{0}\nu \pi ^{-}\nu \) decays, respectively. The visible \(\tau \)-lepton-pair ZMF (indicated by \(^{*}\)) is used to approximate the Higgs boson rest frame, which is not accessible due to the presence of undetected neutrinos in the \(\tau \)-lepton decays.
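The boost into the visible di-\(\tau \) ZMF described above can be sketched in a few lines. This is a minimal illustration, not the analysis code: the function names `boost_to_rest_frame` and `four_vector`, the tuple layout (E, px, py, pz), natural units and the example momenta are all assumptions made here.

```python
import math

def boost_to_rest_frame(p, system):
    """Boost four-vector p = (E, px, py, pz) into the rest frame of `system`."""
    e_s, px_s, py_s, pz_s = system
    # Velocity of the system in the laboratory frame (natural units, c = 1).
    beta = (px_s / e_s, py_s / e_s, pz_s / e_s)
    beta2 = sum(b * b for b in beta)
    if beta2 == 0.0:
        return tuple(p)  # the system is already at rest
    gamma = 1.0 / math.sqrt(1.0 - beta2)
    e, px, py, pz = p
    beta_dot_p = beta[0] * px + beta[1] * py + beta[2] * pz
    # Standard Lorentz boost along the direction of beta.
    coeff = (gamma - 1.0) * beta_dot_p / beta2 - gamma * e
    return (gamma * (e - beta_dot_p),
            px + coeff * beta[0],
            py + coeff * beta[1],
            pz + coeff * beta[2])

def four_vector(mass, px, py, pz):
    """Build (E, px, py, pz) from a mass and a three-momentum."""
    return (math.sqrt(mass**2 + px**2 + py**2 + pz**2), px, py, pz)

M_PI = 0.13957  # charged-pion mass in GeV
# Hypothetical laboratory-frame momenta (GeV) of the two charged pions.
pi_plus = four_vector(M_PI, 30.0, 10.0, 50.0)
pi_minus = four_vector(M_PI, -20.0, 5.0, 40.0)
visible = tuple(a + b for a, b in zip(pi_plus, pi_minus))

pi_plus_zmf = boost_to_rest_frame(pi_plus, visible)
pi_minus_zmf = boost_to_rest_frame(pi_minus, visible)
```

In the ZMF the two charged-particle momenta are back to back by construction, which is what makes a common reference axis for the two decay planes available.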

Figure 2 shows the normalised distribution of \(\varphi ^{*}_{ CP }\) for simulated \(H\rightarrow \tau ^{+}\tau ^{-}\rightarrow \pi ^{+}\pi ^{-}+2\nu \) events at the generator level. The distribution peaks at \(\varphi ^{*}_{ CP }\) = 180\(^{\circ }\) for a \(\textit{CP}\)-even (e.g. SM) Higgs boson, whereas for the case of a pure \(\textit{CP}\)-odd Higgs boson, the distribution peaks at \(\varphi ^{*}_{ CP }\) = 0\(^{\circ }\) and 360\(^{\circ }\). The phase difference between the \(\varphi ^{*}_{ CP }\) distributions for two different mixing scenarios is twice their \(\phi _{\tau }\) difference.
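The peak positions quoted above follow directly from the leading-order form \(1 - A\cos (\varphi ^{*}_{ CP } - 2\phi _{\tau })\) of the differential decay rate given in the Introduction. The sketch below evaluates that shape; the function names and the default amplitude (the factor \(\pi ^2/16\) with both spectral functions set to 1) are assumptions chosen only for illustration.

```python
import math

def phi_cp_shape(phi_cp_deg, phi_tau_deg, amplitude=math.pi**2 / 16):
    """Leading-order shape 1 - A*cos(phi_cp - 2*phi_tau), angles in degrees."""
    return 1.0 - amplitude * math.cos(math.radians(phi_cp_deg - 2.0 * phi_tau_deg))

def peak_position(phi_tau_deg):
    """Angle in [0, 360) at which the shape is maximal, assuming A > 0."""
    return (180.0 + 2.0 * phi_tau_deg) % 360.0

def scan_peak(phi_tau_deg):
    """Brute-force cross-check of peak_position on a 1-degree grid."""
    return max(range(360), key=lambda phi: phi_cp_shape(phi, phi_tau_deg))

for phi_tau in (0.0, 45.0, 90.0):
    print(phi_tau, peak_position(phi_tau), scan_peak(phi_tau))
```

A change of \(\phi _{\tau }\) by 45\(^{\circ }\) moves the peak by 90\(^{\circ }\), which is the factor of two in the phase stated above.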

The \(\tau \)-lepton-pair decay combinations used in the analysis and the respective methods for constructing the \(\varphi ^{*}_{ CP }\)  observable are summarised in Table 2. The corresponding fraction of events relative to the total from all possible di-\(\tau \) decay combinations is calculated from the single-\(\tau \)-lepton decay mode branching fractions in Table 1. Other decay combinations are not considered in this analysis because their respective \(\varphi ^{*}_{ CP }\)  observables perform relatively poorly in discriminating between different \(\textit{CP}\) scenarios.

Fig. 2 Normalised \(\varphi ^{*}_{ CP }\) distributions in simulated \(H\rightarrow \tau ^{+}\tau ^{-}\rightarrow \pi ^{+}\pi ^{-}+2\nu \) events at the generator level for different \(\textit{CP}\) hypotheses. The predictions for a pure \(\textit{CP}\)-even SM Higgs boson (scalar, red circle), a pure \(\textit{CP}\)-odd hypothesis (pseudoscalar, green square), and \(\textit{CP}\)-mix hypothesis (\(\phi _{\tau }\) = 45\(^{\circ }\), blue triangle) are shown. The transverse momentum of the simulated \(\tau \) leptons is required to be larger than 30 GeV (20 GeV) for the leading (sub-leading) \(\tau \) lepton during the event generation

Table 2 Decay mode combinations of the \(\tau \)-lepton pair and the corresponding methods to construct the \(\varphi ^{*}_{ CP }\)  observable used in this analysis. The fraction of events for each decay mode combination relative to the total from all di-\(\tau \) decay combinations (last column) is calculated using the \(\tau \)-lepton decay mode branching fractions in Table 1

3.1 Impact parameter (IP) method

The IP method is applied to \(\tau \)-lepton decays with only one charged particle in the final state, specifically the direct hadronic decay \(\tau ^{\pm }\rightarrow \pi ^{\pm }\nu \) or leptonic decays \(\tau ^{\pm }\rightarrow \ell ^{\pm }\nu \nu \). This refers to the 1p0n–1p0n and \(\ell \)–1p0n decay mode combinations. In this case, the \(\tau \)-lepton decay plane is formed from the spatial momentum vector \(\mathbf {q^{\pm }}\) of the charged particle (\(\pi ^{\pm }\), \(\ell ^{\pm }\)) and the three-dimensional (3D) impact parameter vector \(\mathbf {n^{\pm }}\) of the charged particle, defined as the directional distance of closest approach of the charged particle’s track to the reconstructed primary vertex (PV) of the event. The four-vectors of the track momentum \(q^{\pm }_{\mu }\) and the impact parameter \(n^{\pm }_{\mu }\) = (0, \(\mathbf {n^{\pm }}\)), initially defined and measured in the laboratory frame, are boosted to the rest frame of the two charged decay particles (the visible di-\(\tau \) ZMF, denoted by \(^{*}\)). The boosted and normalised impact parameter vector \({\hat{\textbf{n}}^{*\pm }}\) is then decomposed into components which are parallel and transverse (\({\hat{\textbf{n}}_{\perp }^{*\pm }}\)) to the direction of the associated normalised spatial momentum vector \({\hat{\textbf{q}}^{*\pm }}\). Using these vectors, an angle \(\varphi ^{*}\) and a CP-odd triple correlation \({\mathcal {O}}^{*}_{CP}\) are defined as

$$\begin{aligned} \varphi ^{*} = \arccos ({\hat{\textbf{n}}}^{*+}_{\perp } \cdot {\hat{\textbf{n}}}^{*-}_{\perp }) \quad \text {and} \quad {\mathcal {O}}^{*}_{CP} = {\hat{\textbf{q}}}^{*-} \cdot ({\hat{\textbf{n}}}^{*+}_{\perp } \times {\hat{\textbf{n}}}^{*-}_{\perp }), \end{aligned}$$

and both are incorporated in a single observable \(\varphi ^{*}_{ CP }\)   (\(0 \le \varphi ^{*}_{ CP }\,\,\le 360^{\circ }\)) defined by

$$\begin{aligned} \varphi ^{*}_{ CP }\,\,= {\left\{ \begin{array}{ll} \hfil \varphi ^{*} &{} \quad \text {if } {\mathcal {O}}^{*}_{CP} \ge 0 \\ 360^{\circ } - \varphi ^{*} &{} \quad \text {if } {\mathcal {O}}^{*}_{CP} < 0. \end{array}\right. } \end{aligned}$$
(1)

In the case of leptonic decay, due to a different sign in the spectral function for the leptonic \(\tau \) decays [20, 29], an additional shift by \(180^{\circ }\) is applied to synchronise the phase in \(\varphi ^{*}_{ CP }\)  with the other decays.
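The IP-method definition in Eq. (1) can be written out as a short sketch. It assumes that the momentum and impact parameter vectors have already been boosted to the visible di-\(\tau \) ZMF and that the transverse components are normalised to unit length. The helper names and the wrapping of the leptonic \(180^{\circ }\) shift back into the 0 to 360\(^{\circ }\) range are choices made for this illustration, not taken from the analysis software.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    norm = math.sqrt(dot(a, a))
    return tuple(x / norm for x in a)

def transverse_unit(v, axis):
    """Unit vector along the component of v perpendicular to axis."""
    axis_hat = unit(axis)
    parallel = dot(v, axis_hat)
    return unit(tuple(x - parallel * u for x, u in zip(v, axis_hat)))

def phi_star_cp_ip(q_plus, q_minus, n_plus, n_minus, leptonic_shift=False):
    """Signed acoplanarity angle (degrees) from ZMF three-vectors."""
    n_perp_plus = transverse_unit(n_plus, q_plus)
    n_perp_minus = transverse_unit(n_minus, q_minus)
    # Clamp to protect acos against rounding slightly outside [-1, 1].
    cos_phi = max(-1.0, min(1.0, dot(n_perp_plus, n_perp_minus)))
    phi_star = math.degrees(math.acos(cos_phi))
    # CP-odd triple correlation decides which half of [0, 360] is used.
    o_cp = dot(unit(q_minus), cross(n_perp_plus, n_perp_minus))
    phi_cp = phi_star if o_cp >= 0 else 360.0 - phi_star
    if leptonic_shift:
        # Extra 180-degree shift for a leptonic decay (wrap-around assumed).
        phi_cp = (phi_cp + 180.0) % 360.0
    return phi_cp

# Hypothetical ZMF vectors: back-to-back momenta along z.
print(phi_star_cp_ip((0, 0, 1), (0, 0, -1), (1, 0, 0.3), (0, 1, -0.2)))
```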

3.2 \(\rho \)-decay plane (\(\rho \)) method

The \(\rho \) method is applied to construct \(\varphi ^{*}_{ CP }\)  in events with 1p1n–1p1n or 1p1n–1pXn decay mode combinations. In the case of consecutive decays \(\tau ^{\pm }\rightarrow \rho ^{\pm } \nu \), \(\rho ^{\pm }\rightarrow \pi ^{\pm }\pi ^{0}\), the \(\tau \)-lepton decay plane can be formed from the spatial momentum vectors of the charged pion (\({\textbf{q}}^{\pm }\)) and neutral pion (\({\textbf{q}}^{0\pm }\)). The four-momentum vectors of the \(\pi ^{\pm }\) and \(\pi ^{0}\) are boosted to the rest frame of the \(\rho \)-meson pair (the visible \(\tau \)-lepton-pair ZMF). The angle \(\varphi ^{*}\) and triple correlation \({\mathcal {O}}^{*}_{CP}\) are then defined in the same way as in the IP method using the unit spatial vectors, but replacing the impact parameter component with the neutral-pion vector,

$$\begin{aligned} \varphi ^{*} = \arccos ({\hat{\textbf{q}}}^{*0+}_{\perp } \cdot {\hat{\textbf{q}}}^{*0-}_{\perp }) \quad \text {and} \quad {\mathcal {O}}^{*}_{CP} = {\hat{\textbf{q}}}^{*-} \cdot ({\hat{\textbf{q}}}^{*0+}_{\perp } \times {\hat{\textbf{q}}}^{*0-}_{\perp }), \end{aligned}$$

where \({\hat{\textbf{q}}}^{*0+}_{\perp }\) and \({\hat{\textbf{q}}}^{*0-}_{\perp }\) are the normalised vectors transverse to the direction of the associated charged pion for each neutral pion. A signed observable \(\varphi ^{*\prime }\) is defined similarly to Eq. (1),

$$\begin{aligned} \varphi ^{*\prime } = {\left\{ \begin{array}{ll} \hfil \varphi ^{*} &{}\quad \text {if } {\mathcal {O}}^{*}_{CP} \ge 0 \\ 360^{\circ } - \varphi ^{*} &{}\quad \text {if } {\mathcal {O}}^{*}_{CP} < 0. \end{array}\right. } \end{aligned}$$

An additional requirement that depends on the sign of the product of \(\tau \)-lepton spin-analysing functions \( y_{\pm }^{\rho } = (E_{\pi ^{\pm }} - E_{\pi ^{0}})/(E_{\pi ^{\pm }} + E_{\pi ^{0}}), \) where \(E_{\pi ^{\pm ,0}}\) is the pion energy in the laboratory frame, is needed to define the observable \(\varphi ^{*}_{ CP }\)  sensitive to the \(\textit{CP}\)-mixing angle as

$$\begin{aligned} \varphi ^{*}_{ CP }\,\,= {\left\{ \begin{array}{ll} \hfil \varphi ^{*\prime } &{}\quad \text {if } y_{+}^{\rho }y_{-}^{\rho } \ge 0 \\ \varphi ^{*\prime } + 180^{\circ } &{}\quad \text {if } y_{+}^{\rho }y_{-}^{\rho } < 0. \end{array}\right. } \end{aligned}$$
(2)

In the case of 1pXn decays (e.g. \(\tau ^{\pm }\rightarrow a_{1}^{\pm } \nu \rightarrow \pi ^{\pm }2\pi ^{0}\nu \)), the sum of the four-momenta of all neutral pions is taken as the neutral component in the \(\rho \) method.
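As an illustration of Eq. (2), the sketch below computes \(y^{\rho }\) from laboratory-frame pion energies and applies the conditional \(180^{\circ }\) shift. The function names are hypothetical. Two further choices are assumptions of this sketch: the shifted angle is wrapped back into the 0 to 360\(^{\circ }\) range, and in the 1pXn case the energy of the summed neutral-pion four-momentum is obtained by adding the individual \(\pi ^{0}\) energies.

```python
def y_rho(e_charged, e_neutrals):
    """Asymmetry (E_pi+- - E_pi0) / (E_pi+- + E_pi0) in the laboratory frame.

    e_neutrals is a list of pi0 energies; for 1pXn decays they are summed,
    which equals the energy of the summed neutral four-momentum.
    """
    e_neutral = sum(e_neutrals)
    return (e_charged - e_neutral) / (e_charged + e_neutral)

def phi_cp_rho(phi_prime_deg, y_plus, y_minus):
    """Apply the 180-degree shift when the product y_+ * y_- is negative."""
    if y_plus * y_minus >= 0:
        return phi_prime_deg
    return (phi_prime_deg + 180.0) % 360.0

# Hypothetical energies in GeV for the tau+ and tau- decay products.
y_plus = y_rho(30.0, [10.0])        # charged pion harder than the pi0
y_minus = y_rho(10.0, [12.0, 18.0])  # 1pXn: two pi0 energies summed
print(phi_cp_rho(100.0, y_plus, y_minus))
```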

3.3 Combined IP and \(\rho \) (IP–\(\rho \)) method

For events with the combinations 1p0n–1p1n, 1p0n–1pXn, \(\ell \)–1p1n, and \(\ell \)–1pXn, the IP method and the \(\rho \) method are combined to compute the \(\varphi ^{*}_{ CP }\) angle. In the case of \(H\rightarrow \tau \tau \rightarrow \pi ^{\mp }\nu \rho ^{\pm }\nu \), \(\rho ^{\pm }\rightarrow \pi ^{\pm }\pi ^{0}\) (1p0n–1p1n) events, \(\varphi ^{*}_{ CP }\) is defined in the visible \(\tau \)-lepton-pair ZMF by using the \(\pi \rho \) rest frame. One of the decay planes is defined using the IP method and the other with the \(\rho \) method. The quantities \(\varphi ^{*}\), \({\mathcal {O}}^{*}_{CP}\) and \(\varphi ^{*\prime }\) are calculated in a way analogous to that in Sect. 3.2, but with one of the two neutral-pion components replaced by the impact parameter vector defined in Sect. 3.1. The \(\varphi ^{*}_{ CP }\) observable is defined as

$$\begin{aligned} \varphi ^{*}_{ CP }\,\,= {\left\{ \begin{array}{ll} \hfil \varphi ^{*\prime } &{}\quad \text {if } y^{\rho } \ge 0 \\ \varphi ^{*\prime } + 180^{\circ } &{}\quad \text {if } y^{\rho } < 0, \end{array}\right. } \end{aligned}$$
(3)

with a phase shift of \(180^{\circ }\) depending on the sign of \(y^{\rho }\).

3.4 \(a_{1}\)-decay (\(a_{1}\)) method

The \(a_{1}\) method is an extension of the \(\rho \) method discussed in Sect. 3.2, and is used for \(\tau ^{\pm }\rightarrow a_{1}^{\pm }\nu \), \(a_{1}^{\pm }\rightarrow \pi ^{\pm }\pi ^{+}\pi ^{-}\). The \(\tau \)-lepton decay plane is defined by the charged pion (\(\pi ^{\pm }_1\)) with the highest transverse momentum and the vector sum of the other two \(\pi \) momenta. The observable \(y_{\pm }^{\rho }\) used in the \(\rho \) method is modified to take the effect of the \(\pi \) masses into account, and is defined by \(y_{\pm }^{a_{1}}\) adopting the convention in Ref. [36] as

$$\begin{aligned} y_{\pm }^{a_{1}}=\frac{E_{2 \pi }-E_{\pi ^{\pm }_1}}{E_{2 \pi } + E_{\pi ^{\pm }_1}} - \frac{m_{3 \pi }^2-m_{\pi ^{\pm }_1}^{2}+m_{2 \pi }^{2}}{2m^2_{3 \pi }}, \end{aligned}$$

where \(m_{3\pi }\) is the invariant mass of the three charged pions from the \(a_{1}^{\pm }\) decay, and \(m_{2\pi }\) (\(E_{2\pi }\)) refers to the invariant mass (energy) of the system of the two \(\pi \) in the \(\tau \) decay that do not have the highest transverse momentum.
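The definition of \(y_{\pm }^{a_{1}}\) can be checked numerically with a short sketch. The function names and the four-vector layout (E, px, py, pz) are assumptions of this example, as is the use of laboratory-frame energies by analogy with \(y^{\rho }\); the input four-vectors are invented numbers chosen so that the result can be verified by hand.

```python
import math

def add(*vectors):
    """Component-wise sum of four-vectors (E, px, py, pz)."""
    return tuple(sum(components) for components in zip(*vectors))

def mass_squared(p):
    """Invariant mass squared E^2 - |p|^2."""
    return p[0]**2 - p[1]**2 - p[2]**2 - p[3]**2

def y_a1(pions):
    """y^{a1} for three charged-pion four-vectors from one tau decay."""
    # The pion with the highest transverse momentum is singled out.
    ordered = sorted(pions, key=lambda p: math.hypot(p[1], p[2]), reverse=True)
    leading = ordered[0]
    two_pi = add(ordered[1], ordered[2])
    three_pi = add(leading, two_pi)
    m3_sq = mass_squared(three_pi)
    energy_term = (two_pi[0] - leading[0]) / (two_pi[0] + leading[0])
    mass_term = (m3_sq - mass_squared(leading) + mass_squared(two_pi)) / (2.0 * m3_sq)
    return energy_term - mass_term

# Toy massless four-vectors: the energy term is -3/7 and the mass term is 1/2.
toy = [(1.0, 0.0, 0.0, 1.0), (5.0, 3.0, 0.0, 4.0), (1.0, 0.0, 0.0, 1.0)]
print(y_a1(toy))
```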

Similarly to Sect. 3.3, the \(a_1\) method is combined with the IP method for the \(\ell \)–3p0n events and with the \(\rho \) method for the 1p1n–3p0n events in the \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)  and \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  channels, respectively. In the \(\ell \)–3p0n case, \(\varphi ^{*}_{ CP }\)  is defined similarly to Eq. (3), but with \(y^{\rho }\) replaced by \(y^{a_{1}}\). For 1p1n–3p0n, the \(\varphi ^{*}_{ CP }\)  computation is analogous to Eq. (2), except that the product \(y_{\pm }^{\rho }y_{\mp }^{a_{1}}\) is used to determine whether the \(180^{\circ }\) shift is applied.

4 Data and simulated event samples

This analysis uses 139 fb\(^{-1}\) of \(\sqrt{s} = 13\) TeV pp collision data recorded by ATLAS with good operating conditions [37] in 2015–2018. The data were collected using single-lepton or di-hadronic \(\tau \) triggers [38,39,40,41]. Events used in the \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)  channel were accepted by single-lepton triggers with \(p_{\text {T}}\,\,\) thresholds of 24 GeV (26 GeV) and 20 GeV (26 GeV) for electron and muon candidates, respectively, in the 2015 (2016–2018) dataset. Events used in the \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  analysis channel were accepted by di-hadronic \(\tau \)-lepton triggers, with a \(p_{\text {T}}\)  threshold of 35 GeV for the leading \(\tau \) candidate and 25 GeV for the sub-leading \(\tau \) candidate. Due to rising instantaneous luminosity, the di-hadronic \(\tau \) triggers used in the 2016–2018 data-taking period required an additional first-level triggered jet with \(p_{\text {T}}\,\,> 25\) GeV and \(|\eta |<\) 3.2.

The analysis considers the four main Higgs boson production processes at the LHC: gluon–gluon fusion (ggF), vector-boson fusion (VBF), associated production with a vector boson (VH), and associated production with a top-quark pair (\(t {\bar{t}} H\)). The Powheg NNLOPS program [42,43,44,45,46] was used to model ggF Higgs boson production with next-to-next-to-leading-order (NNLO) accuracy. The VBF and VH production processes were simulated with Powheg at next-to-leading-order (NLO) accuracy in QCD. The production of \(t {\bar{t}} H\) events was simulated using Powheg Box v2 [44,45,46,47,48] at NLO with the NNPDF 3.0nlo [49] PDF set. In all signal events, the decays of \(\tau \)-leptons were modelled by Pythia 8 [50] with no spin correlations. The spin correlations are reintroduced with event-by-event weights modelling \(\textit{CP}\)-mixing-dependent transverse spin correlations, using the TauSpinner package [51,52,53]. Background samples of \(V+\hbox {jets}\) and diboson events were generated by Sherpa 2.2.1 [54] (including \(\tau \)-lepton decays), and \(t {\bar{t}}\) and single-top samples were generated by Powheg+Pythia 8, with Pythia also performing \(\tau \)-lepton decays. The simulated event samples are shared with the cross-section measurement in the \(H \rightarrow \tau \tau \) decay channel [55].

All samples of simulated events were passed through a full simulation of the ATLAS detector response [56] using Geant4  [57]. The effects of multiple interactions in the same or neighbouring bunch crossings (pile-up) were modelled by overlaying each hard-scatter event with minimum-bias events, simulated using the soft QCD processes of Pythia 8. The simulated events were then weighted such that the distribution of the average number of interactions per bunch crossing matches the one observed in data.

5 Event selection and background estimation

The analysis of \(\textit{CP}\)-mixing in the \(H \rightarrow \tau \tau \)  channel requires the selection of signal events characterised by the presence of isolated leptons, visible decay products in hadronic \(\tau \) decays, jets and missing transverse momentum. Events with an isolated lepton (electron or muon) and a hadronic \(\tau \) decay are used for the \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)  analysis channel, whereas events with two hadronic \(\tau \) decays are used for the \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  channel. Additional jets, the invariant mass and the transverse momenta of the \(\tau \) lepton pairs are used for event categorisation (Sect. 5.2). Various kinematic and identification requirements are used to suppress the background.

5.1 Objects and decay mode reconstruction

The analysis shares the same object reconstruction and identification algorithms as the \(H \rightarrow \tau \tau \)  cross-section measurement [55]. Here the most important features specific to this \(\textit{CP}\) measurement are recalled:

  • Tracks measured in the ID are used to reconstruct interaction vertices [58], of which the one with the highest sum of squared transverse momenta \(\sum p_{\textrm{T}}^{2}\) of the associated tracks is selected as the primary vertex of the hard interaction.

  • Electrons are reconstructed from topological clusters of energy deposits in the electromagnetic calorimeter which are matched to a track reconstructed in the ID [59] and are required to satisfy ‘loose’ isolation and ‘medium’ identification criteria.

  • Muons are reconstructed from signals in the muon spectrometer matched with tracks inside the ID. They are required to satisfy ‘loose’ identification and ‘tight’ isolation criteria based on track information [60].

  • Jets are reconstructed using a particle-flow algorithm [61]. It applies an anti-\(k_t\) algorithm [62, 63] with distance parameter \(R = 0.4\) to noise-suppressed positive-energy topological clusters in the calorimeter after subtracting energy deposits associated with primary-vertex-matched tracks, and including the track momenta instead in the clustering, thereby improving the jet energy measurement. Cleaning criteria are used to identify jets arising from non-collision backgrounds or noise in the calorimeters [64]. A dedicated jet-vertex-tagger algorithm [65] is used to remove jets that are identified as not being associated with the primary vertex of the hard interaction. Similarly, a dedicated algorithm is used to suppress pile-up jets in the forward region [66].

  • Hadronic \(\tau \)-lepton decays are reconstructed from the visible decay products (neutral pions or charged pions/kaons). The reconstruction is seeded by jets reconstructed with an anti-\(k_{t}\) algorithm using calibrated topological clusters [67] as inputs and a distance parameter of \(R = 0.4\) [68]. Reconstructed nearby tracks are matched to a hadronic \(\tau \) candidate (\(\tau _{\text {had-vis}}\)  ) if they exceed the value required for a multivariate discriminant determining the likelihood that the tracks are produced from the \(\tau _{\text {had-vis}}\)  decay. A recurrent neural network identification algorithm [69] is trained to separate the \(\tau _{\text {had-vis}}\)  candidates from jets initiated by quarks or gluons, and boosted decision tree (BDT) discriminants are used to help reject misidentified hadronic \(\tau \) decays due to electrons. In both cases, the \(\tau _{\text {had-vis}}\)  is required to satisfy ‘medium’ criteria. Reconstructed \(\tau _{\text {had-vis}}\)  objects are required to have one or three associated tracks, and have \(p_{\text {T}}\,\,> 20\) GeV and \(|\eta | < 2.47\), excluding the transition region between the barrel and endcap electromagnetic calorimeters (\(1.37< |\eta | < 1.52\)).

  • The missing transverse momentum (with magnitude \(E_{\text {T}}^{\text {miss}}\)  ) is reconstructed as the negative vector sum of the transverse momenta of leptons, \(\tau _{\text {had-vis}}\)  objects, jets, and a ‘soft-term’. The ‘soft-term’ is calculated as the vector sum of the \(p_{\text {T}}\)  of tracks matched to the primary vertex but not associated with reconstructed leptons, \(\tau _{\text {had-vis}}\)  objects, or jets [70].

  • The invariant mass of the \(\tau \)-lepton-pair system, \(m_{\tau \tau }^{\textrm{MMC}}\)  , is estimated with an advanced likelihood-based algorithm named the Missing Mass Calculator (MMC) [55, 71].
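The primary-vertex choice in the first item of the list above, the vertex with the highest \(\sum p_{\textrm{T}}^{2}\) of its associated tracks, can be sketched as follows. The data layout (one list of track \(p_{\text {T}}\) values per vertex) and the function name are assumptions for illustration only.

```python
def select_primary_vertex(vertices):
    """Return the index of the vertex with the largest sum of track pT^2.

    `vertices` is a list in which each entry is the list of transverse
    momenta (GeV) of the tracks associated with that vertex.
    """
    return max(range(len(vertices)),
               key=lambda i: sum(pt * pt for pt in vertices[i]))

# Hypothetical event: a vertex with many soft tracks, one with two harder
# tracks, and one with two medium tracks.
candidates = [[1.0] * 10, [3.0, 2.0], [2.0, 2.0]]
print(select_primary_vertex(candidates))
```

Squaring the transverse momenta favours vertices with a few hard tracks over vertices with many soft ones, which is why the second candidate wins in this toy event even though the first has the larger scalar \(p_{\text {T}}\) sum.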

The transverse (longitudinal) impact parameter \(d_{0}\)  (\(z_{0}\)) of a charged-particle track is defined as the distance in the transverse plane (z direction) from the primary vertex to the track’s point of closest approach in the transverse plane. The impact parameters are used in the calculation of the \(\varphi ^{*}_{ CP }\)  observable in the cases involving single-pion or leptonic decays (Sect. 3.1), as well as in one of the variables defining the signal regions (Sect. 5.2).

An important addition in the \(\textit{CP}\) analysis is that it requires the \(\tau \)-lepton reconstruction algorithms to categorise the hadronic \(\tau \)-lepton decays by using the number of reconstructed tracks and the capability to distinguish between single-\(\pi ^0\) and multi-\(\pi ^0\) clusters. This allows the \(\tau _{\text {had}}\)  candidates to be classified as being from the 1p0n, 1p1n, 1pXn or 3p0n decay modes. For this measurement it is crucial to identify each decay mode with good efficiency and purity, and to measure the momenta of the charged and neutral pions. This is achieved through the ‘\(\tau _{\text {had}}\)  Particle Flow’ reconstruction algorithm [72], in which the four-momenta of the charged and neutral pions from the \(\tau \) decay are determined by combining measurements from the tracking detector and the calorimeter. The decay mode classification is performed by counting the number of charged and neutral pions, exploiting the kinematic properties of the \(\tau \)-lepton decay products and the number of reconstructed photons by using BDTs. Three BDTs are built respectively for the 1p0n vs 1p1n, 1p1n vs 1pXn and 3p0n vs 3pXn decay modes to improve the determination of the number of neutral pions in each case. The efficiency (purity) of the classification is about 80% (70–80%) for the dominant decay modes 1p0n and 1p1n; for the 3p0n decay mode the efficiency is over 90%, with 90% purity.

5.2 Event selection

Events are selected by requiring at least one hadronic \(\tau \)-lepton decay in the signature. For events in the \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)  channel, a \(\tau _{\text {had-vis}}\)  with \(p_{\text {T}}\,\,> 30\) GeV and an electron or muon with \(p_{\text {T}}\)  above a threshold of 21.0–27.3 GeV are required, with the latter threshold depending on the data-taking period. In the \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  channel, two \(\tau _{\text {had-vis}}\)  objects are required with \(p_{\text {T}}\)  above 40 and 30 GeV. The \(p_{\text {T}}\)  requirements are chosen to be above the \(p_{\text {T}}\)  thresholds of the respective single-lepton and hadronic \(\tau \) triggers to ensure trigger operation at the plateau efficiency. The \(\tau _{\text {had-vis}}\)  and the lepton are required to match geometrically with their trigger counterparts in the \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  and \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)  channels, respectively. In both channels the two \(\tau \)-lepton candidates are required to have opposite electric charges. The angular distances between the \(\tau \)-lepton candidates are required to be \(\Delta R_{\tau \tau } < 2.5\) and \(|\Delta \eta _{\tau \tau }| < 1.5\) (\(0.6< \Delta R_{\tau \tau } < 2.5\) and \(|\Delta \eta _{\tau \tau }| < 1.5\)) to reject non-resonant events in the \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)  (\(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)) channel. All events require at least one jet with \(p_{\text {T}}\,\,> 40\) GeV. Due to triggering conditions, this selection is tightened to \(p_{\text {T}}\,\,> 70\) GeV and \(|\eta | < 3.2\) in the \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  channel. The \(E_{\text {T}}^{\text {miss}}\)  is required to be greater than 20 GeV, and requirements on the fraction of the \(\tau \)-lepton momentum carried by its visible decay products are applied to further improve the invariant mass estimation. In the \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)  channel, the transverse mass \(m_\text {T}\)  of the lepton-plus-\(E_{\text {T}}^{\text {miss}}\)  system is required to be less than 70 GeV in order to efficiently reject \({{W}}\text {+\,jets}\)  processes. These preselection criteria are the same as in the \(H \rightarrow \tau \tau \)  cross-section measurement [55].
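The angular, \(E_{\text {T}}^{\text {miss}}\) and \(m_\text {T}\) requirements above can be sketched in Python as follows. This is a minimal illustration rather than the analysis code: the function names and the dictionary layout are hypothetical, and the transverse mass is assumed to follow the standard definition \(m_\text {T} = \sqrt{2 p_{\text {T}}^{\ell } E_{\text {T}}^{\text {miss}} (1 - \cos \Delta \phi )}\), which is not spelled out in this section. The \(p_{\text {T}}\) thresholds, trigger matching, charge, jet and visible-momentum-fraction requirements are deliberately left out.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal separation wrapped into (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    return dphi - 2.0 * math.pi if dphi > math.pi else dphi


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance in the eta-phi plane."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def transverse_mass(lep_pt, lep_phi, met, met_phi):
    """Assumed standard definition: sqrt(2 pT ETmiss (1 - cos dphi))."""
    dphi = delta_phi(lep_phi, met_phi)
    return math.sqrt(2.0 * lep_pt * met * (1.0 - math.cos(dphi)))


def passes_topology(channel, tau1, tau2, met, met_phi):
    """Apply only the angular, ETmiss and mT cuts quoted in the text.

    `tau1` and `tau2` are dicts with keys "pt", "eta", "phi" (GeV, radians).
    In the "lephad" channel `tau1` is taken to be the light lepton.
    """
    dr = delta_r(tau1["eta"], tau1["phi"], tau2["eta"], tau2["phi"])
    deta = abs(tau1["eta"] - tau2["eta"])
    if dr >= 2.5 or deta >= 1.5:
        return False
    if channel == "hadhad" and dr <= 0.6:
        return False
    if met <= 20.0:
        return False
    if channel == "lephad":
        return transverse_mass(tau1["pt"], tau1["phi"], met, met_phi) < 70.0
    return True
```

A lepton back-to-back with the missing transverse momentum, as is typical of \({{W}}\text {+\,jets}\) events, gives a large \(m_\text {T}\) and fails the 70 GeV cut, whereas in \(H \rightarrow \tau \tau \) decays the neutrinos tend to be collinear with the visible decay products.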

Events are further categorised to target the vector-boson fusion (VBF category) and gluon–gluon fusion (Boost category) Higgs boson production modes. The VBF category contains events with a second high-\(p_{\text {T}}\,\,\) jet with \(p_{\textrm{T}}^{j_2} > 30\) GeV. The two leading jets are required to satisfy the following kinematics: \(|\Delta \eta _{jj}| > 3.0\), \(\eta _{j_1} \cdot \eta _{j_2} < 0\), and \(m_{jj} > 400\) GeV, with the pseudorapidity values of \(\tau _{\text {lep}}\)  or \(\tau _{\text {had}}\)  lying between those of the two leading jets (yielding ‘central \(\tau \)-leptons’). This kinematic selection is enhanced by splitting the VBF category into two regions, VBF_1 and VBF_0, based on the output of a BDT-based VBF tagger [55], with the VBF_1 region having an enhanced fraction of VBF Higgs boson production events. The Boost category targets events where the Higgs boson recoils against jets. The \(p_{\text {T}}\)  of the Higgs boson (\(p_\textrm{T}^{\tau \tau }\)) is computed as the magnitude of the vector sum of the transverse momenta of the \(\tau \)-leptons’ visible decay products and the missing transverse momentum. Events with \(p_{\textrm{T}}^{\tau \tau }> 100\) GeV that do not pass the VBF selection form the Boost category. These events are further separated into those with \(\Delta R_{\tau \tau } < 1.5\) and \(p_{\textrm{T}}^{\tau \tau } > 140\) GeV (Boost_1) and those not passing these selections (Boost_0). The event categorisation is applied to both the \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)  and \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  channels, as summarised in Table 3.
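The categorisation logic described above can be written as a short decision function. The sketch below is illustrative only: the function signature is hypothetical, the dijet mass is passed in rather than computed, the BDT-based VBF tagger is replaced by a boolean flag, and the centrality requirement is assumed to apply to both \(\tau \)-lepton candidates.

```python
def categorise(pt_tautau, dr_tautau, tau_etas, jet1_eta,
               jet2_pt=None, jet2_eta=None, m_jj=None, vbf_tagger_high=False):
    """Return "VBF_1", "VBF_0", "Boost_1", "Boost_0" or None.

    Momenta and masses are in GeV. `vbf_tagger_high` is a stand-in for the
    decision of the BDT-based VBF tagger, which is not reproduced here.
    """
    has_second_jet = jet2_pt is not None and jet2_pt > 30.0
    if has_second_jet:
        lo, hi = sorted((jet1_eta, jet2_eta))
        is_vbf = (
            abs(jet1_eta - jet2_eta) > 3.0
            and jet1_eta * jet2_eta < 0.0
            and m_jj > 400.0
            # Assumption: both tau candidates must lie between the two jets.
            and all(lo < eta < hi for eta in tau_etas)
        )
        if is_vbf:
            return "VBF_1" if vbf_tagger_high else "VBF_0"
    # Events failing the VBF selection are tested for the Boost category.
    if pt_tautau > 100.0:
        if dr_tautau < 1.5 and pt_tautau > 140.0:
            return "Boost_1"
        return "Boost_0"
    return None
```

Testing the VBF selection first makes the two categories mutually exclusive, in line with the statement that the Boost category contains only events that do not pass the VBF selection.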

For each category, a Higgs-enriched signal region is defined with \(110< m_{\tau \tau }^{\textrm{MMC}}\,\,< 150\) GeV, and a \(Z \rightarrow \tau \tau \)  control region (CR) is defined with \(60< m_{\tau \tau }^{\textrm{MMC}}\,\,< 110\) GeV, with the other selection criteria for each category remaining the same. Two additional control regions are defined using events with a \(\tau ^{\pm }\rightarrow \rho ^{\pm } \nu \rightarrow \pi ^{\pm }\pi ^{0}\nu \) decay in the \(60< m_{\tau \tau }^{\textrm{MMC}}\,\,< 110\) GeV range: one region with \(\ell \)–1p1n events for the \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)  channel, and the other with 1p1n–1p1n events for the \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  channel. The events in these regions are defined so as to be statistically independent of the four \(Z \rightarrow \tau \tau \)  control regions in Table 3. The use of the control regions is described in Sect. 7.

Table 3 Summary of selection criteria for the VBF and Boost categories in this analysis. The criteria are common to the \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)  and \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  decay channels

Depending on the decay mode combination, different additional selection criteria are applied to enhance the sensitivity of the \(\varphi ^{*}_{ CP }\)  construction method. The \(\varphi ^{*}_{ CP }\)  from the IP method is less effective in discriminating between different \(\phi _{\tau }\) values when the impact parameter vector has a magnitude smaller than, or similar to, its resolution. The sensitivity of this method can be enhanced by using events with high significance of the track impact parameter in the transverse plane, \(d_{0}^{\textrm{sig}}\), defined as the transverse impact parameter \(d_{0}\) divided by its resolution \(\sigma (d_{0})\). Events are therefore separated into two groups based on the value of \(|d_{0}^{\textrm{sig}}|\) of the lepton in \(\tau _{\text {lep}}\)  or the pion in \(\tau _{\text {had}}\). In the \(\rho \) method, events with larger values of the product \(|y^{\rho }_{+}y^{\rho }_{-}|\) are more sensitive to \(\phi _{\tau }\). This quantity is also used to separate the events into two groups. In the case of the combined IP–\(\rho \) method, the sensitivity of \(\varphi ^{*}_{ CP }\)  is enhanced by separating the events based on the values of \(|d_{0}^{\textrm{sig}}|\) and \(|y^{\rho }|\). For the IP–\(a_1\) and \(\rho \)–\(a_1\) methods the separation is based on \(|y_{\pm }^{a_{1}}|\). Details of the additional selection criteria are summarised in Tables 4 and 5.
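The split on the impact-parameter significance can be illustrated with a few lines of Python. The helper names are hypothetical, and the threshold is left as a free argument because the actual values are given in Tables 4 and 5, which are not reproduced here.

```python
def d0_significance(d0, sigma_d0):
    """Transverse impact parameter divided by its resolution."""
    if sigma_d0 <= 0.0:
        raise ValueError("the d0 resolution must be positive")
    return d0 / sigma_d0


def split_by_d0_significance(tracks, threshold):
    """Split (d0, sigma_d0) pairs into high- and low-significance groups.

    `threshold` is a placeholder: the values used in the analysis are
    listed in Tables 4 and 5 and are not assumed here.
    """
    high, low = [], []
    for d0, sigma_d0 in tracks:
        if abs(d0_significance(d0, sigma_d0)) > threshold:
            high.append((d0, sigma_d0))
        else:
            low.append((d0, sigma_d0))
    return high, low
```

The absolute value is taken because \(d_{0}\) is signed, while the sensitivity depends only on how well the impact parameter is resolved from zero.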

Among the \(\tau \)-lepton pair decays used in this analysis, the most sensitive (‘dominant’) decay mode combinations are 1p0n–1p1n, 1p1n–1p1n, and 1p0n–1p0n in the \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  decay channel, and \(\ell \)–1p1n and \(\ell \)–1p0n in the \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)  decay channel, while the other combinations involving 1pXn and 3p0n are subdominant due to their weaker spin analysing power and smaller decay fractions. The events are therefore divided into three groups, the ‘High’, ‘Medium’ and ‘Low’ signal regions, which characterise the different levels of sensitivity. Events satisfying the additional selection criteria for the dominant and subdominant decay mode combinations define the High and Medium signal regions, respectively, while the rest are grouped into the Low signal region. This allows the \(\varphi ^{*}_{ CP }\)  distributions from the decay mode combinations with similar sensitivity to \(\textit{CP}\)-mixing to be merged to increase the statistical precision of the distribution templates within each signal region, allowing the use of finer binning in the \(\varphi ^{*}_{ CP }\)  distributions for the regions with better sensitivity. The splitting of the signal regions summarised in Tables 4 and 5 is applied in each of the VBF and Boost categories (VBF_1, VBF_0, Boost_1 and Boost_0).

Table 4 Summary of additional selection criteria for the signal regions in the \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)  channel
Table 5 Summary of additional selection criteria for the signal regions in the \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  channel

In this configuration, there are 12 signal regions in each of the \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)  and \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  decay channels, leading to 24 signal regions in total. Each signal or control region is orthogonal to the others.
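The region counting can be cross-checked by enumerating the combinations explicitly. In the sketch below the region labels are hypothetical naming conventions; only the counts are taken from the text (four categories, three sensitivity levels, two decay channels, one \(Z \rightarrow \tau \tau \) control region per category and channel, and the two additional \(\rho \)-decay control regions).

```python
from itertools import product

CHANNELS = ["lephad", "hadhad"]
CATEGORIES = ["VBF_1", "VBF_0", "Boost_1", "Boost_0"]
SENSITIVITY_LEVELS = ["High", "Medium", "Low"]

# One signal region per (channel, category, sensitivity level).
signal_regions = [
    f"{channel}_{category}_{level}_SR"
    for channel, category, level in product(CHANNELS, CATEGORIES, SENSITIVITY_LEVELS)
]

# One Z->tautau control region per (channel, category).
z_control_regions = [
    f"{channel}_{category}_ZCR" for channel, category in product(CHANNELS, CATEGORIES)
]

# The two additional control regions built from rho decays.
rho_control_regions = ["lephad_l-1p1n_CR", "hadhad_1p1n-1p1n_CR"]

print(len(signal_regions), len(z_control_regions) + len(rho_control_regions))
```

The enumeration gives 24 signal regions and 10 control regions, matching the numbers entering the likelihood in Sect. 7.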

5.3 Background estimation

Expected SM processes other than the \(H \rightarrow \tau \tau \)  signal are evaluated using a mixture of simulation and data-driven techniques. Processes with true \(\tau \)-leptons and prompt light leptons are estimated through simulation. Among those the \(Z(\rightarrow \tau \tau ) + \text {jets}\)  process is the dominant one, and dedicated control regions with \(60< m_{\tau \tau }^{\textrm{MMC}}\,\,< 110 \) GeV (Sect. 5.2) are used to extract its normalisation (Sect. 7).

The second most significant background contribution arises from jets misidentified as hadronically decaying \(\tau \)-leptons (\(\tau _{\text {had}}\)), referred to as ‘misidentified-\(\tau \) background’. It is determined by a data-driven approach using fake factors, as described in detail in Ref. [55]. The misidentified-\(\tau \) background consists mostly of \({{W}}\text {+\,jets}\), QCD multijet, and top-quark events in the \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)  channel, while in the \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  channel it originates mainly from multijet events. The fake factors are calculated as the ratio of the number of events passing the \(\tau \) identification requirements to the number failing them in dedicated QCD and \({{W}}\text {+\,jets}\)-enriched regions, defined by inverting the lepton isolation or \(m_\text {T}\)  requirement, respectively. The fake factors are estimated in each hadronic \(\tau \) decay mode considered in the analysis, separately for VBF and Boost region events. To estimate both the shape and the normalisation of the misidentified-\(\tau \) background, distributions in the regions failing the \(\tau \) identification are multiplied by the fake factors to correct for the different efficiencies to pass or fail the \(\tau \) identification selection. In the ‘failed \(\tau \) identification’ regions, events not corresponding to misidentified-\(\tau \) background from jets are subtracted using simulated event samples. For the \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  final state the fake-factor method is modified slightly: the \(\tau _{\text {had}}\)  objects are matched to their trigger counterparts and the estimate covers processes with one or two jets misidentified as \(\tau _{\text {had}}\). The misidentified-\(\tau \) background contribution is estimated for each di-\(\tau \) decay combination considered (Sect. 3) in the preselection region for the \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  events.
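The arithmetic of the fake-factor estimate can be sketched as follows. This is a simplified illustration with hypothetical function names and invented yields; the full method of Ref. [55], including the per-decay-mode and per-category binning of the fake factors, is not reproduced.

```python
def fake_factor(n_pass, n_fail):
    """Ratio of events passing to events failing the tau identification,
    measured in a region enriched in misidentified taus."""
    if n_fail <= 0:
        raise ValueError("the number of failing events must be positive")
    return n_pass / n_fail


def misidentified_tau_template(data_fail, simulated_non_fake_fail, ff):
    """Per-bin estimate: ff * (data - simulated non-fake events), both taken
    from the region failing the tau identification."""
    return [ff * (d - s) for d, s in zip(data_fail, simulated_non_fake_fail)]


# Illustrative numbers only.
ff = fake_factor(n_pass=20, n_fail=100)
template = misidentified_tau_template([100.0, 50.0], [10.0, 10.0], ff)
```

Because the fake factor multiplies a binned distribution from data, the estimate carries both the shape and the normalisation of the background, as stated above.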

Smaller background contributions are due to diboson, \(Z(\rightarrow \ell {}\ell ) + \text {jets}\)  and \(H\rightarrow WW^{*}\) processes in both decay channels, while \({{W}}\text {+\,jets}\)  and top production also make small contributions in the \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  channel. These are estimated from simulation and are normalised to their theoretical expectations.

6 Systematic uncertainties

Systematic uncertainties generally affect the yield in the signal and control regions as well as the shape of the \(\varphi ^{*}_{ CP }\)  distribution, on which the fit is performed. They are grouped into three types: the experimental uncertainties, the theoretical uncertainties, and the \(\tau \)-lepton decay reconstruction uncertainties.

The experimental uncertainties include those from the trigger, reconstruction, identification and isolation requirements of the final-state particle candidates (electrons, muons, \(\tau \)-leptons, jets, \(E_{\text {T}}^{\text {miss}}\)  , b-tagging), as well as uncertainties from misidentified-\(\tau \) background estimation and the luminosity measurement [73, 74].

The uncertainties on the jet energy resolution and scale [75] affect the acceptance of the signal and background contributions through their impact on \(E_{\text {T}}^{\text {miss}}\)  , and thus \(m_{\tau \tau }^{\textrm{MMC}}\)  [76]. They also directly affect the jet selection for the event categorisation in the VBF and Boost regions [76]. The jet energy scale uncertainty for central jets (\(|\eta | <1.2\)) varies from 1% for a wide range of jet \(p_{\text {T}}\)  (\(250~\textrm{GeV}< p_{\text {T}}\,\,< 2000\) GeV) to 5% for very low \(p_{\text {T}}\)  jets (\(p_{\text {T}}\,\,< 20\) GeV), and 3.5% for very high \(p_{\text {T}}\)  jets (\(p_{\text {T}}\,\,> 2.5\) TeV), and forward jets exhibit uncertainties of similar size. The relative jet energy resolution ranges from (\(24 \pm 1.5\))% to (\(6 \pm 0.5\))% for jets with \(p_{\text {T}}\)  of 20 GeV to 300 GeV, respectively [75]. Other jet-related uncertainties have smaller impacts.

Uncertainties from the misidentified-\(\tau \) background estimation arise from sample statistics of the events used for estimating the fake factors, subtraction of residual contributions from processes without misidentified-\(\tau \), and uncertainties in the flavour composition taken from comparisons between the predicted and observed backgrounds in a dedicated validation region. The uncertainties are assigned per \(\tau \)-lepton-pair decay combination in the \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)  and \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  channels. The total expected uncertainties in the yield of the misidentified-\(\tau \) events in the signal regions are typically 20% (20–40%) in the Boost (VBF) region in the \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)  channel, and 15–30% in the \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  channel.

Theoretical uncertainties are applied to \(Z + \text {jets}\)  background and signal processes. For the \(Z + \text {jets}\)  background, uncertainties are considered for the renormalisation and factorisation scales, the resummation scale, the jet-to-parton matching scheme, the choice of value for the strong coupling constant \(\alpha _{s}\), and the choice of PDF. Only distribution shapes and migrations between analysis regions are considered for these uncertainties since the absolute normalisation of \(Z \rightarrow \tau \tau \)  is determined at the fit level. For the signal, uncertainties are considered for the QCD scale because of missing higher orders in the matrix element calculation, for the parton shower and hadronisation model, and for renormalisation and factorisation scales and the PDF. These are applied to the production cross-sections for the ggF, VBF and VH processes, and do not have large impacts on the measurement as they only affect the signal normalisation, which is also determined in the fit, and not the shape of the \(\varphi ^{*}_{ CP }\)  distribution.

The \(\tau \)-lepton decay reconstruction uncertainties primarily affect the shape of the \(\varphi ^{*}_{ CP }\)  distributions. These concern the classification of the hadronic \(\tau \)-lepton decay modes as well as measurement uncertainties of the track impact parameters, pion track momentum, and neutral pion momentum. The uncertainties in the classification of the hadronic \(\tau \)-lepton decay modes are derived from a \(\tau \)-lepton decay mode classification efficiency and correction factor measurement, using a tag-and-probe analysis in the \(\mu \)\(\tau _{\text {had}}\)  final state performed on part of the Run 2 dataset. These uncertainties include those in the decay mode reconstruction efficiencies and in the event migration between decay modes. The size of the uncertainties ranges from 6 to 20% depending on the decay modes and their cross migrations. The uncertainties affecting the impact parameters and track measurements include those in alignment effects and in impact parameter resolution, and they account for differences between data and simulation.

The uncertainties in the \(\pi ^{0}\)  angular resolution and energy scale are estimated in situ in the analysis. The \(\pi ^{0}\)  energy and momentum direction are initially varied in accordance with the energy response of the calorimeter to charged pions [77] and the measured \(\pi ^{0}\)  angular resolution [72], respectively. These variations in the \(\pi ^{0}\)  energy scale and angular resolution are found to affect the shape of the distribution of the reconstructed invariant mass of the \(\pi ^{\pm }\pi ^{0}\) system, \(m(\pi ^{\pm },\pi ^{0})\), in \(\tau ^{\pm }\rightarrow \rho ^{\pm } \nu \rightarrow \pi ^{\pm }\pi ^{0}\nu \) decays. Dedicated \(Z \rightarrow \tau \tau \)  control regions with exclusive 1p1n decays, namely the \(\ell \)–1p1n and 1p1n–1p1n events, are defined in the \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)  and \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  decay channels, respectively, as described in Sect. 5.2. The \(m(\pi ^{\pm },\pi ^{0})\) distribution is used as the observable in these control regions in the combined fit, so that the final size of the \(\pi ^{0}\)  angular resolution and energy scale uncertainties in the \(\phi _{\tau }\) measurement is determined from data. Figure 3 shows the post-fit distributions of \(m(\pi ^{\pm },\pi ^{0})\) in the \(Z\rightarrow \tau \tau \) control regions in the \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)  and \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  channels using \(\ell \)–1p1n and 1p1n–1p1n events, respectively. For the 1p1n–1p1n events in the \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  channel, only the leading \(\tau _{\text {had}}\) is selected for the \(m(\pi ^{\pm },\pi ^{0})\) distribution. The \(m(\pi ^{\pm },\pi ^{0})\) data distributions show good agreement with the prediction.

Fig. 3

Post-fit distributions of the \(\pi ^{\pm }\pi ^{0}\) invariant mass, \(m(\pi ^{\pm },\pi ^{0})\), in the \(Z\rightarrow \tau \tau \) control regions in the (a) \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)  and (b) \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  channels. The \(\ell \)–1p1n (1p1n–1p1n) events are used in the \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)  (\(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)) channel in this control region. For the \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  channel only the \(m(\pi ^{\pm },\pi ^{0})\) value of the leading \(\tau _{\text {had}}\) is selected. ‘Other backgrounds’ include W, diboson, top, \(Z\rightarrow \ell \ell \) and \(H\rightarrow WW^*\). The hatched uncertainty band includes all sources of uncertainty after the fit to data

7 Statistical analysis

To measure the \(\textit{CP}\)-mixing angle \(\phi _{\tau }\), a simultaneous fit to the data is performed using a likelihood function that depends on \(\phi _{\tau }\) as the parameter of interest and on nuisance parameters that account for the systematic uncertainties and the floating normalisations for the Higgs boson signal and for the background. The likelihood function is constructed as a product of Poisson probability terms over the bins of the input distributions, and the parameter of interest is estimated by maximising the likelihood. The likelihood comprises 24 signal regions and 10 control regions. Constraints on the nuisance parameters are implemented with Gaussian terms, and bin-by-bin statistical fluctuations in the simulated background samples are included in the fit with Poisson probability terms.
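The structure of such a likelihood can be illustrated with a deliberately reduced Python sketch: a product of Poisson terms over bins, one freely floating signal strength and a single nuisance parameter with a unit-width Gaussian constraint. The function names, the linear response of the background to the nuisance parameter and the example yields are all assumptions made for illustration. In the actual fit the signal template additionally depends on \(\phi _{\tau }\), and many nuisance parameters and normalisation factors are present.

```python
import math


def expected_yields(mu, theta, signal, background, rel_unc):
    """Per-bin expectation mu * s + b * (1 + rel_unc * theta).

    `theta` is a nuisance parameter in units of its prior width and
    `rel_unc` is the assumed relative effect of a one-sigma shift.
    """
    return [mu * s + b * (1.0 + rel_unc * theta) for s, b in zip(signal, background)]


def negative_log_likelihood(mu, theta, observed, signal, background, rel_unc):
    """-ln L for a product of Poisson terms times one Gaussian constraint."""
    nll = 0.0
    for n, nu in zip(observed, expected_yields(mu, theta, signal, background, rel_unc)):
        # -ln Poisson(n | nu) = nu - n ln(nu) + ln(n!)
        nll += nu - n * math.log(nu) + math.lgamma(n + 1.0)
    # Gaussian constraint of unit width; the constant term is dropped.
    nll += 0.5 * theta ** 2
    return nll


# Illustrative two-bin example in which the data match the mu = 1 expectation.
observed = [12, 25]
signal = [2.0, 5.0]
background = [10.0, 20.0]
nll_best = negative_log_likelihood(1.0, 0.0, observed, signal, background, 0.1)
```

Minimising this function over `mu` and `theta` is equivalent to maximising the likelihood; moving either parameter away from the values that reproduce the observed counts increases the negative log-likelihood.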

In the fit the normalisation of the \(Z \rightarrow \tau \tau \)  events is left to float freely in the eight control regions (described in Sect. 5.2) to account for the \(Z \rightarrow \tau \tau \)  modelling in the different signal regions. Four normalisation factors (NF) are defined to control the \(Z \rightarrow \tau \tau \)  background events in the Boost_0, Boost_1, VBF_0, VBF_1 categories, respectively. Each factor is shared between the \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)  and \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  channels, and is also shared between a control region and the corresponding signal region in the same category. Other backgrounds are normalised to their expected number of events estimated from the MC simulation.

Two control regions using events with a \(\tau ^{\pm }\rightarrow \rho ^{\pm } \nu \rightarrow \pi ^{\pm }\pi ^{0}\nu \) decay (\(\ell \)–1p1n in \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)  and 1p1n–1p1n in \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)) are defined from the \(Z \rightarrow \tau \tau \)  control region events (Sect. 5.2). The distribution of the \(\pi ^{\pm }\pi ^{0}\) invariant mass, \(m(\pi ^{\pm },\pi ^{0})\), is employed in these control regions to control the \(\pi ^0\)-related uncertainties by using the data, as described in Sect. 6.

The Higgs boson signal strength \(\mu _{\tau \tau }\) (defined as the ratio of the measured signal yield to the SM expectation) is also left unconstrained in the fit, such that the signal normalisation does not depend on the SM assumption and only the shape of the \(\varphi ^{*}_{ CP }\)  distribution is exploited in the estimation of \(\phi _{\tau }\). Model-dependence of the cross-section on \(\textit{CP}\)-mixing scenarios is not exploited. The \(\varphi ^{*}_{ CP }\)  distributions in the signal regions are binned to maximise the measurement sensitivity, taking into account the associated uncertainties. In general, finer binnings are used in the signal regions associated with higher sensitivity (High and Medium SRs), with coarser binning in the Low signal region. A smoothing procedure is applied in the signal regions to remove potentially large local fluctuations in the systematic variations of the \(\varphi ^{*}_{ CP }\)  distributions because of the limited size of the MC samples used to build the templates. The test statistic is based on a profile likelihood ratio and the asymptotic approximation [78] is used for statistical interpretations.
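In the asymptotic approximation, and for a single parameter of interest, a difference \(\Delta \ln L\) between a tested hypothesis and the best fit is commonly converted into a Gaussian-equivalent significance through \(Z = \sqrt{2\,\Delta \ln L}\). The short Python sketch below illustrates this relation; it is a textbook conversion stated here as an assumption and is not a description of the exact procedure used in the analysis.

```python
import math


def significance_from_delta_nll(delta_nll):
    """Z = sqrt(2 * Delta ln L), valid for one parameter of interest
    in the asymptotic (Wilks) regime."""
    if delta_nll < 0.0:
        raise ValueError("Delta ln L must be non-negative")
    return math.sqrt(2.0 * delta_nll)


def delta_nll_from_significance(z):
    """Inverse relation: Delta ln L = Z^2 / 2."""
    return 0.5 * z * z
```

Under this relation the 1\(\sigma \) and 2\(\sigma \) intervals of a one-dimensional scan correspond to \(\Delta \ln L\) values of 0.5 and 2.0.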

8 Results

Figure 4 shows the post-fit \(\varphi ^{*}_{ CP }\)  distributions for the data as well as the prediction in the High, Medium and Low signal regions in the \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)  and \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  channels. In each distribution, the \(\varphi ^{*}_{ CP }\)  bins are counted incrementally through all VBF and Boost categories and cover the range [0, 360]\(^{\circ }\) for each category. The observed and expected negative log-likelihood (\(\Delta \ln L\)) scans in \(\phi _{\tau }\) are shown in Fig. 5. The observed (expected) value of \(\phi _{\tau }\) is \(9^{\circ } \pm 16^{\circ }\) (\(0^{\circ } \pm 28 ^{\circ }\)) at the 68% confidence level (CL), and \(\pm 34^{\circ }\) (\(_{-70^{\circ }}^{+75^{\circ }}\)) at the 2\(\sigma \) level. The data disfavours the pure \(\textit{CP}\)-odd hypothesis at the 3.4\(\sigma \) level, while the expected exclusion level is 2.1\(\sigma \). The fitted signal and background normalisations are shown in Table 6. The results are compatible with the SM expectation within the measured uncertainties.

The total uncertainty is dominated by the statistical uncertainties of the data sample. The dominant contributions to the systematic uncertainties are from jets, followed by the limited sample size of the background simulations, uncertainties from the free-floating normalisation factors, and theory uncertainties. The uncertainties in the \(\tau \)-lepton decay reconstruction have small impacts of less than 1\(^{\circ }\) on \(\phi _{\tau }\). The effects from the \(\pi ^{0}\) uncertainties measured in situ are compared with those from a set of simulated event samples with systematically varied modelling of the hadronic response in the calorimeter, and the latter are found to have a similar impact on the \(\phi _{\tau }\) measurement. Effects from other sources are negligible. The impact of the uncertainties is summarised in Table 7.

The difference between the observed and expected sensitivities of the \(\phi _{\tau }\) measurement can be attributed to a statistical fluctuation in data. The uncertainties of the \(\phi _{\tau }\) measurement are highly dependent on the size of the modulation amplitude of the \(\varphi ^{*}_{ CP }\)  distributions. In the measured data, there are more distortions of the \(\varphi ^{*}_{ CP }\)  shape due to bins fluctuating from their expected values, resulting in overall larger modulation amplitudes than in the predicted shape. A set of pseudo-experiments performed with the condition \(\mu _{\tau \tau } = 1\) shows that the probability of obtaining the observed distribution, given the expectation, is about 4%.

The present measurement is compatible with the recent measurement of the same mixing-angle parameter by the CMS Collaboration [24], for which an observed (expected) mixing-angle value of \(-1^{\circ }\pm 19^{\circ }\) (\(0^{\circ }\pm 21^{\circ }\)) at the 68% CL was reported.

The expected sensitivities of the \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  and \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)   channels in excluding a pure \(\textit{CP}\)-odd \(H\tau \tau \) coupling are 1.7\(\sigma \) and 1.1\(\sigma \), respectively. The ‘High’ signal regions contribute the most, with 1.4\(\sigma \) and 1.0\(\sigma \) in the \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  and \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)  channels, respectively. Other signal regions have expected sensitivities below 1\(\sigma \).

A 2D scan of \(\Delta \ln L\) as a function of the signal strength \(\mu _{\tau \tau }\) versus \(\phi _{\tau }\) is shown in Fig. 6. The 1\(\sigma \) and 2\(\sigma \) 2D confidence levels correspond to \(\Delta \ln L\) values of 1.15 and 3.09, respectively. No strong correlation between \(\mu _{\tau \tau }\) and \(\phi _{\tau }\) is observed. The SM prediction of \(\mu _{\tau \tau } = 1\) and \(\phi _{\tau }= 0\) is compatible with the measurement within the 1\(\sigma \) confidence region.
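The quoted thresholds of 1.15 and 3.09 can be reproduced with a short calculation, assuming that \(2\,\Delta \ln L\) follows a \(\chi ^2\) distribution with two degrees of freedom, whose survival function is \(e^{-x/2}\). The function names in this Python sketch are hypothetical.

```python
import math


def coverage_for_n_sigma(n_sigma):
    """Two-sided Gaussian coverage of an n-sigma interval."""
    return math.erf(n_sigma / math.sqrt(2.0))


def delta_nll_threshold_2d(n_sigma):
    """Delta ln L threshold for two parameters of interest.

    For a chi-square with 2 degrees of freedom, P(x > t) = exp(-t / 2),
    so t = -2 ln(1 - p) and Delta ln L = t / 2 = -ln(1 - p).
    """
    p = coverage_for_n_sigma(n_sigma)
    return -math.log(1.0 - p)


print(round(delta_nll_threshold_2d(1), 2), round(delta_nll_threshold_2d(2), 2))
```

The output, 1.15 and 3.09, agrees with the values stated above; these thresholds are larger than the one-dimensional values of 0.5 and 2.0 because the same coverage must be achieved jointly in two parameters.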

Fig. 4

Post-fit distributions of \(\varphi ^{*}_{ CP }\)  in the signal regions (SRs), showing (a) \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)  High SR, (b) \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  High SR, (c) \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)  Medium SR, (d) \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  Medium SR, (e) \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)  Low SR, and (f) \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  Low SR. The \(\varphi ^{*}_{ CP }\)  bins are counted incrementally through all VBF and Boost categories and cover the range [0, 360]\(^{\circ }\) for each category. The best-fit \(H \rightarrow \tau \tau \)  signal is shown in solid pink, while the red and green lines indicate the predictions for the pure \(\textit{CP}\)-even (scalar, SM) and pure \(\textit{CP}\)-odd (pseudoscalar) hypotheses, respectively, scaled to the predicted signal yield. ‘Other backgrounds’ include W, diboson, top, \(Z\rightarrow \ell \ell \) and \(H\rightarrow WW^*\). The hatched uncertainty band includes all sources of uncertainty after the fit to data

Figure 7 shows the data distribution of \(\varphi ^{*}_{ CP }\)  with all signal regions in both the \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)  and \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  channels combined, together with the best-fit \(H \rightarrow \tau \tau \)   signal, pure \(\textit{CP}\)-even and pure \(\textit{CP}\)-odd hypotheses. The events in each signal region are weighted with \(\ln (1+S/B)\), where S and B are the event yields for the signal and the total background, respectively. The background is subtracted from data in the figure. The distribution illustrates that the data disfavours the pure \(\textit{CP}\)-odd scenario.
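The weighting used for the combined distribution can be sketched as follows. The Python example assumes, for simplicity, that all regions share a common \(\varphi ^{*}_{ CP }\) binning, which is an assumption of this illustration; the function names, the dictionary layout and the yields are hypothetical.

```python
import math


def region_weight(signal_yield, background_yield):
    """Weight ln(1 + S/B) assigned to every event of a signal region."""
    return math.log1p(signal_yield / background_yield)


def combine_background_subtracted(regions):
    """Sum weighted (data - background) histograms over signal regions.

    Each region is a dict with per-bin lists "data" and "background"
    (common binning assumed) and total yields "S" and "B".
    """
    n_bins = len(regions[0]["data"])
    combined = [0.0] * n_bins
    for region in regions:
        weight = region_weight(region["S"], region["B"])
        for i in range(n_bins):
            combined[i] += weight * (region["data"][i] - region["background"][i])
    return combined


# Illustrative yields only.
regions = [
    {"data": [30.0, 24.0], "background": [20.0, 20.0], "S": 10.0, "B": 40.0},
    {"data": [105.0, 98.0], "background": [100.0, 100.0], "S": 5.0, "B": 200.0},
]
combined = combine_background_subtracted(regions)
```

Regions with a higher signal-to-background ratio receive a larger weight, so the combined distribution is driven by the most sensitive regions.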

Fig. 5

One-dimensional likelihood scan of the \(\textit{CP}\)-mixing angle \(\phi _{\tau }\). The observed (expected) value of \(\phi _{\tau }\) is \(9^{\circ }\pm 16^{\circ }\) (\(0^{\circ }\pm 28^{\circ }\)) at the 68% confidence level (CL), and \(\pm 34^{\circ }\) (\(_{-70^{\circ }}^{+75^{\circ }}\)) at the 2\(\sigma \) level. The \(\textit{CP}\)-odd hypothesis is rejected at the 3.4\(\sigma \) (2.1\(\sigma \) expected) level

Table 6 Free-floating parameters in the measurement. Observed and expected values are shown for the \(\textit{CP}\)-mixing angle (\(\phi _{\tau }\)), the signal strength (\(\mu _{\tau \tau }\)) and various background normalisations for \(Z \rightarrow \tau \tau \)  (NF\(_{Z \rightarrow \tau \tau \,\,}\)) corresponding to different signal phase-space regions
Table 7 Impact of different sources of uncertainty on the \(\phi _{\tau }\) measurement
Fig. 6

A 2D likelihood scan of the observed signal strength \(\mu _{\tau \tau }\) versus the \(\textit{CP}\)-mixing angle \(\phi _{\tau }\). The 1\(\sigma \) and 2\(\sigma \) confidence regions are shown

Fig. 7

Combined post-fit distribution of \(\varphi ^{*}_{ CP }\)  from all signal regions in both the \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)  and \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  channels. Events are weighted with \(\ln (1+S/B)\) for the corresponding signal region. The background is subtracted from data. The best-fit \(H \rightarrow \tau \tau \)  signal is shown in solid pink, while the red and green lines indicate the predictions for the pure \(\textit{CP}\)-even (scalar, SM) and pure \(\textit{CP}\)-odd (pseudoscalar) hypotheses, respectively, all scaled to the best-fit \(H \rightarrow \tau \tau \)  signal yield. The hatched uncertainty band includes all sources of uncertainty after the fit to data, and represents the same uncertainty in the total signal and background predictions as in Fig. 4

9 Conclusion

A measurement of the \(\textit{CP}\) properties of the interaction between the Higgs boson and \(\tau \)-leptons is presented. In the generalised Yukawa interaction, a single mixing parameter \(\phi _{\tau }\) parameterises \(\textit{CP}\)-violating interactions between the Higgs boson and \(\tau \)-leptons. The measurement is performed using 139 fb\(^{-1}\) of proton–proton collision data recorded at a centre-of-mass energy of \(\sqrt{s} = 13\) TeV with the ATLAS detector at the Large Hadron Collider. The measurement is based on a maximum-likelihood fit to the \(\textit{CP}\)-sensitive angular observable \(\varphi ^{*}_{ CP }\)  defined by the visible decay products in the \(\tau _{\text {lep}}\)  \(\tau _{\text {had}}\)  and \(\tau _{\text {had}}\)  \(\tau _{\text {had}}\)  decay channels. Depending on the decay channels, different methods are used to reconstruct the \(\varphi ^{*}_{ CP }\)  distributions. The sensitivity is optimised by applying decay-mode-dependent kinematic selections. The observed (expected) value of \(\phi _{\tau }\) is \(9^{\circ } \pm 16^{\circ }\) (\(0^{\circ } \pm 28^{\circ }\)) at the 68% confidence level, and \(\pm 34^{\circ }\) (\(_{-70^{\circ }}^{+75^{\circ }}\)) at the 2\(\sigma \) level. The pure \(\textit{CP}\)-odd hypothesis is disfavoured at a level of 3.4 standard deviations. The analysis precision is limited by the statistical uncertainty of the data sample. The measurement is consistent with the Standard Model expectation.