BasisGen: automatic generation of operator bases
Abstract
BasisGen is a Python package for the automatic generation of bases of operators in effective field theories. It accepts any semisimple symmetry group and fields in any of its finite-dimensional irreducible representations. It takes into account integration by parts redundancy and, optionally, the use of equations of motion. The implementation is based on well-known methods to generate and decompose representations using roots and weights, which allow for fast calculations, even with large numbers of fields and high-dimensional operators. BasisGen can also be used to do some representation-theoretic operations, such as finding the weight system of an irreducible representation from its highest weight or decomposing a tensor product of representations.
1 Introduction
Effective field theory is a widely used framework for parameterizing the physics of systems whose degrees of freedom and symmetries are known. An effective Lagrangian is a linear combination of all local operators that can be constructed with the fields in the theory, with the restriction that they are invariant under the action of the symmetry group. Usually, there is some other constraint that reduces the number of possibilities to a finite one, such as imposing a maximum canonical dimension. In this context, it is often convenient to obtain a complete set of independent operators, which is called a basis. BasisGen automates this task.
The input data needed for this calculation are the symmetry group G of the theory and the representation of G corresponding to each field. Once they are specified, one can obtain, for every monomial in the fields, the number of independent ways of forming an invariant under the action of G out of it. It must also be taken into account that total derivative terms can be added to the Lagrangian without changing the physics (except for effects of surface terms in the action). This means that some operators with derivatives can be rewritten in terms of others. Moreover, at each order in the effective Lagrangian, the addition of an operator proportional to the equations of motion does not change the S matrix up to higher order effects [1, 2, 3, 4, 5]. It follows that the equations of motion can be used, for example, to obtain a basis in which all the operators proportional to the functional derivative of the kinetic term have been removed [6, 7, 8, 9, 10]. For the Standard Model Effective Field Theory (SMEFT) (see Ref. [11] for a review), several bases and (incomplete) sets of independent operators have been computed taking all these facts into account [12, 13, 14, 15]. Computer tools can be used to translate from one basis to another [16, 17, 18, 19].
In the last few years, many developments have been made in the automatization of the generation of operator bases. Hilbert series methods provide an elegant way to compute invariants [20, 21, 22, 23, 24]. They can be directly implemented in a computer system with symbolic capabilities, as done for the SMEFT case in the auxiliary Mathematica notebook of Ref. [23]. One possible drawback of this approach, when used in computer code, is its performance, as an overhead due to the symbolic nature of the calculations might be introduced. The program DEFT [19], written in Python, uses a different approach to check and generate bases of operators for the SMEFT. The operators are not only counted but they are given explicitly, including their index contraction structure and the fields to which the derivatives are applied (see Ref. [25] for a non-automatic calculation of the explicit operators in a basis). Additionally, it can perform changes of bases. The method it implements can be generalized to theories with a symmetry group given by a product of unitary groups.
BasisGen uses yet another approach, which is valid for any semisimple symmetry group and avoids the need for symbolic calculations. The algorithms that it uses to deal with representations of semisimple Lie algebras are the classical ones, based on weight vectors. They are reviewed, for example, in Ref. [26], and implemented in several computer packages with different purposes [27, 28, 29, 30, 31, 32]. To remove integration by parts redundancy, an adaptation of the method in Ref. [24] is used. BasisGen is \(\sim 150\) times faster than the implementation in the auxiliary notebook of Ref. [23]. For example, BasisGen takes 3 s to compute the 84 dimension-6 operators of the 1-generation SMEFT (on a laptop with a 2.6 GHz Intel Core i5 processor), while the notebook of Ref. [23] takes 7 min. DEFT also takes minutes for the calculation of a dimension-6 basis of the 1-generation SMEFT (according to Ref. [19]), although it must be taken into account that it does more work, as the concrete operators are given instead of just being counted.
For computations with effective field theories, BasisGen assumes 4-dimensional Lorentz invariance. In addition, an internal symmetry group must be specified. This is, in general, the product of the global symmetry group and the gauge group. Derivatives are assumed to be gauge-covariant derivatives, so that the derivative of any field has the same representation under the internal symmetry group as the field itself. The gauge field strengths to be included in a calculation should be provided by the user. The fields must belong to linear irreducible representations of both the Lorentz group and the internal symmetry group. Finally, it is required that a power counting based on canonical dimensions can be used.
In this context, BasisGen generates bases of invariant operators. It gives the number of independent invariants that can be formed with each possible field content for an operator. Sets of all covariant operators, with their corresponding irreducible representations (irreps), can also be computed. The basic representation-theoretic functionalities needed for these calculations are: obtaining weight systems of irreps and decomposing their tensor products. An interface for their direct use is provided.
Although BasisGen does not provide the explicit index contraction structure of the operators in the basis, the functionality of decomposing tensor products can be used to help in their construction. For a particular field content, one can take the tensor product of the first two fields. Then, for each irrep in the decomposition, take the tensor product with the next field. This process can be iterated, keeping track of the intermediate irreps. In the end, one can obtain all the possible ways of doing the products of the fields that give an invariant. Nevertheless, some extra information (the corresponding Clebsch–Gordan coefficients) is needed to completely determine the operator.
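The iteration described above can be illustrated with a minimal sketch for SU(2), where the product of the irreps with Dynkin labels a and b contains the labels |a - b|, |a - b| + 2, ..., a + b. The function names `su2_product` and `invariant_paths` are hypothetical and are not part of BasisGen; the sketch only keeps track of the intermediate irreps.

```python
def su2_product(a, b):
    """Dynkin labels of the irreps in the product of the SU(2) irreps a and b."""
    return list(range(abs(a - b), a + b + 1, 2))


def invariant_paths(fields):
    """List the chains of intermediate irreps that end in a singlet.

    `fields` holds the Dynkin label (twice the spin) of each field. Each
    returned chain starts with the irrep of the first field and records
    the irrep chosen after multiplying by each subsequent field.
    """
    paths = [[fields[0]]]
    for label in fields[1:]:
        paths = [
            path + [irrep]
            for path in paths
            for irrep in su2_product(path[-1], label)
        ]
    return [path for path in paths if path[-1] == 0]


# Four doublets: each printed chain is one way of reaching a singlet.
for path in invariant_paths([1, 1, 1, 1]):
    print(path)
```

For four doublets this gives two chains, in agreement with the two independent SU(2) invariants that can be built from them.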
BasisGen can be installed using pip by doing: pip install basisgen. It requires Python version 3.5 or higher. Its code can be downloaded from the GitHub repository https://github.com/jccriado/basisgen, where some examples of usage can be found. A simple script using BasisGen is presented in Listing 1. It defines an effective theory with internal symmetry group \(SU(2) \times U(1)\) for a complex scalar SU(2)-doublet field with charge 1/2. It computes a basis of operators of dimension 8 or less. The output is presented in Listing 2. Each line gives the number of independent invariant operators that can be constructed with each field content.
2 Implementation
2.1 Basic operations with representations
In this section, the methods implemented in BasisGen to deal with representations of semisimple Lie algebras are presented. An irreducible representation of a semisimple algebra is just a tensor product of irreducible representations of the algebra’s simple ideals. Using this fact, BasisGen decomposes calculations with semisimple algebras into smaller ones with simple algebras. The basic operations with representations of simple algebras are: the generation of the weight system of an irrep from its highest weight and the decomposition of a reducible representation into a direct sum of irreps. They are both implemented using well-known methods (see Refs. [26, 27, 28, 29, 30, 31, 32]), which are summarized here, for completeness.
The weight system \(W\) of an irrep with highest weight \(\Lambda\) is generated as follows:
 1. Set \(W = \{\}\) and \(W_{\text{new}} = \{\Lambda\}\).
 2. Choose some \(\lambda \in W_{\text{new}}\).
 3. For each positive component \(\lambda_i > 0\), select the i-th row \(\alpha\) of the Cartan matrix. Append to \(W_{\text{new}}\) all weights of the form \(\lambda - k \alpha\), with \(0 < k \le \lambda_i\).
 4. Remove \(\lambda\) from \(W_{\text{new}}\). Append it to \(W\).
 5. If \(W_{\text{new}}\) is empty, terminate. Otherwise, go to step 2.
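The steps above can be sketched in a few lines of Python. This is a minimal illustration and not BasisGen's own code: the function name `weight_system` is hypothetical, weights are tuples of Dynkin labels, and only the distinct weights are collected (multiplicities are not computed here).

```python
def weight_system(cartan_matrix, highest_weight):
    """Distinct weights (Dynkin labels) of the irrep with the given highest weight.

    Multiplicities are not computed in this sketch.
    """
    found = set()                        # W
    pending = {tuple(highest_weight)}    # W_new
    while pending:
        weight = pending.pop()           # steps 2 and 4
        found.add(weight)
        for i, component in enumerate(weight):
            if component <= 0:
                continue
            root = cartan_matrix[i]      # i-th row of the Cartan matrix
            for k in range(1, component + 1):
                lowered = tuple(w - k * r for w, r in zip(weight, root))
                if lowered not in found:
                    pending.add(lowered)
    return found


A2 = [[2, -1], [-1, 2]]  # Cartan matrix of SU(3)
octet = weight_system(A2, (1, 1))
print(sorted(octet, reverse=True))
```

For the octet of SU(3) this yields seven distinct weights. The zero weight has multiplicity 2 in that irrep, which has to be obtained separately.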
The algorithm for the decomposition of a reducible representation as a direct sum of irreps is straightforward: from the collection of weights of the representation in question, find the highest and remove from the collection all the weights in the corresponding irrep. Repeat until the collection is empty. Then, the successive highest weights that were found in the process are the highest weights of the irreps in the decomposition. A direct application of this functionality is to decompose the tensor product of irreps. Let \(W_1\) and \(W_2\) be the weight systems of two representations \(R_1\) and \(R_2\). The weight system W of \(R_1 \otimes R_2\) is the collection of all \(\lambda _1 + \lambda _2\) for \((\lambda _1, \lambda _2) \in W_1 \times W_2\). Once W is constructed, it can be decomposed using the general decomposition algorithm.
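As an illustration of the decomposition algorithm, the following sketch treats SU(2), chosen because every weight of an SU(2) irrep has multiplicity one, so no multiplicity formula is needed. The names `su2_weights`, `tensor_product_weights` and `decompose` are hypothetical; collections of weights are stored as `Counter` objects mapping each weight to its multiplicity.

```python
from collections import Counter


def su2_weights(n):
    """Weight system of the SU(2) irrep with highest weight (Dynkin label) n."""
    return Counter(range(-n, n + 1, 2))


def tensor_product_weights(w1, w2):
    """All sums of one weight from each system, with multiplicities."""
    product = Counter()
    for a, mult_a in w1.items():
        for b, mult_b in w2.items():
            product[a + b] += mult_a * mult_b
    return product


def decompose(weights):
    """Highest weights of the irreps in a representation, found one at a time."""
    remaining = Counter(weights)
    highest_weights = []
    while remaining:
        top = max(remaining)
        highest_weights.append(top)
        remaining.subtract(su2_weights(top))
        remaining = +remaining  # discard weights whose count dropped to zero
    return highest_weights


doublet, triplet = su2_weights(1), su2_weights(2)
print(decompose(tensor_product_weights(doublet, triplet)))
```

The product of a doublet and a triplet decomposes into the irreps with Dynkin labels 3 and 1, that is, a quadruplet plus a doublet.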
In some cases, the symmetric or antisymmetric tensor power of some representation is needed. If \(W = {\{\lambda_i\}}_{i \in \{1, \ldots, n\}}\) is the weight system of some representation R, the weight system of the symmetric tensor power \({\text{Sym}}^k(R)\) is the collection of weights computed as \(\lambda_{i_1} + \cdots + \lambda_{i_k}\) for every k-tuple \((\lambda_{i_1}, \ldots, \lambda_{i_k})\) where \(i_1 \le \cdots \le i_k\). The weight system of the antisymmetric power \({\Lambda}^k(R)\) is constructed in a similar way, but using all k-tuples \((\lambda_{i_1}, \ldots, \lambda_{i_k})\) with \(i_1< \cdots < i_k\) instead.
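The two index conditions correspond to combinations with and without repetition, which the Python standard library provides directly. The sketch below is only an illustration with hypothetical function names; the input list must contain every weight as many times as its multiplicity.

```python
from collections import Counter
from itertools import combinations, combinations_with_replacement


def add_weights(weights):
    """Component-wise sum of a tuple of weights."""
    return tuple(map(sum, zip(*weights)))


def symmetric_power(weights, k):
    """Weight system of Sym^k(R), using index tuples with i_1 <= ... <= i_k."""
    return Counter(map(add_weights, combinations_with_replacement(weights, k)))


def antisymmetric_power(weights, k):
    """Weight system of the k-th antisymmetric power, using i_1 < ... < i_k."""
    return Counter(map(add_weights, combinations(weights, k)))


triplet = [(1, 0), (-1, 1), (0, -1)]  # SU(3) fundamental, Dynkin labels
print(symmetric_power(triplet, 2))
print(antisymmetric_power(triplet, 2))
```

For the SU(3) triplet, the symmetric square has six weights (the sextet) and the antisymmetric square has the three weights of the antitriplet.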
2.2 Constructing invariants in effective theories

The input data for this construction are:
- The semisimple Lie algebra \(\mathfrak{g}\) of G.
- A collection of fields \(\phi_1, \ldots, \phi_m\). Each \(\phi_i\) must be equipped with:
  - An irrep \(R^{(i)}_{\text{Lorentz}}\) of the Lorentz algebra \(\mathfrak{su}_2 \oplus \mathfrak{su}_2\).
  - An irrep \(R^{(i)}_{\text{internal}}\) of \(\mathfrak{g}\).
  - A tuple \(\left(c^{(i)}_1, \ldots, c^{(i)}_n\right)\) of charges under the U(1) factors.
  - The statistics \(S_i\): either boson or fermion.
  - A positive real number \(d_i\), specifying the canonical dimension of the field.

To take into account (covariant) derivatives, the same procedure is used, but now including the fields \(D_\mu \phi _i\), \(\{D_\mu , D_\nu \} \phi _i\), etc. Antisymmetric combinations of derivatives are automatically discarded, as they are equivalent to field strength tensors. Optionally, the equations of motion of the fields can be applied. This means that, for each \(D_{\mu _1}\cdots D_{\mu _m}\phi _i\), only the totally symmetric representation is retained (see Ref. [22]).
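The effect of the equations of motion on the Lorentz representations can be made concrete with a small sketch. It assumes that irreps of \(\mathfrak{su}_2 \oplus \mathfrak{su}_2\) are labelled by pairs of Dynkin labels (a, b), and that "the totally symmetric representation" of k derivatives acting on a field in the irrep (a, b) is the one with labels (a + k, b + k). The function names are hypothetical and the code is not taken from BasisGen.

```python
def su2_product(a, b):
    """Dynkin labels of the irreps in the product of the SU(2) irreps a and b."""
    return range(abs(a - b), a + b + 1, 2)


def derivative_irreps(field_irrep, k, use_eom=True):
    """Lorentz irreps of k symmetrized derivatives acting on a field."""
    a, b = field_irrep
    if use_eom:
        # Assumption: only the totally symmetric irrep survives.
        return [(a + k, b + k)]
    irreps = []
    for j in range(k, -1, -2):  # traceless parts (k, k), (k-2, k-2), ...
        irreps += [(c, d) for c in su2_product(a, j) for d in su2_product(b, j)]
    return irreps


def dimension(irrep):
    """Number of components of the irrep with Dynkin labels (a, b)."""
    return (irrep[0] + 1) * (irrep[1] + 1)


scalar = (0, 0)
print(derivative_irreps(scalar, 2, use_eom=False))
print(derivative_irreps(scalar, 2))
```

For a scalar with two symmetrized derivatives, the sketch keeps the 9-component traceless part and drops the trace \(D^2 \phi\), which is the component that the equation of motion relates to other operators.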
Integration by parts redundancy is removed with the following algorithm, where \(I_n\) denotes the collection of operators of dimension n:
 1. Set \(R = \{\}\).
 2. Take one operator \(\mathcal{O}\) from the non-empty \(I_n\) with lowest n.
 3. Remove \(\mathcal{O}\) from \(I_n\) and append it to R.
 4. Compute the decomposition into irreps of \(D_\mu \mathcal{O}\) and eliminate the corresponding operators from \(I_{n+1}\). Compute the decomposition of \(\{D_\mu, D_\nu\} \mathcal{O}\) and remove it from \(I_{n+2}\). Continue until the maximum dimension is reached.
 5. If all \(I_k\) are empty, terminate. Otherwise, go to step 2.
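A toy version of these steps is sketched below under several simplifying assumptions: each \(I_n\) is taken to be a multiset of Lorentz irreps labelled by Dynkin labels (a, b) of \(\mathfrak{su}_2 \oplus \mathfrak{su}_2\), dimensions are integers, and the internal symmetry is ignored. The function names and the input data are hypothetical and serve only to show the bookkeeping.

```python
from collections import Counter


def su2_product(a, b):
    """Dynkin labels of the irreps in the product of the SU(2) irreps a and b."""
    return range(abs(a - b), a + b + 1, 2)


def derivative_decomposition(irrep, k):
    """Irreps of k symmetrized derivatives acting on an operator in `irrep`."""
    a, b = irrep
    return Counter(
        (c, d)
        for j in range(k, -1, -2)  # traceless parts (k, k), (k-2, k-2), ...
        for c in su2_product(a, j)
        for d in su2_product(b, j)
    )


def remove_total_derivatives(operators, max_dimension):
    """Apply steps 1-5 to `operators`, a dict {dimension: Counter of irreps}."""
    pending = {n: Counter(irreps) for n, irreps in operators.items()}
    retained = {}                                       # step 1
    for n in sorted(pending):                           # lowest dimension first
        while pending[n]:
            irrep = next(iter(pending[n]))              # step 2
            pending[n] -= Counter([irrep])              # step 3
            retained.setdefault(n, Counter())[irrep] += 1
            for k in range(1, max_dimension - n + 1):   # step 4
                if n + k in pending:
                    pending[n + k] -= derivative_decomposition(irrep, k)
    return retained


# Made-up input: one vector operator of dimension 3 and, at dimension 4,
# three scalars plus the other irreps contained in the derivative of the vector.
toy = {
    3: Counter({(1, 1): 1}),
    4: Counter({(0, 0): 3, (0, 2): 1, (2, 0): 1, (2, 2): 1}),
}
print(remove_total_derivatives(toy, max_dimension=4))
```

In this toy input, the derivative of the dimension-3 vector removes one of the three dimension-4 scalars, leaving two independent invariants at that dimension.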
3 Interface
3.1 Basic objects
3.1.1 Functions

- algebra: Creates a (semi)simple Lie algebra from one string argument. The returned object is of the class SimpleAlgebra or SemisimpleAlgebra from the module algebra. Examples of arguments: 'A3', 'C12', 'F4', 'SU3', 'B2+E7', 'SU5 x SO6 x Sp10'.

- irrep: Creates an irreducible representation from two string arguments: the first represents the algebra and the second the highest weight. The returned object is of the class representations.Irrep. Example: irrep('SU4 x Sp7', '1 0 1 0 2 1').
The weight system of a representations.Irrep object can be obtained by calling its weights_view method. Irreps with the same algebra can be multiplied to get the decomposition of their tensor product. Any two irreps can be added to give an irrep of the direct sum of their algebras.
Examples, showing the weights of the octet irrep of SU(3) (which has highest weight (11)) and the decomposition of the product of a triplet (10) and an antitriplet (01) as an octet plus a singlet:
3.1.2 Classes
Table 1: Arguments of the Field constructor

| Name | Description | Default |
|---|---|---|
| name | String identifier | |
| lorentz_irrep | Lorentz group irrep | |
| internal_irrep | Irrep of the internal (semisimple) symmetry group | |
| charges | Charges under an arbitrary number of U(1) factors | [] |
| statistics | Either boson or fermion | boson |
| dimension | Canonical dimension of the field | 1 |
| number_of_flavors | Number of different copies of the same field | 1 |

- Field: Has an attribute conjugate, the conjugate field. The constructor arguments are presented in Table 1.

- EFT: Constructor arguments:
  - internal_algebra: The semisimple Lie algebra of the internal symmetry group.
  - fields: A list of Field objects representing the field content of the theory.

  Methods:
  - invariants: Returns a basis of operators, encapsulated in an EFT.Invariants object. These objects can be directly printed (they implement __str__). They have a method count to calculate the total number of operators in the basis, and a method show_by_classes, which returns a simplified string representation of the basis, provided a dictionary whose keys are the fields and whose values are strings representing classes of fields.
  - covariants: Returns a collection of all operators with all possible irreps, in the form of an EFT.Covariants instance. Its only purpose is to hold the information until it is printed (it implements __str__).
3.1.3 Other
The following irreps of the Lorentz group have been defined, for ease of use: scalar, L_spinor, R_spinor, vector, L_tensor, R_tensor. L_spinor and R_spinor correspond to left and right Weyl spinors, respectively. L_tensor and R_tensor correspond to the left and right parts of an antisymmetric tensor with two indices.
The statistics of a field can be specified by using the variables boson and fermion, which are set to the values BOSON and FERMION of the enum class Statistics from the module statistics.
3.2 The smeft module

The smeft module defines the following fields:
- The Higgs doublet phi and its conjugate phic.
- The left and right parts GL and GR of the SU(3) field strength.
- The left and right parts WL and WR of the SU(2) field strength.
- The left and right parts BL and BR of the U(1) field strength.
- The quark doublet Q and its conjugate Qc.
- The lepton doublet L and its conjugate Lc.
- The up-type quark singlet u and its conjugate uc.
- The down-type quark singlet d and its conjugate dc.
- The electron singlet e and its conjugate ec.
4 Conclusions
BasisGen computes bases of operators for effective field theories in a general setting: the internal symmetry group can be any product of a semisimple group and an arbitrary number of U(1) factors. Four-dimensional Lorentz invariance is assumed to provide support for concrete applications, although adaptations to other spacetime dimensions can be easily made, due to the generality of the core functionalities.
The decision of using the equations of motion is left to the user, as it may be convenient to work with redundant bases in some cases (see Ref. [5]). It is also possible not only to compute invariants but to generate all covariant operators, classified by their irreps. This can be useful, for example, to find the representation of fields that couple linearly to an already known theory, which are often the most relevant ones for phenomenology [33, 34, 35, 36, 37]. An interface for doing basic operations with representations of semisimple groups is also provided.
BasisGen’s speed for large numbers of fields and high-dimensional operators makes it possible to calculate bases for the SMEFT or for other effective theories for physics beyond the Standard Model, in times ranging from seconds (for the dimension-8 operators in the SMEFT) to minutes (for higher-dimensional operators or larger numbers of fields) on personal computers.
Acknowledgements
The author would like to thank M. Pérez-Victoria for useful discussions and comments. This work has been supported by the Spanish MINECO project FPA2016-78220-C3-1-P (Fondos FEDER), the Junta de Andalucía Grant FQM-101 and the Spanish MECD Grant FPU14.
References
 1. H.D. Politzer, Nucl. Phys. B 172, 349 (1980). https://doi.org/10.1016/0550-3213(80)90172-8
 2. C. Grosse-Knetter, Phys. Rev. D 49, 6709 (1994). https://doi.org/10.1103/PhysRevD.49.6709
 3. C. Arzt, Phys. Lett. B 342, 189 (1995). https://doi.org/10.1016/0370-2693(94)01419-D
 4. J. Wudka, Int. J. Mod. Phys. A 9, 2301 (1994). https://doi.org/10.1142/S0217751X94000959
 5. J.C. Criado, M. Pérez-Victoria, JHEP 03, 038 (2019). https://doi.org/10.1007/JHEP03(2019)038
 6. H. Georgi, Nucl. Phys. B 361, 339 (1991). https://doi.org/10.1016/0550-3213(91)90244-R
 7. B. Grzadkowski, Z. Hioki, K. Ohkuma, J. Wudka, Nucl. Phys. B 689, 108 (2004). https://doi.org/10.1016/j.nuclphysb.2004.04.006
 8. P.J. Fox, Z. Ligeti, M. Papucci, G. Perez, M.D. Schwartz, Phys. Rev. D 78, 054008 (2008). https://doi.org/10.1103/PhysRevD.78.054008
 9. J.A. Aguilar-Saavedra, Nucl. Phys. B 812, 181 (2009). https://doi.org/10.1016/j.nuclphysb.2008.12.012
 10. J.A. Aguilar-Saavedra, Nucl. Phys. B 821, 215 (2009). https://doi.org/10.1016/j.nuclphysb.2009.06.022
 11. I. Brivio, M. Trott, Phys. Rept. 793, 1 (2019). https://doi.org/10.1016/j.physrep.2018.11.002
 12. K. Hagiwara, S. Ishihara, R. Szalapski, D. Zeppenfeld, Phys. Rev. D 48, 2182 (1993). https://doi.org/10.1103/PhysRevD.48.2182
 13. G.F. Giudice, C. Grojean, A. Pomarol, R. Rattazzi, JHEP 06, 045 (2007). https://doi.org/10.1088/1126-6708/2007/06/045
 14. B. Grzadkowski, M. Iskrzynski, M. Misiak, J. Rosiek, JHEP 10, 085 (2010). https://doi.org/10.1007/JHEP10(2010)085
 15. J. Elias-Miró, C. Grojean, R.S. Gupta, D. Marzocca, JHEP 05, 019 (2014). https://doi.org/10.1007/JHEP05(2014)019
 16. A. Falkowski, B. Fuks, K. Mawatari, K. Mimasu, F. Riva, V. Sanz, Eur. Phys. J. C 75(12), 583 (2015). https://doi.org/10.1140/epjc/s10052-015-3806-x
 17. J.C. Criado, Comput. Phys. Commun. 227, 42 (2018). https://doi.org/10.1016/j.cpc.2018.02.016
 18. J. Aebischer, J. Kumar, D.M. Straub, Eur. Phys. J. C 78(12), 1026 (2018). https://doi.org/10.1140/epjc/s10052-018-6492-7
 19. B. Gripaios, D. Sutherland, JHEP 01, 128 (2019). https://doi.org/10.1007/JHEP01(2019)128
 20. L. Lehman, A. Martin, Phys. Rev. D 91, 105014 (2015). https://doi.org/10.1103/PhysRevD.91.105014
 21. B. Henning, X. Lu, T. Melia, H. Murayama, Commun. Math. Phys. 347(2), 363 (2016). https://doi.org/10.1007/s00220-015-2518-2
 22. L. Lehman, A. Martin, JHEP 02, 081 (2016). https://doi.org/10.1007/JHEP02(2016)081
 23. B. Henning, X. Lu, T. Melia, H. Murayama, JHEP 08, 016 (2017). https://doi.org/10.1007/JHEP08(2017)016
 24. B. Henning, X. Lu, T. Melia, H. Murayama, JHEP 10, 199 (2017). https://doi.org/10.1007/JHEP10(2017)199
 25. C. Hays, A. Martin, V. Sanz, J. Setford, JHEP 02, 123 (2019). https://doi.org/10.1007/JHEP02(2019)123
 26. R. Slansky, Phys. Rep. 79, 1 (1981). https://doi.org/10.1016/0370-1573(81)90092-2
 27. M.A.A. van Leeuwen, A.M. Cohen, B. Lisser, Computer Algebra Nederland (Amsterdam, 1992) (ISBN 9074116027)
 28. A. Candiello, Comput. Phys. Commun. 81, 248 (1994). https://doi.org/10.1016/0010-4655(94)90123-6
 29. T. Fischbacher (2002). arXiv:hep-th/0208218
 30. C. Horst, J. Reuter, Comput. Phys. Commun. 182, 1543 (2011). https://doi.org/10.1016/j.cpc.2011.03.025
 31. A. Nazarov, Comput. Phys. Commun. 183, 2480 (2012). https://doi.org/10.1016/j.cpc.2012.06.014
 32. R. Feger, T.W. Kephart, Comput. Phys. Commun. 192, 166 (2015). https://doi.org/10.1016/j.cpc.2014.12.023
 33. F. del Aguila, M. Perez-Victoria, J. Santiago, JHEP 09, 011 (2000). https://doi.org/10.1088/1126-6708/2000/09/011
 34. F. del Aguila, J. de Blas, M. Perez-Victoria, Phys. Rev. D 78, 013010 (2008). https://doi.org/10.1103/PhysRevD.78.013010
 35. F. del Aguila, J. de Blas, M. Perez-Victoria, JHEP 09, 033 (2010). https://doi.org/10.1007/JHEP09(2010)033
 36. J. de Blas, M. Chala, M. Perez-Victoria, J. Santiago, JHEP 04, 078 (2015). https://doi.org/10.1007/JHEP04(2015)078
 37. J. de Blas, J.C. Criado, M. Perez-Victoria, J. Santiago, JHEP 03, 109 (2018). https://doi.org/10.1007/JHEP03(2018)109
Copyright information
Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Funded by SCOAP^{3}