First of all, we thank all of you for your most valuable comments regarding our manuscript!

Professor Jiménez-Gamero, in particular, comments on the energy test in connection with the class of BHEP tests and on testing normality for non-observable data originating from i.i.d. data. We found it very interesting to see that the energy test statistic of Székely and Rizzo belongs to the class of BHEP statistics. In the spirit of results of Baringhaus et al. (2017), this shows that the energy test statistic, after a suitable affine transformation, has a normal limit under fixed alternatives. This result answers a question that we posed in our review. We also thank Professor Jiménez-Gamero for pointing out results and open problems in connection with testing for normality of the distribution of errors in linear models or of innovations in MGARCH models. In these cases, tests are based on residuals, which are not i.i.d. We will use the input of Professor Jiménez-Gamero regarding these models to update the description of the accompanying R package mnt, see Betsch and Ebner (2020).
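For readers who wish to have the BHEP class at hand, we recall its usual form. The notation below is ours and may differ slightly from that of the main paper: \(Y_{n,j} = S_n^{-1/2}(X_j - \overline{X}_n)\) denote the scaled residuals, \(\psi_n\) is their empirical characteristic function, and \(\beta > 0\) is the tuning parameter. The second line is the standard integral-free expression that follows from integrating against the Gaussian weight.

\[
\begin{aligned}
T_{n,\beta} &= n \int_{\mathbb{R}^d} \Bigl| \psi_n(t) - \exp\bigl(-\tfrac{1}{2}\|t\|^2\bigr) \Bigr|^2 \varphi_\beta(t)\, \mathrm{d}t, \qquad \varphi_\beta(t) = (2\pi\beta^2)^{-d/2} \exp\Bigl(-\frac{\|t\|^2}{2\beta^2}\Bigr), \\
&= \frac{1}{n} \sum_{j,k=1}^{n} \exp\Bigl(-\frac{\beta^2}{2} \|Y_{n,j} - Y_{n,k}\|^2\Bigr) - 2(1+\beta^2)^{-d/2} \sum_{j=1}^{n} \exp\Bigl(-\frac{\beta^2 \|Y_{n,j}\|^2}{2(1+\beta^2)}\Bigr) + n(1+2\beta^2)^{-d/2}.
\end{aligned}
\]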

Professor Meintanis draws attention to the issue of computation time, which is becoming more and more important in view of the near ubiquity of high-dimensional data. He points out that, in order to obtain a representation of a weighted \(L^2\)-statistic that is free of integrals when testing goodness of fit to an \(\alpha \)-stable distribution, the weight function should be a suitable spherical Kotz-type density. The underlying concept is that of adjoint distributions, where the density of one distribution is a multiple of the characteristic function of the other distribution.
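A classical univariate pair may illustrate the notion of adjoint distributions; the example is our own choice and is not taken from Professor Meintanis's comment. Writing \(f_{\mathrm{L}}, \varphi_{\mathrm{L}}\) for the density and characteristic function of the standard Laplace distribution and \(f_{\mathrm{C}}, \varphi_{\mathrm{C}}\) for those of the standard Cauchy distribution, we have

\[
f_{\mathrm{L}}(x) = \tfrac{1}{2} e^{-|x|}, \quad \varphi_{\mathrm{L}}(t) = \frac{1}{1+t^2} = \pi\, f_{\mathrm{C}}(t), \qquad f_{\mathrm{C}}(x) = \frac{1}{\pi (1+x^2)}, \quad \varphi_{\mathrm{C}}(t) = e^{-|t|} = 2\, f_{\mathrm{L}}(t).
\]

The standard normal distribution is adjoint to itself in this sense, since its characteristic function \(e^{-t^2/2}\) equals \(\sqrt{2\pi }\) times its density, which is why a Gaussian weight function leads to closed-form expressions when testing for normality.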

Professor Richards raises a variety of intriguing research questions. These refer to tests for normality in the case of incomplete data (notably an extension of the results of Yamada et al. (2015) to obtain kurtosis tests), to the option of having \(L^p\)-tests (where \(p\) could even be estimated from the data), and to ascertaining the value of recent kernel Stein discrepancy methods for \(L^2\)-tests for normality. Regarding the latter point, it is noteworthy that the recent test of Ebner et al. (2020) is motivated by a multivariate Stein equation and, as mentioned in that paper, such a motivation also holds for the test of multivariate normality in Henze and Visagie (2020). Similar ideas could clearly be used for a wide variety of parametric families of univariate and multivariate distributions, since Stein’s method of approximation has been extended to many other families of distributions. For characterizations of distributions related to this subject, see Betsch and Ebner (2019) and the references therein.

Moreover, Professor Richards asks whether the integral representation of the BHEP statistic given at the beginning of Section 3 can be extended to non-normal distributions. Up to now, we do not have an answer to this interesting question.

As for the comment regarding a simple explanation of why the functional \(T_{\mathrm{CS}}\) vanishes on the set \(\mathcal {N}_d\), we consult Ebner (2012) to see that the test of Cox and Small was motivated by the wish to assess the occurrence of nonlinearity of dependence in the underlying population. Consequently, the test statistic uses the regression coefficient of the quadratic component in a suitably chosen regression model on linear combinations of the components of the random vectors. Since the normal distribution is uniquely determined by the mean vector and the covariance matrix, it merely displays linear relationships between the components of the normally distributed random vector, and hence, the functional vanishes. One can easily see that the latter property also holds for certain non-normal distributions satisfying a weak moment condition, such as elliptically symmetric distributions.

The question of obtaining results on the behavior of the energy test with respect to contiguous alternatives seems to be solved in view of the observation of Professor Jiménez-Gamero that the energy statistic is a special case of the class of BHEP tests. Finally, there is the fundamentally important problem of obtaining explicit expressions for the eigenvalues and the corresponding eigenfunctions associated with the Fredholm integral equations that pertain to weighted \(L^2\)-statistics. In this respect, Professor Richards provides a valuable list of some cases in which such explicit expressions have been obtained. The case of the BHEP statistic, however, is still open.

Professors Rizzo and Székely elaborate on a promising generalization of the energy test, which is based on a so-called \(\alpha \)-energy distance, where \(0< \alpha < 2\). At the end of their contribution, the authors raise the natural question of whether there is an ‘optimal’ way to choose the parameter \(\alpha \) in order to maximize the power of the tests of normality. A first step in this direction would be an extensive simulation study, for which the accompanying R package mnt, see Betsch and Ebner (2020), may be of benefit. On a more theoretical level, the methods proposed by Tenreiro (2019) are noteworthy. Although they are formulated in the univariate setting, they can be straightforwardly extended to the multivariate case.
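As a starting point for such experiments, the following minimal Python sketch computes the empirical (V-statistic) version of the \(\alpha \)-energy distance \(2E\|X-Y\|^\alpha - E\|X-X'\|^\alpha - E\|Y-Y'\|^\alpha \) between two samples. It is an illustration under our own assumptions only: the function names are hypothetical and not part of the package mnt, the code uses only the Python standard library, and it treats the two-sample case. A test of normality would replace the expectations involving the second sample by their values under the standard normal law, applied to scaled residuals.

```python
import math
import random


def mean_pairwise_power(xs, ys, alpha):
    """Average of ||x - y||^alpha over all pairs (x, y) with x in xs, y in ys."""
    total = 0.0
    for x in xs:
        for y in ys:
            total += math.dist(x, y) ** alpha
    return total / (len(xs) * len(ys))


def alpha_energy_distance(xs, ys, alpha=1.0):
    """Empirical alpha-energy distance between two samples of d-dimensional points.

    xs and ys are sequences of equal-length coordinate tuples.
    The exponent alpha must satisfy 0 < alpha < 2.
    """
    if not 0.0 < alpha < 2.0:
        raise ValueError("alpha must lie strictly between 0 and 2")
    return (
        2.0 * mean_pairwise_power(xs, ys, alpha)
        - mean_pairwise_power(xs, xs, alpha)
        - mean_pairwise_power(ys, ys, alpha)
    )


if __name__ == "__main__":
    # Illustrative comparison of a bivariate normal sample with a shifted one
    # for a few values of alpha (hypothetical settings, chosen for the demo).
    rng = random.Random(1)
    normal = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(50)]
    shifted = [(rng.gauss(1, 1), rng.gauss(1, 1)) for _ in range(50)]
    for alpha in (0.5, 1.0, 1.5):
        print(alpha, alpha_energy_distance(normal, shifted, alpha))
```

The double loop costs \(O(nm)\) evaluations per term, which is adequate for a small simulation study; the choice \(\alpha = 1\) gives the usual energy distance.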