Genetic-algorithm-optimized neural networks for gravitational wave classification

Deighan, Dwyer S.; Field, Scott E.; Capano, Collin D.; Khanna, Gaurav

doi:10.1007/s00521-021-06024-4

Original Article
Published: 24 April 2021

Volume 33, pages 13859–13883, (2021)

Neural Computing and Applications

Dwyer S. Deighan¹,
Scott E. Field ORCID: orcid.org/0000-0002-6037-3277²,
Collin D. Capano³ &
…
Gaurav Khanna^4,5

Abstract

Gravitational-wave detection strategies are based on a signal analysis technique known as matched filtering. Despite the success of matched filtering, due to its computational cost, there has been recent interest in developing deep convolutional neural networks (CNNs) for signal detection. Designing these networks remains a challenge as most procedures adopt a trial and error strategy to set the hyperparameter values. We propose a new method for hyperparameter optimization based on genetic algorithms (GAs). We compare six different GA variants and explore different choices for the GA-optimized fitness score. We show that the GA can discover high-quality architectures when the initial hyperparameter seed values are far from a good solution as well as refining already good networks. For example, when starting from the architecture proposed by George and Huerta, the network optimized over the 20-dimensional hyperparameter space has 78% fewer trainable parameters while obtaining an 11% increase in accuracy for our test problem. Using genetic algorithm optimization to refine an existing network should be especially useful if the problem context (e.g., statistical properties of the noise, signal model, etc) changes and one needs to rebuild a network. In all of our experiments, we find the GA discovers significantly less complicated networks as compared to the seed network, suggesting it can be used to prune wasteful network structures. While we have restricted our attention to CNN classifiers, our GA hyperparameter optimization strategy can be applied within other machine learning settings.

Notes

For simplicity, we assume here that N is even. This can always be made to be the case, since the observation time and sampling rate are free parameters in an analysis.
We use the same convention for the Fourier transform as in Ref. [83].

References

Aasi J et al (2015) Advanced LIGO. Class Quantum Gravity 32:074001
Accadia T et al (2012) Virgo: a laser interferometer to detect gravitational waves. JINST 7:P03012
Abbott BP et al (2016) Observation of gravitational waves from a binary black hole merger. Phys Rev Lett 116(6):061102
Abbott BP, Abbott R, Abbott T, Abernathy M, Acernese F, Ackley K, Adams C, Adams T, Addesso P, Adhikari R et al (2016) Binary black hole mergers in the first advanced LIGO observing run. Phys Rev X 6(4):041015
Abbott BP et al (2018) GW170104: observation of a 50-solar-mass binary black hole coalescence at redshift 0.2. Phys Rev Lett 118(22), 221101 (Erratum: Phys Rev Lett 121(12), 129901)
Abbott BP et al (2017) GW170814: a three-detector observation of gravitational waves from a binary black hole coalescence. Phys Rev Lett 119(14):141101
Abbott BP et al (2017) GW170608: observation of a 19-solar-mass binary black hole coalescence. Astrophys J 851(2):L35
Abbott BP et al (2017) GW170817: observation of gravitational waves from a binary neutron star inspiral. Phys Rev Lett 119(16):161101
Abbott BP et al (2019) GWTC-1: a gravitational-wave transient catalog of compact binary mergers observed by LIGO and Virgo during the first and second observing runs. Phys Rev X 9(3):031040
Abbott BP et al (2016) GW151226: observation of gravitational waves from a 22-solar-mass binary black hole coalescence. Phys Rev Lett 116(24):241103
Abbott B, Abbott R, Abbott T, Abraham S, Acernese F, Ackley K, Adams C, Adhikari R, Adya V, Affeldt C et al (2019) GWTC-1: a gravitational-wave transient catalog of compact binary mergers observed by LIGO and Virgo during the first and second observing runs. Phys Rev X 9(3):031040
Abbott BP et al (2018) Prospects for observing and localizing gravitational-wave transients with advanced LIGO, advanced Virgo and KAGRA. Living Rev Relat 21(1):3
Abbott B, Abbott R, Abbott T, Abraham S, Acernese F, Ackley K, Adams C, Adhikari RX, Adya V, Affeldt C et al (2019) Binary black hole population properties inferred from the first and second observing runs of advanced LIGO and advanced Virgo. Astrophys J Lett 882(2):L24
LIGO/Virgo public alerts. https://gracedb.ligo.org/superevents/public/O3/
Jaranowski P, Krolak A (2012) Gravitational-wave data analysis. Formalism and sample applications: the Gaussian case. Living Rev Relat 15:4
Turin G (1960) An introduction to matched filters. IRE Trans Inf Theory 6(3):311–329
Harry I, Privitera S, Bohé A, Buonanno A (2016) Searching for gravitational waves from compact binaries with precessing spins. Phys Rev D 94(2):024012
Messick C et al (2017) Analysis framework for the prompt discovery of compact binary mergers in gravitational-wave data. Phys Rev D 95:042001
Chu Q (2017) Low-latency detection and localization of gravitational waves from compact binary coalescences. PhD thesis, University of Western Australia
Klimenko S et al (2016) Method for detection and reconstruction of gravitational wave transients with networks of advanced detectors. Phys Rev D 93:042004
Adams T et al (2016) Low-latency analysis pipeline for compact binary coalescences in the advanced gravitational wave detector era. Class Quantum Gravity 33:175012
Nitz A et al (2018) Rapid detection of gravitational waves from compact binary mergers with PyCBC Live. Phys Rev D 98:024050
George D, Huerta EA (2018) Deep neural networks to enable real-time multimessenger astrophysics. Phys Rev D 97:044039
Shen H, Huerta E, and Zhao Z (2019) Deep learning at scale for gravitational wave parameter estimation of binary black hole mergers. arXiv preprint arXiv:1903.01998
Hezaveh YD, Levasseur LP, Marshall PJ (2017) Fast automated analysis of strong gravitational lenses with convolutional neural networks. Nature 548(7669):555
Levasseur LP, Hezaveh YD, Wechsler RH (2017) Uncertainties in parameters estimated with neural networks: application to strong gravitational lensing. Astrophys J Lett 850(1):L7
Ciuca R, Hernández OF, Wolman M (2019) A convolutional neural network for cosmic string detection in CMB temperature maps. Mon Not R Astron Soc 485(1):1377–1383
Gabbard H, Williams M, Hayes F, Messenger C (2018) Matching matched filtering with deep networks for gravitational-wave astronomy. Phys Rev Lett 120(14):141103
Shen H, George D, Huerta E, Zhao Z (2017) Denoising gravitational waves using deep learning with recurrent denoising autoencoders. arXiv preprint arXiv:1711.09919
George D, Shen H, Huerta E (2017) Glitch classification and clustering for LIGO with deep transfer learning. arXiv preprint arXiv:1711.07468
George D, Huerta E (2018) Deep learning for real-time gravitational wave detection and parameter estimation: results with advanced LIGO data. Phys Lett B 778:64–70
Fort S (2017) Towards understanding feedback from supermassive black holes using convolutional neural networks. arXiv preprint arXiv:1712.00523
Gebhard TD, Kilbertus N, Harry I, Schölkopf B (2019) Convolutional neural networks: A magic bullet for gravitational-wave detection? Physical Review D 100(6)
Shen H, George D, Huerta E, Zhao Z (2017) Denoising gravitational waves using deep learning with recurrent denoising autoencoders. arXiv preprint arXiv:1711.09919
George D, Shen H, Huerta E (2018) Classification and unsupervised clustering of LIGO data with deep transfer learning. Phys Rev D 97(10):101501
Bresten C, Jung J-H (2019) Detection of gravitational waves using topological data analysis and convolutional neural network: an improved approach. arXiv preprint arXiv:1910.08245
Lin Y-C, Wu J-HP (2020) Detection of gravitational waves using Bayesian neural networks. arXiv preprint arXiv:2007.04176
Krastev PG (2020) Real-time detection of gravitational waves from binary neutron stars using artificial neural networks. Phys Lett B 803:135330
Schäfer MB, Ohme F, Nitz AH (2020) Detection of gravitational-wave signals from binary neutron star mergers using machine learning. arXiv preprint arXiv:2006.01509
Lin B-J, Li X-R, Yu W-L (2020) Binary neutron stars gravitational wave detection based on wavelet packet analysis and convolutional neural networks. Front Phys 15(2):24602
Fan X, Li J, Li X, Zhong Y, Cao J (2019) Applying deep neural networks to the detection and space parameter estimation of compact binary coalescence with a network of gravitational wave detectors. Sci China Phys Mech Astron 62(6):1–8
Chua AJ, Vallisneri M (2020) Learning Bayesian posteriors with neural networks for gravitational-wave inference. Phys Rev Lett 124(4):041102
Gabbard H, Messenger C, Heng IS, Tonolini F, Murray-Smith R (2019) Bayesian parameter estimation using conditional variational autoencoders for gravitational-wave astronomy. arXiv preprint arXiv:1909.06296
Green SR, Simpson C, Gair J (2020) Gravitational-wave parameter estimation with autoregressive neural network flows. arXiv preprint arXiv:2002.07656
Wei W, Huerta E (2020) Gravitational wave denoising of binary black hole mergers with deep learning. Phys Lett B 800:135081
Khan A, Huerta E, Das A (2020) Physics-inspired deep learning to characterize the signal manifold of quasi-circular, spinning, non-precessing binary black hole mergers. Phys Lett B 808:135628
ul Islam B, Baharudin Z, Raza MQ, Nallagownden P (2014) Optimization of neural network architecture using genetic algorithm for load forecasting. In: 2014 5th international conference on intelligent and advanced systems (ICIAS). IEEE, pp 1–6
Goodfellow I, Bengio Y, Courville A (2016) Deep learning. MIT Press
SageMaker. https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning-how-it-works.html
Hamdia KM, Zhuang X, Rabczuk T (2020) An efficient optimization approach for designing machine learning models based on genetic algorithm. Neural Comput Appl 33:1–11
Normandin ME, Mohanty S, Weerathunga TS (2018) Particle swarm optimization based search for gravitational waves from compact binary coalescences: performance improvements. Phys Rev D 98:044029
Abbott BP et al (2016) Astrophysical implications of the binary black-hole merger GW150914. Astrophys J 818(2):L22
Maggiore M (2008) Gravitational waves, vol 1, 1st edn. Oxford University Press, New York
Owen BJ (1996) Search templates for gravitational waves from inspiraling binaries: choice of template spacing. Phys Rev D 53:6749–6761
Brown D (2004) Searching for gravitational radiation from binary black hole MACHOs in the galactic halo. PhD thesis, University of Wisconsin–Milwaukee
Cutler C, Flanagan EE (1994) Gravitational waves from merging compact binaries: how accurately can one extract the binary’s parameters from the inspiral wave form? Phys Rev D 49:2658
Romano JD, Cornish NJ (2017) Detection methods for stochastic gravitational-wave backgrounds: a unified treatment. Living Rev Relat 20(1):2
Wainstein LA, Zubakov VD (1962) Extraction of signals from noise. Prentice-Hall, Englewood Cliffs
Allen B, Anderson WG, Brady PR, Brown DA, Creighton JD (2012) FINDCHIRP: an algorithm for detection of gravitational waves from inspiraling compact binaries. Phys Rev D 85:122006
Newman ET, Penrose R (1966) Note on the Bondi–Metzner–Sachs group. J Math Phys 7:863–870
Goldberg JN, Macfarlane AJ, Newman ET, Rohrlich F, Sudarshan ECG (1967) Spin-$s$ spherical harmonics and $\eth$. J Math Phys 8(11):2155–2161
Blackman J, Field SE, Galley CR, Szilágyi B, Scheel MA, Tiglio M, Hemberger DA (2015) Fast and accurate prediction of numerical relativity waveforms from binary black hole coalescences using surrogate models. Phys Rev Lett 115:121102
Gwsurrogate. https://pypi.python.org/pypi/gwsurrogate/
Field SE, Galley CR, Hesthaven JS, Kaye J, Tiglio M (2014) Fast prediction and evaluation of gravitational waveforms using surrogate models. Phys Rev X 4:031006
Neyman J, Pearson ES (1933) On the problem of the most efficient tests of statistical hypotheses. Philos Trans R Soc Lond A 231(694–706):289–337
Hornik K, Stinchcombe M, White H (1989) Multilayer feedforward networks are universal approximators. Neural Netw 2(5):359–366
Cybenko G (1989) Approximation by superpositions of a sigmoidal function. Math Control Signals Syst 2(4):303–314
Kingma DP, Ba J (2014) Adam: a method for stochastic optimization
Hoffer E, Hubara I, Soudry D (2017) Train longer, generalize better: closing the generalization gap in large batch training of neural networks. arXiv preprint arXiv:1705.08741
Smith SL, Kindermans P-J, Ying C, Le QV (2017) Don’t decay the learning rate, increase the batch size. arXiv preprint arXiv:1711.00489
Goyal P, Dollár P, Girshick R, Noordhuis P, Wesolowski L, Kyrola A, Tulloch A, Jia Y, He K (2017) Accurate, large minibatch SGD: training imagenet in 1 hour. arXiv preprint arXiv:1706.02677
Bäck T, Fogel DB, Michalewicz Z (2018) Evolutionary computation 1: basic algorithms and operators. CRC Press
Yin D, Kannan R, Bartlett P (2019) Rademacher complexity for adversarially robust generalization. In: International conference on machine learning, pp 7085–7094
Fortin F-A, De Rainville F-M, Gardner M-A, Parizeau M, Gagné C (2012) DEAP: evolutionary algorithms made easy. J Mach Learn Res 13:2171–2175
Thangiah SR, Osman IH, Sun T (1994) Hybrid genetic algorithm, simulated annealing and tabu search methods for vehicle routing problems with time windows. Computer Science Department, Slippery Rock University, Technical report SRU CpSc-TR-94-27, vol 69
Gandomkar M, Vakilian M, Ehsan M (2005) A combination of genetic algorithm and simulated annealing for optimal dg allocation in distribution networks. In: Canadian conference on electrical and computer engineering, 2005. IEEE, pp 645–648
Park T, Ryu KR (2010) A dual-population genetic algorithm for adaptive diversity control. IEEE Trans Evol Comput 14(6):865–884
Sharapov R, Lapshin A (2006) Convergence of genetic algorithms. Pattern Recognit Image Anal 16(3):392–397
Eiben AE, Aarts EH, Van Hee KM (1990) Global convergence of genetic algorithms: a Markov chain analysis. In: International conference on parallel problem solving from nature. Springer, pp 3–12
Cerf R (1998) Asymptotic convergence of genetic algorithms. Adv Appl Probab 30(2):521–550
Finn LS (1992) Detection, measurement, and gravitational radiation. Phys Rev D 46:5236
Gray RM (2006) Toeplitz and circulant matrices: a review. Found Trends Commun Inf Theory 2(3):155–239
Allen B (2005) A $\chi^2$ time-frequency discriminator for gravitational wave detection. Phys Rev D 71:062001

Acknowledgements

We would like to thank Prayush Kumar, Jun Li, Caroline Mallary, Eamonn O’Shea, and Matthew Wise for helpful discussions, and Vishal Tiwari for writing scripts used to compute efficiency curves. S. E. F. and D. S. D. are partially supported by NSF Grant PHY-1806665 and DMS-1912716. G.K. acknowledges research support from NSF Grants Nos. PHY-1701284, PHY-2010685 and DMS-1912716. All authors acknowledge research support from ONR/DURIP Grant No. N00014181255, which funds the computational resources used in our work. D. S. D. is partially supported by the Massachusetts Space Grant Consortium.

Author information

Authors and Affiliations

Department of Mathematics, Computer and Information Science, University of Massachusetts, Dartmouth, MA, 02747, USA
Dwyer S. Deighan
Department of Mathematics, Center for Scientific Computing and Visualization Research, University of Massachusetts, Dartmouth, MA, 02747, USA
Scott E. Field
Max-Planck-Institut für Gravitationsphysik, Leibniz Universität Hannover, 30167, Hannover, Germany
Collin D. Capano
Department of Physics, Center for Scientific Computing and Visualization Research, University of Massachusetts, Dartmouth, MA, 02747, USA
Gaurav Khanna
Department of Physics, University of Rhode Island, Kingston, RI, 02881, USA
Gaurav Khanna

Corresponding author

Correspondence to Scott E. Field.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1: Fourier transform and inner product conventions

We summarize our conventions, which vary somewhat in the literature. Given a time domain vector, ${\mathbf {a}}$, the discrete version of the Fourier transform of ${\mathbf {a}}$ evaluated at frequency $f_p = p/T$ is given by

$$\begin{aligned} {{\tilde{a}}}(f_p) = {\tilde{a}}[p]&= \Delta t \sum _{n=0}^{N-1} a(t_n) e^{-2 \pi i f_p n \Delta t} \\&= \Delta t \sum _{n=0}^{N-1} a(t_n) e^{-2 \pi i n \frac{p}{N}} , \end{aligned}$$

(21)

where $0 \le p \le N-1$. Notice that the zero frequency ($f_p=0$) corresponds to $p = 0$, positive frequencies ($0< f_p < f_s / 2$) to values in the range $0 < p < N/2$, and negative frequencies ($- f_s / 2< f_p < 0$) to values in the range $N/2< p < N$. This follows from the usual assumptions that the signal is both periodic in the observation duration, ${a}(t) = {a}(t \pm T)$, and compactly supported in frequency, ${\tilde{a}}(f) = 0$ for $|f| \ge f_s / 2$, where $f_s = 1 / \Delta t$ is the sampling rate and $f_s / 2$ is the Nyquist frequency. Consequently, the Fourier transformed signal is periodic in $p$ with a period of $N$, ${\tilde{a}}(f_p) = {\tilde{a}}(f_p \pm N \Delta f)$, where $\Delta f = 1/T$. The value $p = N/2$ corresponds to the Fourier transform at the maximum resolvable frequencies, $-f_s/2$ and $f_s/2$, for a given choice of $\Delta t$.
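
As a quick numerical check of the convention in Eq. (21), the following sketch evaluates the sum directly and compares it with NumPy's FFT, which uses the same sign in the exponent but omits the factor $\Delta t$. The function name `dft_eq21` and the test signal are illustrative choices and are not taken from the paper.

```python
import numpy as np

def dft_eq21(a, dt):
    """Direct evaluation of Eq. (21): a_tilde[p] = dt * sum_n a[n] exp(-2 pi i n p / N)."""
    N = len(a)
    n = np.arange(N)          # time index (columns)
    p = n.reshape(-1, 1)      # frequency index (rows)
    return dt * np.sum(a * np.exp(-2j * np.pi * n * p / N), axis=1)

# Illustrative signal: N = 8 samples over T = 1 s, so f_s = 8 Hz.
N, dt = 8, 0.125
a = np.cos(2 * np.pi * 2.0 * np.arange(N) * dt)   # 2 Hz cosine

a_tilde = dft_eq21(a, dt)
a_tilde_fft = dt * np.fft.fft(a)   # NumPy's FFT rescaled by dt
freqs = np.fft.fftfreq(N, d=dt)    # f_p for p = 0..N-1; NumPy labels p = N/2 as -f_s/2
```

Indices $p > N/2$ in `freqs` carry negative frequencies, matching the ordering described above.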

Given the Fourier transformed data, ${\tilde{a}}$ and ${\tilde{b}}$, the noise-weighted inner product $\langle \cdot , \cdot \rangle$ between ${\tilde{a}}$ and ${\tilde{b}}$ is defined as

$$\begin{aligned} \langle a , b\rangle = 2 \Delta f \sum _{i=0}^{N-1} \frac{{\tilde{a}}(f_i) {\tilde{b}}^*(f_i)}{S_n(f_i)} \approx 2 \int _{-f_s /2}^{f_s /2} \frac{{\tilde{a}}(f) {\tilde{b}}^*(f)}{S_n(f)} \hbox {d}f . \end{aligned}$$

(22)

Notice that by convention the inner product is defined with an overall factor of 2, but unlike Eq. 6 the full set of positive and negative frequencies is used. The continuum limit ($\Delta f \rightarrow 0$) of the summation makes clear that this is a (discretized) inner product between ${\tilde{a}}(f)$ and ${\tilde{b}}(f)$ over the domain $|f| \le f_s /2$. Note that because the time-domain signal is real, the Fourier transformed signal satisfies ${\tilde{a}}^*(f) = {\tilde{a}}(-f)$. Since the PSD is also symmetric, $S_n(f) = S_n(-f)$, the inner product expression can be “folded over”:

$$\begin{aligned} \langle a , b\rangle = 4 \Delta f \, {\mathfrak {R}}\sum _{i=0}^{N/2-1} \frac{{\tilde{a}}(f_i) {\tilde{b}}^*(f_i)}{S_n(f_i)} \approx 4 {\mathfrak {R}}\int _{0}^{f_s /2} \frac{{\tilde{a}}(f) {\tilde{b}}^*(f)}{S_n(f)} \hbox {d}f , \end{aligned}$$

(23)

which now features an integral over the positive frequencies and shows the inner product to be manifestly real. We then arrive at Eq. 6. This motivates the use of the term “inner product” when discussing Eq. 6, despite the fact that, taken at face value, it does not satisfy the usual properties of an inner product, while Eq. (22) does. Finally, some authors set the noise at the Nyquist frequency to 0 (see, for example, the discussion after Eq. 7.1 in Ref. [59]).
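
The folding step can be checked numerically. The sketch below is only an illustration: the PSD is a made-up symmetric function, the helper names are hypothetical, and the DC and Nyquist bins are zeroed by hand so that the two-sided sum of Eq. (22) and the folded sum (written with its factor of $\Delta f$) agree exactly rather than approximately.

```python
import numpy as np

def inner_two_sided(a_f, b_f, psd, df):
    """Eq. (22): sum over all N positive- and negative-frequency bins."""
    return 2.0 * df * np.sum(a_f * np.conj(b_f) / psd)

def inner_folded(a_f, b_f, psd, df):
    """Folded form: 4 * df * Re(sum over bins 0..N/2-1)."""
    half = len(a_f) // 2
    terms = a_f[:half] * np.conj(b_f[:half]) / psd[:half]
    return 4.0 * df * np.real(np.sum(terms))

rng = np.random.default_rng(0)
N, dt = 64, 1.0 / 64
df = 1.0 / (N * dt)
freqs = np.fft.fftfreq(N, d=dt)
psd = 1.0 + freqs**2               # hypothetical PSD, symmetric in f

def transform(x):
    """Eq. (21) transform with the DC and Nyquist bins removed (assumption)."""
    x_f = dt * np.fft.fft(x)
    x_f[0] = 0.0
    x_f[N // 2] = 0.0
    return x_f

a_f = transform(rng.standard_normal(N))   # real time series -> Hermitian spectrum
b_f = transform(rng.standard_normal(N))
two_sided = inner_two_sided(a_f, b_f, psd, df)
folded = inner_folded(a_f, b_f, psd, df)
```

For real time-domain inputs, `two_sided` has a vanishing imaginary part and its real part equals `folded`.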

Appendix 2: Derivation of conditional probabilities used in likelihood-ratio test

A derivation of the standard inner product used in gravitational-wave analyses can be found in Ref. [81], which makes use of methods laid out in Ref. [58]. Here, we provide a brief derivation to highlight some of the assumptions that go into the classical filter.

In the absence of a signal, we assume that the detector is a stochastic process that outputs Gaussian noise with zero mean. The likelihood that some observed output ${\mathbf {s}}$ is purely noise is therefore given by an $N$-dimensional multivariate normal distribution

$$\begin{aligned} p({\mathbf {s}}| n) = \frac{\exp \left[ -\frac{1}{2}{\mathbf {s}}^{\mathsf {T}}\varvec{\Sigma }^{-1} {\mathbf {s}}\right] }{\sqrt{(2\pi )^{N} \det \varvec{\Sigma }}}, \end{aligned}$$

(24)

where $\varvec{\Sigma }$ is the covariance matrix of the noise, and $\det \varvec{\Sigma }$ is its determinant.
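
For concreteness, Eq. (24) can be evaluated in log form as follows. This is a minimal sketch with a hypothetical function name; it uses a linear solve and a log-determinant rather than forming $\varvec{\Sigma }^{-1}$ explicitly, which is a numerical choice made here and not something stated in the text.

```python
import numpy as np

def log_noise_likelihood(s, cov):
    """Logarithm of Eq. (24) for data s and noise covariance matrix cov."""
    N = len(s)
    sign, logdet = np.linalg.slogdet(cov)      # log(det(cov)) without overflow
    if sign <= 0:
        raise ValueError("covariance matrix must be positive definite")
    quad = s @ np.linalg.solve(cov, s)         # s^T cov^{-1} s
    return -0.5 * (quad + N * np.log(2.0 * np.pi) + logdet)

# Illustrative check with unit-variance, uncorrelated noise.
s = np.array([1.0, 2.0, 3.0])
log_p = log_noise_likelihood(s, np.eye(3))
```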

It is also common to assume that the noise is wide-sense stationary and ergodic. This is generally true on the time scales over which a gravitational wave from a compact binary merger passes through the sensitive band of the detector (at most ${\mathcal {O}}(100\,\mathrm {s})$). In that case, $\varvec{\Sigma }$ is a real symmetric Toeplitz matrix with elements

$$\begin{aligned} \varSigma [j, k] = \frac{1}{2} R_{ss}[k-j] \end{aligned}$$

where

$$\begin{aligned} R_{ss}[k] \equiv \lim _{n\rightarrow \infty } \frac{1}{n} \sum _{l=-n}^{n-1} s[l]s[l+k] \end{aligned}$$

(25)

is the autocorrelation function of the data.
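
A finite-sample version of this construction might look like the sketch below. With the normalization of Eq. (25), the sum runs over $2n$ terms but is divided by $n$, so $\frac{1}{2} R_{ss}[k]$ is the ordinary mean of the lagged products $s[l]s[l+k]$; the code estimates that quantity directly. The function names and the white-noise test data are assumptions made for illustration.

```python
import numpy as np

def half_autocorrelation(s, max_lag):
    """Estimate (1/2) R_ss[k], k = 0..max_lag, as the sample mean of s[l] * s[l+k]."""
    M = len(s)
    return np.array([np.mean(s[:M - k] * s[k:]) for k in range(max_lag + 1)])

def toeplitz_covariance(half_R, N):
    """Build Sigma[j, k] = (1/2) R_ss[k - j], taking lags beyond the estimate as zero."""
    c = np.zeros(N)
    c[:len(half_R)] = half_R
    idx = np.arange(N)
    return c[np.abs(idx[None, :] - idx[:, None])]

rng = np.random.default_rng(0)
s = 2.0 * rng.standard_normal(200_000)     # white noise with variance 4 (illustrative)
half_R = half_autocorrelation(s, max_lag=3)
Sigma = toeplitz_covariance(half_R, N=8)
```

For white noise the estimate is close to the variance at zero lag and close to zero elsewhere, so `Sigma` is nearly diagonal.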

There is no general, analytic solution for $\varvec{\Sigma }^{-1}$. However, if $R_{ss}\rightarrow 0$ in finite time $\tau _{\max }$ and the observation time $T > 2\tau _{\max }$ (i.e., $\lceil N/2 \rceil > \lceil \tau _{\max }/\Delta t \rceil$), then $\varvec{\Sigma }$ is nearly a circulant matrix; it only differs in the upper-right and lower-left corners. All circulant matrices, regardless of the values of their elements, have the same eigenvectors [82]

$$\begin{aligned} u_p[k] = \frac{1}{\sqrt{N}} e^{-2\pi i k p/N}. \end{aligned}$$

(26)

We make the approximation that $\varvec{\Sigma }$ is circulant and use these eigenvectors to solve the eigenvalue equation, yielding

$$\begin{aligned} \lambda _p = \frac{1}{2} {\mathfrak {R}}\left\{ \sum _{l=-N/2}^{N/2-1} R_{ss}[l] e^{-2\pi i p l /N} \right\} . \end{aligned}$$

(27)

(The ${\mathfrak {R}}$ arises because the covariance is real and symmetric.) The error in this approximation decreases with increasing observation time; indeed, the eigenvalues of $\varvec{\Sigma }$ asymptote to Eq. 27 as $N \rightarrow \infty$ [82]. The autocorrelation function of ground-based gravitational-wave detectors is approximately zero for $\tau > {\mathcal {O}}(10\,\mathrm {ms})$. Since the observation time for a gravitational wave is $>{\mathcal {O}}(\mathrm {s})$, this approximation is valid in practice.
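
The circulant approximation can be illustrated on a small example. The sketch below assumes a made-up triangular autocorrelation that vanishes beyond a few lags, builds the Toeplitz matrix and its circulant counterpart, and compares the circulant eigenvalues with Eq. (27) computed by an FFT. All variable names are hypothetical.

```python
import numpy as np

N, L = 64, 6
c = np.zeros(N)
c[:L] = 1.0 - np.arange(L) / L          # (1/2) R_ss[l]: zero for lags l >= L (assumed shape)

idx = np.arange(N)
d = idx[None, :] - idx[:, None]         # lag k - j for every matrix element
sigma = c[np.abs(d)]                    # Toeplitz covariance, Sigma[j, k] = c[|k - j|]

c_circ = c[np.minimum(idx, N - idx)]    # the same lags wrapped periodically
circ = c_circ[d % N]                    # circulant approximation to Sigma

lam = np.fft.fft(c_circ).real           # Eq. (27): eigenvalues from the DFT of the lags
circ_eigs = np.linalg.eigvalsh(circ)

differs = sigma != circ                 # where the two matrices disagree
```

The two matrices differ only where $|k - j| > N - L$, that is, in the upper-right and lower-left corners, and the eigenvalues of the circulant matrix coincide with `lam` up to ordering.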

We recognize Eq. 27 as $1/(2\Delta t)$ times the real part of the discrete Fourier transform of $R_{ss}[l]$ (see Note 2). Therefore, via the Wiener–Khinchin theorem,

$$\begin{aligned} \lambda _p = \frac{S_n[p]}{2\Delta t} \end{aligned}$$

(28)

where $S_n[p]$ is the discrete approximation of the power spectral density (PSD) of the noise at frequency $p/T \equiv p \Delta f$. Since the matrix of eigenvectors ${\mathbf {U}}$ is unitary, we have

$$\begin{aligned} \varSigma ^{-1}[j, k]&\approx \left[ {\mathbf {U}}\varvec{\Lambda }^{-1} {\mathbf {U}}^\dagger \right] [j, k] \\&\approx \frac{2 \Delta t}{N} \sum _{p=0}^{N-1} \frac{e^{-2\pi i j p/N} e^{2\pi i k p/N}}{S_n[p]} \\&= c_{jk} + 4 \Delta f (\Delta t)^2 \sum _{p=1}^{N/2-1} \frac{\cos \left( 2\pi (j-k)p/N\right) }{S_n[p]}. \end{aligned}$$

(29)

To go from the second to the third line, we have substituted $1/N = \Delta f \Delta t$ and have made use of the fact that $S_n[p]$ is symmetric about N/2; $c_{jk}$ depends only on the $p=0$ and $p=N/2$ terms, which correspond to the DC and Nyquist frequencies, respectively.
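
The last two lines of Eq. (29) can be verified for a matrix that is exactly circulant. In the sketch below the lag sequence $0.5^{|l|}$ (wrapped periodically) is an arbitrary choice made so that all eigenvalues are positive, and the explicit expression used for $c_{jk}$ is the $p=0$ and $p=N/2$ contribution worked out here, not a formula quoted from the paper.

```python
import numpy as np

N, dt = 32, 1.0 / 32
df = 1.0 / (N * dt)
idx = np.arange(N)
m = idx[:, None] - idx[None, :]                 # j - k

c_circ = 0.5 ** np.minimum(idx, N - idx)        # hypothetical wrapped lag sequence
circ = c_circ[m % N]                            # circulant covariance matrix
psd = 2.0 * dt * np.fft.fft(c_circ).real        # Eq. (28): S_n[p] = 2 dt lambda_p

# Second line of Eq. (29): (2 dt / N) sum_p exp(-2 pi i (j - k) p / N) / S_n[p]
g = (2.0 * dt / N) * np.fft.fft(1.0 / psd).real
inv_fft = g[m % N]

# Third line: DC and Nyquist terms (c_jk) plus a cosine sum over p = 1..N/2-1
p = np.arange(1, N // 2)
c_jk = 2.0 * df * dt**2 * (1.0 / psd[0] + np.cos(np.pi * m) / psd[N // 2])
cos_sum = np.sum(np.cos(2.0 * np.pi * m[..., None] * p / N) / psd[p], axis=-1)
inv_cos = c_jk + 4.0 * df * dt**2 * cos_sum
```

Both forms agree with each other and with the exact inverse of `circ`.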

Gravitational-wave detectors have peak sensitivity within a particular frequency band $[f_0, f_{\max }]$ (for current generation detectors, this is $f \sim [20, 2000]\,$Hz). Outside of this range we can effectively treat the PSD as being infinite, making all terms in Eq. (29) with $p < \lfloor f_0 / \Delta f \rfloor \equiv p_0$ zero. Likewise, if we choose a sample rate $1/\Delta t > 2 f_{\max }$, then the Nyquist term is also effectively zero. The exponential term in the likelihood is therefore

$$\begin{aligned} \left[ {\mathbf {s}}^\mathsf {T}\varvec{\Sigma }^{-1} {\mathbf {s}}\right]&\approx 4 \Delta f \sum _{p=p_0}^{N/2-1} (\Delta t)^2 \sum _{j,k=0}^{N-1} s[j]s[k]\frac{\cos \left( 2\pi (j-k)p/N\right) }{S_n[p]} \\&\approx 4 \Delta f \sum _{p=p_0}^{N/2-1} \frac{\left| {\tilde{s}}\right| ^2[p]}{S_n[p]}. \end{aligned}$$

In going from the first to the second line, we have again recognized the sums over $j$ and $k$ as discrete Fourier transforms of the real time-series data. We can further simplify this using the inner product defined in Eq. (6), yielding Eq. (5) for the likelihood.
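
The equivalence between the time-domain quadratic form and the frequency-domain sum can be checked on a small circulant example. The sketch below makes the same illustrative assumptions as a toy model would (a wrapped $0.5^{|l|}$ lag sequence and random data); unlike the derivation above, it keeps the DC, Nyquist, and low-frequency bins, so the identity holds exactly rather than approximately.

```python
import numpy as np

rng = np.random.default_rng(1)
N, dt = 32, 1.0 / 32
df = 1.0 / (N * dt)
idx = np.arange(N)

c_circ = 0.5 ** np.minimum(idx, N - idx)        # hypothetical wrapped lag sequence
circ = c_circ[(idx[:, None] - idx[None, :]) % N]
psd = 2.0 * dt * np.fft.fft(c_circ).real        # Eq. (28)

s = rng.standard_normal(N)                      # illustrative real time series
s_f = dt * np.fft.fft(s)                        # Eq. (21)

quad_time = s @ np.linalg.solve(circ, s)        # s^T Sigma^{-1} s in the time domain

power = np.abs(s_f) ** 2 / psd
quad_all = 2.0 * df * np.sum(power)             # sum over all N frequency bins
half = N // 2
quad_folded = 4.0 * df * np.sum(power[1:half]) + 2.0 * df * (power[0] + power[half])
```

Dropping the `power[0]` and `power[half]` terms, and the bins below $p_0$, recovers the truncated sum in the text.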

Appendix 3: How to generate Gaussian noise

Somewhat surprisingly, we are unaware of a resource that describes how to implement Eq. (4) to generate time-domain noise realizations. Implementing this expression involves enough subtleties that we summarize our recipe here.

Eq. (4) specifies the statistical properties satisfied by the Fourier coefficients of the noise. Note that similar expressions for the discrete Fourier transform coefficients are sometimes given in the literature, and these differ from ours.

Since the frequency-domain noise, ${\tilde{n}}(f_i)$, is complex, we need to be careful when sampling the real and imaginary parts. For example, if the desired property is $\langle {\tilde{n}}^*(f_i) {\tilde{n}}(f_j) \rangle =\delta _{ij}$, then

$$\begin{aligned} {\mathfrak {R}}({\tilde{n}}(f_i)) \sim {\mathcal N}\left( 0,\frac{1}{2}\right) , \qquad {\mathfrak {I}}({\tilde{n}}(f_i)) \sim {\mathcal N}\left( 0,\frac{1}{2}\right) , \end{aligned}$$

(30)

which gives

$$\begin{aligned} \langle {\tilde{n}}^*(f_i) {\tilde{n}}(f_i) \rangle = \langle {\mathfrak {R}}({\tilde{n}}(f_i))^2 + {\mathfrak {I}}({\tilde{n}}(f_i))^2 \rangle = \frac{1}{2} + \frac{1}{2} = 1 . \end{aligned}$$

(31)

Furthermore, for real time-domain functions we have ${\tilde{n}}^*(f) = {\tilde{n}}(-f)$ and so only the non-negative frequencies are independently sampled. When $f=0$, this condition implies that ${\tilde{n}}(0)$ is real, whence ${\tilde{n}}(0) \sim {\mathcal N}(0,1)$. A similar property holds at the Nyquist frequency.
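
The sampling rules above can be sketched as follows for the unit-variance case of Eq. (30). The function names are hypothetical, and $N$ is assumed even, as in Note 1.

```python
import numpy as np

def unit_complex_noise(num_bins, rng):
    """Eq. (30): real and imaginary parts each drawn from N(0, 1/2)."""
    re = rng.normal(0.0, np.sqrt(0.5), num_bins)
    im = rng.normal(0.0, np.sqrt(0.5), num_bins)
    return re + 1j * im

def hermitian_spectrum(N, rng):
    """Fill all N bins so that n_f[N - p] = conj(n_f[p]); N is assumed even."""
    half = N // 2
    n_f = np.zeros(N, dtype=complex)
    n_f[1:half] = unit_complex_noise(half - 1, rng)   # independent positive frequencies
    n_f[0] = rng.normal(0.0, 1.0)                     # DC bin: real, N(0, 1)
    n_f[half] = rng.normal(0.0, 1.0)                  # Nyquist bin: real, N(0, 1)
    n_f[half + 1:] = np.conj(n_f[1:half][::-1])       # negative frequencies by symmetry
    return n_f

rng = np.random.default_rng(0)
n_f = hermitian_spectrum(1 << 16, rng)
mean_power = np.mean(np.abs(n_f) ** 2)    # sample estimate of <n*(f_i) n(f_i)>
n_t = np.fft.ifft(n_f)                    # imaginary part vanishes up to round-off
```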

The neural networks considered in this paper use time-domain data. Synthetic time-domain noise realizations are constructed by taking an inverse Fourier transform of our frequency-domain noise. In the time domain, Eq. (4) becomes

$$\begin{aligned} \langle n(t_i) \rangle = 0 , \qquad \langle n^2(t_i) \rangle = \frac{\Delta f}{2} \sum _{k=0}^{N-1} S_n(f_k), \end{aligned}$$

(32)

which follows directly from Eq. (4) and properties of the Fourier transform. We found Eq. (32) to be an indispensable sanity test of our time-domain noise realizations.
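
A sketch of the full recipe, including the variance check of Eq. (32), is given below. Eq. (4) is not reproduced in this section, so the code assumes it has the standard form $\langle {\tilde{n}}^*(f_i) {\tilde{n}}(f_j) \rangle = S_n(f_i)\,\delta _{ij}/(2\Delta f)$, which is the normalization consistent with Eq. (32) under the Fourier convention of Eq. (21). The function name and the PSD are made up for illustration.

```python
import numpy as np

def gaussian_noise(psd, dt, rng):
    """Draw one real time-domain noise realization.

    psd holds S_n(f_p) for p = 0..N-1 (symmetric about N/2); N is assumed even.
    Assumes <|n_tilde[p]|^2> = S_n[p] / (2 df), as explained in the lead-in.
    """
    N = len(psd)
    df = 1.0 / (N * dt)
    half = N // 2
    sigma = np.sqrt(psd / (2.0 * df))
    n_f = np.zeros(N, dtype=complex)
    re = rng.standard_normal(half - 1)
    im = rng.standard_normal(half - 1)
    n_f[1:half] = sigma[1:half] * (re + 1j * im) / np.sqrt(2.0)
    n_f[0] = sigma[0] * rng.standard_normal()           # DC bin is real
    n_f[half] = sigma[half] * rng.standard_normal()     # Nyquist bin is real
    n_f[half + 1:] = np.conj(n_f[1:half][::-1])         # n_tilde(-f) = conj(n_tilde(f))
    # Inverse of Eq. (21): n(t_n) = df * sum_p n_tilde[p] exp(+2 pi i n p / N)
    return np.fft.ifft(n_f).real / dt

rng = np.random.default_rng(42)
N, dt = 4096, 1.0 / 4096
df = 1.0 / (N * dt)
freqs = np.fft.fftfreq(N, d=dt)
psd = 1e-3 / (1.0 + (freqs / 100.0) ** 2)       # hypothetical PSD, symmetric in f

expected_var = 0.5 * df * np.sum(psd)           # right-hand side of Eq. (32)
samples = np.concatenate([gaussian_noise(psd, dt, rng) for _ in range(20)])
measured_var = np.var(samples)
```

Comparing `measured_var` with `expected_var` is the sanity test described above; the two should agree to within sampling error.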

About this article

Cite this article

Deighan, D.S., Field, S.E., Capano, C.D. et al. Genetic-algorithm-optimized neural networks for gravitational wave classification. Neural Comput & Applic 33, 13859–13883 (2021). https://doi.org/10.1007/s00521-021-06024-4

Received: 29 August 2020
Accepted: 07 April 2021
Published: 24 April 2021
Issue Date: October 2021
DOI: https://doi.org/10.1007/s00521-021-06024-4
