
Iterative convolutional enhancing self-attention Hawkes process with time relative position encoding

Original Article · International Journal of Machine Learning and Cybernetics

Abstract

Modeling Hawkes processes with deep learning achieves better goodness of fit than traditional statistical methods. However, RNN-based methods struggle to capture long-term dependencies, while self-attention-based methods lack recursive induction. The Universal Transformer (UT) is an advanced framework that satisfies both requirements simultaneously, because it applies self-attention recurrently across depth at each position. Migrating the UT framework, however, raises the problem of matching it effectively to Hawkes process modeling. This paper therefore proposes an iterative convolutional enhancing self-attention Hawkes process with time relative position encoding (ICAHP-TR), based on an improved UT. First, dense embedding layers map the sequences of arrival times and event markers into an enriched event representation. Second, a deep UT network, combining recursion with a global receptive field, extracts hidden historical information from this representation. Third, two mechanisms are designed, relative positional encoding over time steps and convolution-enhanced perceptual attention, so that dependencies between relative and adjacent positions in the Hawkes process are not lost. Finally, dense layers map the hidden historical information to the parameters of the Hawkes process intensity function, from which the likelihood function is obtained and used as the network loss. Experiments on synthetic and real-world datasets show that the proposed method outperforms baseline methods in both goodness of fit and predictive ability.
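As a rough illustration of the pipeline described above, the sketch below wires the pieces together in PyTorch: dense embeddings of markers and arrival times, a single UT-style block applied recursively with shared weights, an additive relative-time bias in the attention scores, a depthwise convolution enhancing the attended features, and a softplus intensity head. All module names, dimensions, and the piecewise-constant likelihood approximation are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch of an ICAHP-TR-style model (hypothetical names and sizes).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvSelfAttention(nn.Module):
    """Causal self-attention with an additive relative-time bias and a
    depthwise 1-D convolution enhancing the attended features (assumed design)."""
    def __init__(self, d_model, kernel_size=3):
        super().__init__()
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.time_bias = nn.Linear(1, 1)   # relative position encoding on time gaps
        self.conv = nn.Conv1d(d_model, d_model, kernel_size,
                              padding=kernel_size // 2, groups=d_model)
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x, t, mask):
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        scores = q @ k.transpose(-2, -1) / x.size(-1) ** 0.5
        gaps = (t.unsqueeze(-1) - t.unsqueeze(-2)).unsqueeze(-1)  # pairwise time gaps
        scores = scores + self.time_bias(gaps).squeeze(-1)
        scores = scores.masked_fill(mask, float("-inf"))          # causal: no future events
        h = torch.softmax(scores, dim=-1) @ v
        h = h + self.conv(h.transpose(1, 2)).transpose(1, 2)      # convolution enhancement
        return self.out(h)

class ICAHP_TR(nn.Module):
    def __init__(self, n_marks, d_model=64, n_steps=4):
        super().__init__()
        self.mark_emb = nn.Embedding(n_marks, d_model)  # event-marker embedding
        self.time_emb = nn.Linear(1, d_model)           # arrival-time embedding
        self.block = ConvSelfAttention(d_model)         # one block, iterated UT-style
        self.norm = nn.LayerNorm(d_model)
        self.n_steps = n_steps
        self.intensity = nn.Linear(d_model, n_marks)

    def forward(self, marks, times):
        L = times.size(1)
        mask = torch.triu(torch.ones(L, L, dtype=torch.bool,
                                     device=times.device), diagonal=1)
        x = self.mark_emb(marks) + self.time_emb(times.unsqueeze(-1))
        for _ in range(self.n_steps):                   # shared weights across depth
            x = self.norm(x + self.block(x, times, mask))
        return F.softplus(self.intensity(x))            # per-mark conditional intensity

def hawkes_nll(model, marks, times):
    """Negative log-likelihood with a piecewise-constant intensity between
    events (a crude stand-in for the exact integral term)."""
    lam = model(marks, times)                                    # (B, L, n_marks)
    event_lam = lam.gather(-1, marks.unsqueeze(-1)).squeeze(-1)  # observed-mark intensity
    log_term = torch.log(event_lam + 1e-9).sum(dim=1)
    dt = times[:, 1:] - times[:, :-1]
    integral = (lam[:, :-1].sum(-1) * dt).sum(dim=1)
    return (integral - log_term).mean()
```

During training, the negative log-likelihood sums the log-intensities of the observed marks at event times and subtracts the integrated total intensity over the observation window; the piecewise-constant approximation above stands in for the more accurate integral estimates (e.g., Monte Carlo) typically used for neural Hawkes models.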


Data availability

The data that support the findings of this study are openly available in the public repository ifl-tpp at https://github.com/shchur/ifl-tpp.


Acknowledgements

This work was supported by the Applied Basic Research Programs of Shanxi Province (Grant no. 201901D211105).

Author information

Correspondence to Chenlong Li.


About this article


Cite this article

Bian, W., Li, C., Hou, H. et al. Iterative convolutional enhancing self-attention Hawkes process with time relative position encoding. Int. J. Mach. Learn. & Cyber. 14, 2529–2544 (2023). https://doi.org/10.1007/s13042-023-01780-2

