
Iterative convolutional enhancing self-attention Hawkes process with time relative position encoding

Original Article · International Journal of Machine Learning and Cybernetics

Abstract

Modeling Hawkes processes with deep learning achieves better goodness of fit than traditional statistical methods. However, RNN-based methods struggle to capture long-term dependencies, while self-attention-based methods lack recursive induction. The Universal Transformer (UT) is an advanced framework that satisfies both requirements simultaneously, because it applies self-attention recurrently across depth at each position. Migrating the UT framework, however, raises the problem of matching it effectively to Hawkes process modeling. This paper therefore proposes an iterative convolutional enhancing self-attention Hawkes process with time relative position encoding (ICAHP-TR), based on an improved UT. First, dense embedding layers map the sequences of arrival times and event markers into an enriched event representation. Second, a deep UT network, combining recursion with a global receptive field, extracts hidden historical information from this representation. Third, two mechanisms are designed, relative positional encoding over time steps and convolution-enhanced perceptual attention, so that dependencies between relative and adjacent positions in the Hawkes process are not lost. Finally, dense layers map the hidden historical information to the parameters of the Hawkes process intensity function, from which the likelihood function is obtained and used as the network loss. Experiments on synthetic and real-world datasets show that the proposed method outperforms baseline methods in both goodness of fit and predictive ability.
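As a rough illustration of the pipeline described above, the sketch below wires the pieces together in PyTorch: dense embeddings of markers and arrival times, a single UT-style block applied recursively with shared weights, an additive relative-time bias in the attention scores, a depthwise convolution enhancing the attended features, and a softplus intensity head. All module names, dimensions, and the piecewise-constant likelihood approximation are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch of an ICAHP-TR-style model (hypothetical names and sizes).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvSelfAttention(nn.Module):
    """Causal self-attention with an additive relative-time bias and a
    depthwise 1-D convolution enhancing the attended features (assumed design)."""
    def __init__(self, d_model, kernel_size=3):
        super().__init__()
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.time_bias = nn.Linear(1, 1)   # relative position encoding on time gaps
        self.conv = nn.Conv1d(d_model, d_model, kernel_size,
                              padding=kernel_size // 2, groups=d_model)
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x, t, mask):
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        scores = q @ k.transpose(-2, -1) / x.size(-1) ** 0.5
        gaps = (t.unsqueeze(-1) - t.unsqueeze(-2)).unsqueeze(-1)  # pairwise time gaps
        scores = scores + self.time_bias(gaps).squeeze(-1)
        scores = scores.masked_fill(mask, float("-inf"))          # causal: no future events
        h = torch.softmax(scores, dim=-1) @ v
        h = h + self.conv(h.transpose(1, 2)).transpose(1, 2)      # convolution enhancement
        return self.out(h)

class ICAHP_TR(nn.Module):
    def __init__(self, n_marks, d_model=64, n_steps=4):
        super().__init__()
        self.mark_emb = nn.Embedding(n_marks, d_model)  # event-marker embedding
        self.time_emb = nn.Linear(1, d_model)           # arrival-time embedding
        self.block = ConvSelfAttention(d_model)         # one block, iterated UT-style
        self.norm = nn.LayerNorm(d_model)
        self.n_steps = n_steps
        self.intensity = nn.Linear(d_model, n_marks)

    def forward(self, marks, times):
        L = times.size(1)
        mask = torch.triu(torch.ones(L, L, dtype=torch.bool,
                                     device=times.device), diagonal=1)
        x = self.mark_emb(marks) + self.time_emb(times.unsqueeze(-1))
        for _ in range(self.n_steps):                   # shared weights across depth
            x = self.norm(x + self.block(x, times, mask))
        return F.softplus(self.intensity(x))            # per-mark conditional intensity

def hawkes_nll(model, marks, times):
    """Negative log-likelihood with a piecewise-constant intensity between
    events (a crude stand-in for the exact integral term)."""
    lam = model(marks, times)                                    # (B, L, n_marks)
    event_lam = lam.gather(-1, marks.unsqueeze(-1)).squeeze(-1)  # observed-mark intensity
    log_term = torch.log(event_lam + 1e-9).sum(dim=1)
    dt = times[:, 1:] - times[:, :-1]
    integral = (lam[:, :-1].sum(-1) * dt).sum(dim=1)
    return (integral - log_term).mean()
```

During training, the negative log-likelihood sums the log-intensities of the observed marks at event times and subtracts the integrated total intensity over the observation window; the piecewise-constant approximation above stands in for the more accurate integral estimates (e.g., Monte Carlo) typically used for neural Hawkes models.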


Data availability

The data that support the findings of this study are openly available in the public repository ifl-tpp at https://github.com/shchur/ifl-tpp.


Acknowledgements

This work was supported by the Applied Basic Research Programs of Shanxi Province (Grant no. 201901D211105).

Author information

Correspondence to Chenlong Li.


About this article


Cite this article

Bian, W., Li, C., Hou, H. et al. Iterative convolutional enhancing self-attention Hawkes process with time relative position encoding. Int. J. Mach. Learn. & Cyber. 14, 2529–2544 (2023). https://doi.org/10.1007/s13042-023-01780-2

