
Sparse Transformer Hawkes Process for Long Event Sequences

  • Conference paper
  • In: Machine Learning and Knowledge Discovery in Databases: Research Track (ECML PKDD 2023)

Abstract

Large quantities of asynchronous event sequence data, such as crime records, emergency call logs, and financial transactions, are becoming increasingly available from various fields. These event sequences often exhibit both long-term and short-term temporal dependencies. Variants of neural-network-based temporal point processes have been widely used to model such asynchronous event sequences. However, many current architectures, including attention-based point processes, struggle with long event sequences due to computational inefficiency. To tackle this challenge, we propose an efficient sparse transformer Hawkes process (STHP) with two components. In the first component, a transformer with a novel temporal sparse self-attention mechanism is applied to event sequences with arbitrary intervals, focusing mainly on short-term dependencies. In the second component, a transformer is applied to the time series of aggregated event counts, primarily to extract long-term periodic dependencies. The two components complement each other and are fused to model the conditional intensity function of a point process for future event forecasting. Experiments on real-world datasets show that the proposed STHP outperforms baselines and achieves a significant improvement in computational efficiency without sacrificing prediction performance on long sequences.
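
The abstract describes the architecture only in prose, so a minimal sketch may help fix ideas. The PyTorch code below is not the authors' implementation: the causal sliding-window mask stands in for the paper's temporal sparse self-attention pattern (whose exact form is not given here), and all module names, dimensions, and the mean-pooled fusion are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def local_attention_mask(seq_len: int, window: int) -> torch.Tensor:
    """Causal sliding-window mask: each event attends only to the
    `window` most recent past events (including itself)."""
    i = torch.arange(seq_len).unsqueeze(1)  # query (current event) index
    j = torch.arange(seq_len).unsqueeze(0)  # key (history event) index
    return (j <= i) & (j > i - window)      # True = attention allowed

class TwoStreamIntensity(nn.Module):
    """Hypothetical two-stream model: a sparsely masked event encoder for
    short-term dependencies, a count-series encoder for long-term ones,
    fused into a positive conditional intensity."""

    def __init__(self, d_model: int = 32, n_heads: int = 4, window: int = 8):
        super().__init__()
        self.window = window
        self.event_proj = nn.Linear(2, d_model)   # (inter-event time, mark) -> d_model
        self.event_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.count_proj = nn.Linear(1, d_model)   # scalar counts -> d_model
        self.count_enc = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.intensity = nn.Linear(2 * d_model, 1)

    def forward(self, events: torch.Tensor, counts: torch.Tensor) -> torch.Tensor:
        # events: (B, L, 2) per-event features; counts: (B, T, 1) aggregated counts
        h = self.event_proj(events)
        blocked = ~local_attention_mask(events.size(1), self.window)  # True = blocked
        h, _ = self.event_attn(h, h, h, attn_mask=blocked)
        g = self.count_enc(self.count_proj(counts)).mean(dim=1)  # long-term summary
        fused = torch.cat([h[:, -1], g], dim=-1)  # last-event state + count context
        return F.softplus(self.intensity(fused))  # softplus keeps lambda > 0

model = TwoStreamIntensity()
lam = model(torch.randn(2, 20, 2), torch.randn(2, 50, 1))  # -> shape (2, 1)
```

Because the window is fixed, the event-stream attention costs O(L·w) rather than O(L²), which is the kind of saving the abstract attributes to sparse attention on long sequences.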



Notes

  1. https://www.kaggle.com/datasets/AnalyzeBoston/crimes-in-boston.
  2. https://www.kaggle.com/datasets/mchirico/montcoalert.
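
For concreteness, here is one hedged way the footnoted event logs could be turned into the two inputs the model consumes: per-event inter-arrival times for the short-term stream and a regularly aggregated count series for the long-term stream. The column name "timestamp" and the daily aggregation are assumptions, not the datasets' actual schema or the authors' preprocessing.

```python
import pandas as pd

def build_inputs(df: pd.DataFrame, freq: str = "D"):
    """Split a timestamped event log into (a) per-event inter-arrival
    times and (b) an aggregated count series. The "timestamp" column
    name is hypothetical; adapt it to the dataset at hand."""
    ts = pd.to_datetime(df["timestamp"]).sort_values()
    inter_arrival = ts.diff().dt.total_seconds().dropna()   # event-level view
    counts = ts.dt.floor(freq).value_counts().sort_index()  # e.g. daily counts
    return inter_arrival.to_numpy(), counts.to_numpy()
```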


Acknowledgement

This work was supported in part by the NSF under Grant No. 1927513, No. 1943486, No. 2147253, and NSF EPSCoR-Louisiana program (No. 1946231).

Author information

Correspondence to Mingxuan Sun.

Ethics declarations

Ethics Statement

This statement discusses the ethical implications of our work in relation to machine learning and data mining. We recognize the importance of ethics in all aspects of our work and are committed to upholding ethical principles in our research and its application. Below, we outline the potential ethical issues that arise from our work and the steps we have taken to mitigate them.

Collection and Processing of Personal Data

All datasets used in our work are publicly available. We have obtained all necessary permissions and followed best practices for data download, processing, and storage to ensure that the privacy of individuals is protected.

Inference of Personal Information

Our work does not involve the inference of personal information from data.

Potential Use in Policing

Our work may have potential applications in policing contexts. We are aware of the potential ethical implications of this and are committed to ensuring that our work is not used in ways that violate human rights or result in harm to individuals. We will carefully consider the potential uses of our work and will take appropriate steps to prevent its misuse.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Li, Z., Sun, M. (2023). Sparse Transformer Hawkes Process for Long Event Sequences. In: Koutra, D., Plant, C., Gomez Rodriguez, M., Baralis, E., Bonchi, F. (eds) Machine Learning and Knowledge Discovery in Databases: Research Track. ECML PKDD 2023. Lecture Notes in Computer Science, vol 14173. Springer, Cham. https://doi.org/10.1007/978-3-031-43424-2_11


  • DOI: https://doi.org/10.1007/978-3-031-43424-2_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-43423-5

  • Online ISBN: 978-3-031-43424-2

  • eBook Packages: Computer Science, Computer Science (R0)
