Sequential network change detection with its applications to ad impact relation analysis

Data Mining and Knowledge Discovery

Abstract

We address the problem of tracking changes in variable dependencies in multivariate time series. Conventionally, this problem has been addressed in the batch scenario, where the whole data set is given at once and change detection is performed retrospectively. This paper addresses it in a sequential scenario, where multivariate data arrive one after another and detection must be performed as the data arrive. We propose a new method for sequential tracking of variable dependencies, in which a Bayesian network represents the dependencies. The key ideas of our method are: (1) we extend the theory of dynamic model selection, which was developed in the batch-learning scenario, to the sequential setting and apply it to this problem; (2) we conduct change detection sequentially using dynamic programming within each window, where we employ Hoeffding's bound to determine the window size automatically. We empirically demonstrate that the proposed method performs change detection more efficiently than a conventional batch method. Further, we give a new framework for applying variable dependency change detection, which we call Ad Impact Relation analysis (AIR). In it, we detect the time point at which a commercial message advertisement has had an impact on the market, and we visualize the impact through network changes. We employ real data sets to demonstrate the validity of AIR.
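The two ingredients named in the abstract, a window size derived from Hoeffding's bound and dynamic programming over a sequence of models, can be illustrated with a minimal Python sketch. This is not the authors' algorithm, whose details the abstract does not give. The function names, the per-block cost matrix, and the constant switching cost are all assumptions made for illustration. The first function applies the standard two-sided Hoeffding inequality for bounded i.i.d. samples; the second is a generic Viterbi-style recursion that picks one model per data block.

```python
import math
from typing import List, Sequence


def hoeffding_window_size(value_range: float, epsilon: float, delta: float) -> int:
    """Smallest n for which the two-sided Hoeffding bound guarantees that the
    mean of n i.i.d. samples, each confined to an interval of width
    value_range, is within epsilon of its expectation with probability
    at least 1 - delta.  (Hypothetical helper, not from the paper.)"""
    if value_range <= 0 or epsilon <= 0 or not 0 < delta < 1:
        raise ValueError("need value_range > 0, epsilon > 0, 0 < delta < 1")
    # Solve 2 * exp(-2 * n * epsilon^2 / value_range^2) <= delta for n.
    n = (value_range ** 2) * math.log(2.0 / delta) / (2.0 * epsilon ** 2)
    return max(1, math.ceil(n))


def best_model_path(costs: Sequence[Sequence[float]], switch_cost: float) -> List[int]:
    """costs[t][k] is an assumed cost (e.g. a code length) of data block t
    under candidate model k.  Returns the model index per block that minimizes
    the summed costs plus switch_cost for every change of model."""
    if not costs:
        return []
    n_models = len(costs[0])
    total = list(costs[0])          # best cost of a path ending in model k
    back: List[List[int]] = []      # back-pointers for path recovery
    for row in costs[1:]:
        best_prev = min(range(n_models), key=total.__getitem__)
        pointers, new_total = [], []
        for k in range(n_models):
            stay = total[k]
            move = total[best_prev] + switch_cost
            if stay <= move:
                pointers.append(k)
                new_total.append(stay + row[k])
            else:
                pointers.append(best_prev)
                new_total.append(move + row[k])
        back.append(pointers)
        total = new_total
    # Trace the cheapest path backwards from the final block.
    k = min(range(n_models), key=total.__getitem__)
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return path


if __name__ == "__main__":
    print(hoeffding_window_size(1.0, 0.1, 0.05))
    print(best_model_path([[1, 5], [1, 5], [5, 1], [5, 1]], switch_cost=2))
```

In this toy setting a change point is any position where consecutive entries of the returned path differ. In a sequential method, one would presumably rerun such a recursion on each new window rather than on the whole series, but how the paper couples the window size to the recursion is not stated in the abstract.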

References

  • Akaike H (1974) A new look at the statistical model identification. IEEE Trans Autom Control 19(6):716–723

  • Cheng J, Bell DA, Liu W (1997) An algorithm for Bayesian belief network construction from data. In: Proceedings of international workshop on artificial intelligence and statistics, pp 83–90

  • Fearnhead P, Liu Z (2007) Online inference for multiple changepoint problems. J R Stat Soc B 69(4):589–605

  • Friedman J, Hastie T, Tibshirani R (2008) Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9(3):432–441

  • Geiger D, Heckerman D (1994) Learning Gaussian networks. Technical report, Microsoft Research, MSR-TR-94-10

  • Grünwald PD (2007) The minimum description length principle. MIT Press, Cambridge

  • Guo F, Hanneke S, Fu W, Xing EP (2007) Recovering temporally rewiring networks: a model-based approach. In: Proceedings of the 24th international conference on machine learning, pp 321–328

  • Hayashi Y, Yamanishi K (2012) Sequential network change detection with its applications to ad impact relation analysis. In: Proceedings of the 12th IEEE international conference on data mining, pp 280–289

  • Hirai S, Yamanishi K (2011) Efficient computation of normalized maximum likelihood coding for Gaussian mixtures with its applications to optimal clustering. In: Proceedings of the 2011 IEEE international symposium on information theory, pp 1031–1035

  • Hirai S, Yamanishi K (2012) Detecting changes of clustering structures using normalized maximum likelihood coding. In: Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining, pp 343–351

  • Hirose S, Yamanishi K, Nakata T, Fujimaki R (2009) Network anomaly detection based on eigen equation compression. In: Proceedings of the 15th ACM SIGKDD conference on knowledge discovery and data mining, pp 1185–1194

  • Hoeffding W (1963) Probability inequalities for sums of bounded random variables. J Am Stat Assoc 58(301):13–30

  • Ide T, Kashima H (2004) Eigenspace-based anomaly detection in computer systems. In: Proceedings of the 10th ACM SIGKDD international conference on knowledge discovery and data mining, pp 440–449

  • Ide T, Lozano AC, Abe N, Liu Y (2009) Proximity-based anomaly detection using sparse structure learning. In: Proceedings of the 2009 SIAM international conference on data mining, pp 97–108

  • Krichevsky RE, Trofimov VK (1981) The performance of universal encoding. IEEE Trans Inf Theory 27(2):199–206

  • Rissanen J (2000) MDL denoising. IEEE Trans Inf Theory 46(7):2537–2543

  • Rissanen J (2007) Information and complexity in statistical modeling. Springer, New York

  • Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6(2):461–464

  • Shtarkov YM (1987) Universal sequential coding of single messages. Transl Probl Inf Transm 23(3):3–17

  • Silander T, Myllymäki P (2006) A simple approach for finding the globally optimal Bayesian network structure. In: Proceedings of the 22nd conference on uncertainty in artificial intelligence, pp 445–452

  • Silander T, Roos T, Kontkanen P, Myllymäki P (2008) Factorized normalized maximum likelihood criterion for learning Bayesian network structures. In: Proceedings of the 4th European workshop on probabilistic graphical models, pp 257–264

  • Talih M, Hengartner N (2005) Structural learning with time-varying components: tracking the cross-section of financial time series. J R Stat Soc B 67(3):321–341

  • Viterbi A (1967) Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Trans Inf Theory 13(2):260–269

  • Xuan X, Murphy K (2007) Modeling changing dependency structure in multivariate time series. In: Proceedings of the 24th international conference on machine learning, pp 1055–1062

  • Yamanishi K, Maruyama Y (2005) Dynamic syslog mining for network failure monitoring. In: Proceedings of the 11th ACM SIGKDD international conference on knowledge discovery and data mining, pp 499–508

  • Yamanishi K, Maruyama Y (2007) Dynamic model selection with its applications to novelty detection. IEEE Trans Inf Theory 53(6):2180–2189

Acknowledgments

This work was partially supported by MEXT KAKENHI 23240019, Aihara Project, the FIRST program from JSPS, initiated by CSTP, Hakuhodo Corporation, NTT Corporation, and Microsoft Corporation (CORE6 Project). In particular, we thank HAKUHODO Inc. for providing the data sets for AIR and for valuable comments.

Corresponding author

Correspondence to Kenji Yamanishi.

Additional information

Responsible editor: Eamonn Keogh.

An extended abstract appeared in Proceedings of the 12th IEEE International Conference on Data Mining (Hayashi and Yamanishi 2012).

About this article

Cite this article

Hayashi, Y., Yamanishi, K. Sequential network change detection with its applications to ad impact relation analysis. Data Min Knowl Disc 29, 137–167 (2015). https://doi.org/10.1007/s10618-013-0338-6

  • DOI: https://doi.org/10.1007/s10618-013-0338-6