Construction of Histogram with Variable Bin-Width Based on Change Point Detection

  • Takayasu Fushimi
  • Kiyoto Iwasaki
  • Seiya Okubo
  • Kazumi Saito
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11828)

Abstract

Given a set of samples with one numeric variable and a set of nominal variables, we address the problem of constructing a histogram of K bins with variable widths. The histogram should use relatively many narrow bins in ranges where the numeric values are densely distributed and change substantially, and a few wide bins elsewhere, and each bin should be annotated with its characteristic nominal values. To this end, we propose a new method that combines change point detection on the numeric values, based on an L1 or L2 error criterion, with identification of annotation terms for the bins, based on the z-score with respect to the distribution of nominal values. In experiments using four datasets of humidity deficit (HD) collected from vinyl greenhouses, we show that (1) the proposed method constructs more natural histograms, with appropriate variable bin widths, than the equal-width histograms produced by the standard method based on the square-root choice or Sturges’ formula; (2) histograms constructed with the L1 error criterion have more desirable properties than those constructed with the L2 error criterion; and (3) the method produces a series of naturally interpretable annotation terms for the constructed bins.
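The abstract does not spell out the algorithm, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes the sorted numeric values are approximated by a K-step piecewise-constant function, with each bin summarized by its median (L1 error) or mean (L2 error), and finds the bin boundaries by brute-force dynamic programming. The function names `sqrt_choice`, `sturges`, `segment_cost`, and `variable_width_bins` are hypothetical; the two equal-width baselines use the standard formulas k = ceil(sqrt(n)) and k = ceil(log2(n)) + 1.

```python
import math
from statistics import mean, median


def sqrt_choice(n):
    """Number of equal-width bins under the square-root choice."""
    return math.ceil(math.sqrt(n))


def sturges(n):
    """Number of equal-width bins under Sturges' formula."""
    return math.ceil(math.log2(n)) + 1


def segment_cost(values, criterion):
    """Error of approximating `values` by a single constant.

    L2 uses the mean (minimizes squared error); L1 uses the median
    (minimizes absolute error).
    """
    if criterion == "L2":
        center = mean(values)
        return sum((v - center) ** 2 for v in values)
    center = median(values)
    return sum(abs(v - center) for v in values)


def variable_width_bins(samples, k, criterion="L1"):
    """Split the sorted samples into k contiguous bins of minimum total error.

    Assumption (not taken from the paper): an exhaustive dynamic program
    over all boundary positions, which is only practical for small inputs.
    Returns a list of (lowest value, highest value, count), one per bin.
    """
    xs = sorted(samples)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of samples")
    inf = float("inf")
    # best[b][j]: minimum error of covering xs[:j] with b bins
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for b in range(1, k + 1):
        for j in range(b, n + 1):
            for i in range(b - 1, j):
                cost = best[b - 1][i] + segment_cost(xs[i:j], criterion)
                if cost < best[b][j]:
                    best[b][j] = cost
                    back[b][j] = i
    # Walk the back-pointers from the last bin to the first.
    bins = []
    j = n
    for b in range(k, 0, -1):
        i = back[b][j]
        bins.append((xs[i], xs[j - 1], j - i))
        j = i
    return bins[::-1]


# Toy data (made up for illustration): three dense clusters.
toy = [1.0, 1.1, 1.2, 5.0, 5.1, 9.0, 9.5, 10.0]
toy_bins_l1 = variable_width_bins(toy, 3, "L1")
toy_bins_l2 = variable_width_bins(toy, 3, "L2")
```

On the toy data both criteria recover the three clusters as bins. The two criteria can differ on skewed data, because the median is less sensitive to outliers than the mean.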

Keywords

Histogram · Change point detection · Variable bin-width · Visualization

Notes

Acknowledgments

This material is based upon work supported by JSPS Grant-in-Aid for Scientific Research (C) (No. 18K11441), (B) (No. 17H01826) and Early-Career Scientists (No. 19K20417).

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Takayasu Fushimi (1, corresponding author)
  • Kiyoto Iwasaki (2)
  • Seiya Okubo (3)
  • Kazumi Saito (4, 5)

  1. School of Computer Science, Tokyo University of Technology, Hachioji, Japan
  2. Industrial Research Institute of Shizuoka Prefecture, Shizuoka, Japan
  3. School of Management and Information, University of Shizuoka, Shizuoka, Japan
  4. Faculty of Science, Kanagawa University, Hiratsuka, Japan
  5. Center for Advanced Intelligence Project, RIKEN, Tokyo, Japan
