Discovering focal regions of slightly-aggregated sparse signals

Chen, Shu-Chun; Fushing, Hsieh; Hwang, Chii-Ruey

doi:10.1007/s00180-013-0407-8

Original Paper
Published: 02 March 2013

Volume 28, pages 2295–2308 (2013)
Computational Statistics

Shu-Chun Chen¹,
Hsieh Fushing² &
Chii-Ruey Hwang¹

Abstract

The characteristic dynamic distortions that slightly-aggregated sparse signals induce in a lengthy time series of i.i.d. pure noise are summarized by a significantly shorter process: the recurrence times of a chosen extreme event. We first employ the Kolmogorov–Smirnov statistic to compare the empirical recurrence time distribution with the null geometric distribution that holds when no signal is present in the original time series. The power of this hypothesis test depends on the degree of aggregation of the sparse signals, which ranges from a completely random distribution of singletons to batches of various sizes over the entire temporal span. We demonstrate that the Kolmogorov–Smirnov statistic captures the dynamic distortions due to slightly-aggregated sparse signals better than Tukey’s Higher Criticism statistic does, even when the batch size is as small as five. Second, after confirming the presence of signals in the pure noise time series, we apply the hierarchical factor segmentation (HFS) algorithm, again based on the recurrence time process, to compute focal segments that contain a significantly higher intensity of signals than the remaining temporal regions. In a computer experiment with a fixed number of signals, the focal segments identified by the HFS algorithm have a signal intensity many times higher than that of the remaining regions, and this ratio also depends critically on the degree of aggregation of the sparse signals. This ratio information can facilitate better sensitivity, equivalent to a smaller false discovery rate, if the signal-discovering protocol implemented within the computed focal regions differs from that used outside them. We also numerically compute the specificity as the total number of signals contained in the computed collection of focal regions, which indicates the inherent difficulty of sparse signal discovery.
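
The first step described in the abstract, comparing recurrence times of an extreme event against a geometric null, might be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the noise is taken to be standard normal, a signal is modeled as a mean shift `mu` on aligned blocks of `batch` consecutive points, the extreme event is an exceedance of the upper `p`-quantile of the noise, and the function names (`simulate`, `recurrence_times`, `ks_distance_to_geometric`) are hypothetical. The HFS segmentation and the Higher Criticism comparison are not shown.

```python
import random
from statistics import NormalDist


def simulate(n, n_signals, batch, mu, rng):
    """i.i.d. N(0,1) noise with a mean shift `mu` on non-overlapping batches.

    Assumed signal model (not taken from the paper): `n_signals // batch`
    aligned blocks of length `batch` are chosen at random; batch=1 gives
    completely random singletons.
    """
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    blocks = rng.sample(range(n // batch), n_signals // batch)
    for b in blocks:
        for j in range(b * batch, (b + 1) * batch):
            x[j] += mu
    return x


def recurrence_times(series, threshold):
    """Gaps between successive exceedances of `threshold` (the extreme event)."""
    hits = [i for i, v in enumerate(series) if v > threshold]
    return [b - a for a, b in zip(hits, hits[1:])]


def ks_distance_to_geometric(gaps, p):
    """Sup distance between the empirical CDF of `gaps` and the
    Geometric(p) CDF F(k) = 1 - (1 - p)**k on k = 1, 2, ...

    Both CDFs are step functions on the integers, so it suffices to check
    each observed value k just before its jump (at k - 1) and at k itself.
    """
    n = len(gaps)
    counts = {}
    for g in gaps:
        counts[g] = counts.get(g, 0) + 1
    d, cum = 0.0, 0
    for k in sorted(counts):
        d = max(d, abs(cum / n - (1.0 - (1.0 - p) ** (k - 1))))
        cum += counts[k]
        d = max(d, abs(cum / n - (1.0 - (1.0 - p) ** k)))
    return d


if __name__ == "__main__":
    rng = random.Random(0)
    p = 0.05                                   # event probability under pure noise
    threshold = NormalDist().inv_cdf(1.0 - p)  # upper p-quantile of N(0,1)
    for batch in (1, 5):                       # singletons vs. batches of five
        x = simulate(50_000, 500, batch, 2.0, rng)
        gaps = recurrence_times(x, threshold)
        print(batch, len(gaps), ks_distance_to_geometric(gaps, p))
```

Under pure noise each point exceeds the threshold independently with probability `p`, so the gaps between exceedances follow a geometric distribution with success probability `p`. Aggregated signals produce runs of short gaps, which is the kind of distortion the distance above is meant to pick up. The sketch reports the raw distance only and makes no significance claim.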

Author information

Authors and Affiliations

Institute of Mathematics, Academia Sinica, Taipei, Taiwan
Shu-Chun Chen & Chii-Ruey Hwang
Department of Statistics, University of California, Mathematical Sciences Building, Davis, CA, 95616, USA
Hsieh Fushing

Corresponding author

Correspondence to Hsieh Fushing.

Additional information

This research is supported in part by the NSF under Grant DMS 1007219 (co-funded by Cyber-enabled Discovery and Innovation (CDI) program).

About this article

Cite this article

Chen, SC., Fushing, H. & Hwang, CR. Discovering focal regions of slightly-aggregated sparse signals. Comput Stat 28, 2295–2308 (2013). https://doi.org/10.1007/s00180-013-0407-8

Received: 14 July 2011
Accepted: 02 February 2013
Published: 02 March 2013
Issue Date: October 2013
DOI: https://doi.org/10.1007/s00180-013-0407-8
