Statistical test of VEP waveform equality

Documenta Ophthalmologica

Abstract

The aim of the study was to describe a theory and method for inferring the statistical significance of a visually evoked cortical potential (VEP) recording. The statistical evaluation is predicated on three elements: the pre-stimulus VEP as an estimate of the cortical potentials expected when the stimulus does not produce an effect, a mathematical transform that converts the voltages into standard deviations from zero, and a time-series approach for estimating the variability of between-session VEPs under the null hypothesis. Empirical and Monte Carlo analyses address testability, statistical validity, clinical feasibility, and the limitations of the proposed method. We conclude that visual electrophysiological recordings can be evaluated as a statistical study of n = 1 subject using time-series analysis when confounding effects are adequately controlled. The statistical test can be performed on either a single VEP or the difference between pairs of VEPs.

References

  1. Odom JV, Bach M, Barber C, Brigell M, Marmor MF, Tormene AP, Holder GE, Vaegan (2004) Visual evoked potentials standard. Doc Ophthalmol 108:115–123

  2. Lv J, Simpson DM, Bell SL (2007) Objective detection of evoked potentials using a bootstrap technique. Med Eng Phys 29:191–198

  3. Wright T, Nilsson J, Gerth C, Westall C (2008) A comparison of signal detection techniques in the multifocal electroretinogram. Doc Ophthalmol 117(2):163–170

  4. Rodarte C, Hood DC, Yang EB, Grippo T, Greenstein VC, Liebmann JM, Ritch R (2008) The effects of glaucoma on the latency of the multifocal visual evoked potential. Br J Ophthalmol 90:1132–1136

  5. Bach M, Mauer JP, Wolf ME (2008) Visual evoked potential-based acuity assessment in normal vision, artificially degraded vision, and in patients. Br J Ophthalmol 92:396–403

  6. Boon MY, Henry BI, Suttle CM, Dain SJ (2008) The correlation dimension: a useful objective measure of transient visual evoked potential? J Vis 8(1):1–21

  7. Achim A (1995) Signal detection in averaged evoked potentials: Monte Carlo comparison of the sensitivity of different methods. Electroencephalogr Clin Neurophysiol 96:574–584

  8. Wasserman S, Bockenholt U (1989) Bootstrapping: applications to psychophysiology. Psychophysiology 26(2):208–221

  9. Efron B (1982) The jackknife, the bootstrap and other resampling plans. Capital City Press, Montpelier

  10. Srebro R (1996) A bootstrap method to compare the shapes of two scalp fields. Electroencephalogr Clin Neurophysiol 100:25–32

  11. Blair RC, Karniski W (1993) An alternative method for significance testing of waveform difference potentials. Psychophysiology 30:518–524

  12. Fabiani M, Gratton G, Corballis PM, Cheng J, Friedman D (1998) Bootstrap assessment of the reliability of maxima in surface maps of brain activity of individual subjects derived with electrophysiological and optical methods. Behav Res Methods Instrum Comput 30(1):78–86

  13. Di Nocera F, Ferlazzo F (2000) Resampling approach to statistical inference: bootstrapping from event-related potentials data. Behav Res Methods Instrum Comput 32:111–119

  14. Ramaekers JG, Uiterwijk MC, O’Hanlon JF (1992) Effects of loratadine and cetirizine on actual driving and psychometric test performance, and EEG during driving. Eur J Clin Pharmacol 42:363–369

  15. Brockwell PJ, Davis RA (2002) Introduction to time series and forecasting, Second Edition. Springer-Verlag, New York

  16. Guthrie D, Buchwald JS (1991) Significance testing of difference potentials. Psychophysiology 28:240–244

  17. Grubbs FE (1969) Procedures for detecting outlying observations in samples. Technometrics 11(1):1–21

  18. Delorme A, Rousselet GA, Mace MJ, Fabre-Thorpe M (2004) Interaction of top-down and bottom-up processing in the fast visual analysis of natural scenes. Cogn Brain Res 19:103–113

  19. Deeks JJ, Altman DG (2004) Diagnostic tests 4: likelihood ratios. Br Med J 329:168–169

  20. Edgington ES (1967) Statistical inference from N = 1 experiments. J Psychol 65:195–199

  21. Westfall PH, Young SS (1993) Resampling-based multiple testing. Wiley, New York

  22. Davison AC, Hinkley DV (1999) Bootstrap methods and their application. Cambridge series in statistical and probabilistic mathematics. Cambridge University Press, Cambridge

Acknowledgments

The authors express their gratitude to Professor Dr. Richard Srebro for posing the challenge of an n = 1 statistical method, to Professor Peter Westfall, who provided guidance during the early stages of this work, particularly in the understanding of statistical independence, resampling, and multiplicity corrections, and to Professor Joe Harrison, who provided valuable feedback on earlier drafts of this paper. The authors are also grateful to Delorme, Rousselet, Mace, and Fabre-Thorpe (2004) for making their raw data accessible over the internet.

Software availability

To facilitate its use and encourage development of clinical applications, the computer software for the proposed method is being made available to individual scientists and clinicians. E-mail requests to rocky.young@ttuhsc.edu. Commercial use or distribution of the software, however, is strictly prohibited without written permission from the first author. The program was written in version 7.6 of Matlab® using functions from the Statistics and the Signal Processing Toolboxes.

Author information

Corresponding author

Correspondence to Rockefeller S. L. Young.

Appendices

Appendix 1: Statistic

The proposed statistic is similar to that for identifying outlying observations in a sample [17]. The statistic has the form,

$$ T_{l} = \frac{y_{l} - \bar{y}}{s} \tag{1} $$

where \( y_{l} \) represents the lth voltage element in the post-stimulus period. The parameters \( \bar{y} \) and s represent the ordinary mean and standard deviation of the pre-stimulus voltages, respectively. In brief, T represents the statistical transformation of the post-stimulus voltages y (Fig. 1, middle). The y voltages are converted into units of statistical deviation from the mean pre-stimulus voltage \( \bar{y} \). The scaling factor 1/s standardizes the deviations so that the magnitude of T is always comparable regardless of the absolute voltage values.
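As an illustration of Eq. 1, the following Python sketch standardizes post-stimulus voltages against the pre-stimulus baseline. The function name `t_transform` and the sample voltages are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text specifies only the "ordinary" standard deviation.

```python
import statistics


def t_transform(pre_stimulus, post_stimulus):
    """Express each post-stimulus voltage as a deviation from the
    pre-stimulus mean, in units of the pre-stimulus standard deviation."""
    y_bar = statistics.mean(pre_stimulus)   # mean of the pre-stimulus voltages
    s = statistics.stdev(pre_stimulus)      # sample SD (n - 1), an assumption
    if s == 0:
        raise ValueError("pre-stimulus voltages have zero variance")
    return [(y - y_bar) / s for y in post_stimulus]


# Hypothetical voltages, for illustration only.
pre = [1.0, 2.0, 3.0, 4.0, 5.0]
post = [3.0, 6.0, 0.0]
T = t_transform(pre, post)
```

A post-stimulus voltage equal to the pre-stimulus mean maps to T = 0, and voltages equally far above and below the mean map to values of equal magnitude and opposite sign.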

Appendix 2: Description of two time-series methods

1. ARMA approach: Serially collected observations can be modeled mathematically using the classic autoregressive moving average or ARMA approach [15]. In VEP studies, the approach would assume that the spontaneous voltage fluctuations are determined not only by random perturbations occurring at each point in time but also by their effects that are carried forward from prior times. Mathematically, this can be described by,

$$ Y_{l} = \varepsilon_{l} + \sum_{i = 1}^{p} \varphi_{i} Y_{l - i} + \sum_{j = 1}^{q} \theta_{j} \varepsilon_{l - j} \tag{2} $$

where \( Y_{l} \) is the voltage fluctuation and \( \varepsilon_{l} \) is the random perturbation at latency l. The expressions \( \sum \varphi_{i} Y_{l - i} \) and \( \sum \theta_{j} \varepsilon_{l - j} \) are the integrated temporal persistence of previous voltages (the autoregressive part) and of previous random perturbations (the moving average part), respectively. The subscripts i = 1, 2, …, p and j = 1, 2, …, q refer to the number of previous observations in the series that have influence on the present observation. The parameters φ and θ describe the time-dependent statistical properties of the pre-stimulus voltages.

To derive the ARMA model from a pre-stimulus VEP, one has to estimate the model order (p, q) as well as the parameter values (ε, φ, θ) of Eq. 2. The method for estimating these is described in numerous textbooks [15]. In brief, the method uses an iterative process that is tedious and time consuming. Commercial software such as the professional version of Brockwell & Davis ITSM® reduces the work but does not eliminate the complex decisions required by the task, particularly for users with little or no experience. The criteria for accepting the model order and the parameter values are also discussed in textbooks. Briefly, the criteria are based on diagnostic (time-series) tests that validate the randomness and normality properties of the derived ε values.

After the parameters of the best ARMA model are derived, pseudo VEPs can be created using commercial software such as Matlab®. In the present study, we generated the pseudo VEPs by entering the estimated values for φ, θ, and ε into the Matlab® filter function. A different set of ε was used to create each instance of the pseudo VEP. Each set was a bootstrap sample of the original pool of ε values.
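The generation of one pseudo VEP can be sketched in Python as follows. This is a minimal illustration of the recursion in Eq. 2 driven by bootstrap-resampled residuals, not the authors' Matlab® code. The function names, the zero initial conditions, and the example coefficients are assumptions. With zero initial conditions, the recursion should give the same result as a digital filter with numerator coefficients [1, θ1, …, θq] and denominator coefficients [1, −φ1, …, −φp].

```python
import random


def arma_filter(phi, theta, eps):
    """Apply Eq. 2 recursively:
    Y[l] = eps[l] + sum_i phi_i * Y[l-i] + sum_j theta_j * eps[l-j].
    Values before the start of the series are treated as zero (an assumption)."""
    y = []
    for l in range(len(eps)):
        ar = sum(phi[i] * y[l - 1 - i]
                 for i in range(len(phi)) if l - 1 - i >= 0)
        ma = sum(theta[j] * eps[l - 1 - j]
                 for j in range(len(theta)) if l - 1 - j >= 0)
        y.append(eps[l] + ar + ma)
    return y


def pseudo_vep(phi, theta, residuals, n_points, rng):
    """Draw a bootstrap sample (with replacement) from the fitted residuals
    and pass it through the ARMA recursion to create one pseudo VEP."""
    eps_star = rng.choices(residuals, k=n_points)
    return arma_filter(phi, theta, eps_star)


# Hypothetical ARMA(1, 1) coefficients and residual pool, for illustration only.
rng = random.Random(1)
residual_pool = [0.4, -0.9, 0.1, 1.3, -0.5, -0.2, 0.7]
example = pseudo_vep([0.5], [0.3], residual_pool, n_points=20, rng=rng)
```

Calling `pseudo_vep` repeatedly with the same generator yields a different bootstrap sample of ε, and therefore a different pseudo VEP, on each call.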

2. Phase-scrambling approach: As an alternative to ARMA, which is a time-domain approach, we propose using phase-scrambling, which is a frequency-domain approach [22]. The method decomposes the pre-stimulus VEP into its amplitude-frequency spectrum and phase relationships. Then, the phase values are “scrambled” to generate a new pseudo VEP. The term “scramble” refers to the random rearrangement in the sequence of the phase values. The original amplitude-frequency spectrum is retained. The algorithm is as follows:

Step 1: Compute the absolute magnitude M and the phase P of the pre-stimulus VEP voltages from the fast Fourier transform of the sequence. M is the magnitude as a function of the frequency components (i.e., the spectral function). P is the corresponding set of phase values, which range from −π to +π.

Step 2: Create a new sequence of phase values P* by scrambling the original sequence.

Step 3: Transform the frequency-domain data back into the time-domain by applying the inverse Fourier transform to the original magnitude M sequence and the scrambled P* sequence. The outcome of steps 1–3 is a pseudo VEP waveform.
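Steps 1–3 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the authors' implementation. A plain discrete Fourier transform is written out so that the example needs only the standard library; a practical implementation would call an FFT routine. The sketch also assumes that only the positive-frequency phases are shuffled and that each is mirrored with opposite sign onto its negative-frequency partner, which keeps the pseudo VEP real-valued. The source text does not specify this detail, and the function names are hypothetical.

```python
import cmath
import random


def dft(x):
    """Discrete Fourier transform (O(n^2), adequate for short sequences)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    """Inverse discrete Fourier transform."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def phase_scramble(pre_stimulus, rng):
    n = len(pre_stimulus)
    spectrum = dft(pre_stimulus)
    magnitude = [abs(c) for c in spectrum]        # Step 1: M
    phase = [cmath.phase(c) for c in spectrum]    # Step 1: P, within [-pi, pi]

    # Step 2: shuffle the phases of the positive-frequency bins only.
    half = (n - 1) // 2
    shuffled = phase[1:half + 1]
    rng.shuffle(shuffled)
    new_phase = phase[:]     # DC (and Nyquist, if n is even) stay unchanged
    for k, p in enumerate(shuffled, start=1):
        new_phase[k] = p
        new_phase[n - k] = -p    # mirror bin gets the opposite sign

    # Step 3: recombine M with P* and transform back to the time domain.
    scrambled = [cmath.rect(m, p) for m, p in zip(magnitude, new_phase)]
    return [v.real for v in idft(scrambled)]


# Hypothetical pre-stimulus voltages, for illustration only.
pre = [0.5, -1.2, 2.0, 0.3, -0.7, 1.1, 0.0, -0.4]
pseudo = phase_scramble(pre, random.Random(0))
```

The pseudo VEP has the same amplitude-frequency spectrum as the original pre-stimulus waveform, which can be checked by comparing the magnitudes of `dft(pre)` and `dft(pseudo)`.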

Appendix 3: Derivation of the probability p

The algorithm for deriving the p values is based on multiple testing and multiple comparison methods used in the field of statistics [21].

Step 1: Transform the VEP voltages into T values by applying Eq. 1 to the pre-stimulus and post-stimulus voltages (Fig. 3).

Step 2: Transform the pseudo VEP values by also applying Eq. 1. Hereafter, the transformed pseudo VEP values will be designated as t* to distinguish them from T, the transformed VEP values. The t* values are illustrated by the multiple waveforms in the bottom row of Fig. 3.

Step 3: Collapse the t* values across their time dimension by selecting the maximum of the absolute value of t*. That is,

$$ t_{\max }^{*} = {\text{MAX}}\left( \left| t^{*} \right| \right) \tag{3} $$

where MAX and | | represent the maximum and absolute value operators, respectively. Remember that t* is a time sequence of values even though our notation does not make explicit the value at each time. The absolute value operator is used only when a two-sided test of significance is desired.

Step 4: Repeat steps 2–3 a large number of times to acquire a distribution of \( t_{\max }^{*} \) values. To minimize simulation errors, we recommend repeating 10,000 or more times.

Step 5: Finally, derive the p value from T and the distribution of \( t_{\max }^{*} \). The algorithm for the derivation has the form,

$$ p_{l} = \frac{\# \left\{ T_{l} \le t_{\max }^{*} \right\}}{\text{nSim}} \tag{4} $$

where l is the post-stimulus latency, the numerator is the number of simulations in which the value of \( T_{l} \) is equal to or less than \( t_{\max }^{*} \), and nSim is the number of repeated simulations.
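Steps 3–5 can be sketched in Python as follows. The sketch assumes the two-sided form of the test, so the absolute value of each \( T_{l} \) is compared with the distribution of maxima; that choice, the function names, and the numerical values are assumptions made for illustration.

```python
def max_abs(t_star):
    """Eq. 3: collapse one transformed pseudo VEP to its largest absolute value."""
    return max(abs(v) for v in t_star)


def p_values(T, t_star_max):
    """Eq. 4 (two-sided form, an assumption): for each latency l, the fraction
    of simulations whose maximum statistic is at least as large as |T_l|."""
    n_sim = len(t_star_max)
    return [sum(1 for m in t_star_max if abs(t) <= m) / n_sim for t in T]


# Hypothetical transformed pseudo VEPs (one list per simulation); a real
# analysis would use 10,000 or more simulations.
pseudo_t = [
    [0.2, -1.0, 0.5],
    [2.0, 0.1, -0.3],
    [-3.0, 1.5, 0.4],
    [0.9, -4.0, 2.2],
]
t_star_max = [max_abs(t) for t in pseudo_t]   # [1.0, 2.0, 3.0, 4.0]

T = [0.5, -2.5, 5.0]                          # hypothetical transformed VEP values
p = p_values(T, t_star_max)                   # [1.0, 0.5, 0.0]
```

Because every latency is compared with the same distribution of maxima, a single threshold applies across the whole post-stimulus period, which is how the max-statistic approach handles multiple comparisons.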

Two points are worth clarifying. First, the p value associated with T should be derived as described above (Eq. 4) rather than from the probability tables found in textbooks. This method takes into consideration the time-dependent relationship that each voltage value has with other voltages in the subject’s own VEP. The probability values listed in textbook tables assume that the observations are purely random, which is not the case in most electrophysiological recordings. Second, while significance testing can be performed using the voltages as data, the benefit of transforming the voltages into T and t* is the assurance that the distributional properties of the data for the actual VEP are comparable to those for the pseudo VEP (Fig. 3). Specifically, the proposed statistical transformation ensures that the data distributions have comparable “scale” and “center”.

Appendix 4: Standard error of a mean time-series waveform

The theoretical formula for the standard error of a mean waveform, such as a VEP, is

$$ S_{sem}^{2} = \frac{\sigma^{2}}{N} W \tag{5} $$

where \( \sigma^{2} \) is the (waveform) voltage variance, N is the number of observations in the time series, and W is a weighting function that takes into consideration the serial dependency among the observations in the time series [15, pp. 58–59]. W is a function of the autocorrelation coefficients ρ(h), i.e.,

$$ W = \sum\limits_{h = - (N - 1)}^{N - 1} \left( 1 - \frac{\left| h \right|}{N} \right) \rho(h) \tag{6} $$

where h is the lag, the position index in the data sequence. In practice, this formula has limited use because N must be very large in order to ensure computational accuracy.
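To illustrate how serial dependence inflates the standard error, the following Python sketch evaluates Eqs. 5 and 6 for a known autocorrelation function. It assumes that N is the number of observations in the series and that ρ is symmetric, ρ(−h) = ρ(h). The function names and the first-order autoregressive example with ρ(h) = 0.5^|h| are hypothetical choices made for illustration.

```python
import math


def weight_w(rho, n):
    """Eq. 6, using the symmetry rho(-h) = rho(h) to sum over h >= 0 only."""
    return rho(0) + 2.0 * sum((1.0 - h / n) * rho(h) for h in range(1, n))


def standard_error(sigma2, n, rho):
    """Square root of Eq. 5: sqrt(sigma^2 / N * W)."""
    return math.sqrt(sigma2 / n * weight_w(rho, n))


def white_noise(h):
    """Autocorrelation of serially independent observations."""
    return 1.0 if h == 0 else 0.0


def ar1(h):
    """Autocorrelation of a hypothetical AR(1) process with coefficient 0.5."""
    return 0.5 ** abs(h)


n = 100
w_white = weight_w(white_noise, n)   # 1.0, so Eq. 5 reduces to sigma^2 / N
w_ar1 = weight_w(ar1, n)             # close to 2.96
```

For independent observations W equals 1 and Eq. 5 reduces to the familiar σ²/N. In the AR(1) example the variance of the mean is nearly three times larger than the independent-observation formula would suggest.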

Cite this article

Young, R.S.L., Kimura, E. Statistical test of VEP waveform equality. Doc Ophthalmol 120, 121–135 (2010). https://doi.org/10.1007/s10633-009-9207-4
