Introduction

There has been significant recent interest in demonstrating differences in brain structure and function associated with low or moderate levels of alcohol consumption. Yet, the findings have been inconsistent. Whereas some investigators (Immonen et al. 2020; Daviet et al. 2022) have reported lower-than-expected brain tissue volumes among adults who drink at a low or moderate level, other investigators (Sachdev et al. 2008; Gu et al. 2014) find a difference in the opposite direction. In the brain function realm, there have likewise been a significant number of unexpected or unreplicated findings attributed to the effects of moderate alcohol exposure (Stampfer et al. 2005; Zhang et al. 2020).

In this confusing context of both negative and positive effects of moderate drinking among adults, one can suspect that the effects found among adolescents are equally inconsistent and confusing. Given their shorter drinking histories and limited access to alcohol, a robust and persistent effect of alcohol use, either negative or positive, seems unlikely. Indeed, recent reviews of the literature (Silveri et al. 2016; Carbia et al. 2018) find that young drinkers and control groups do not consistently differ on tests of working memory, visuospatial processing, and global cognitive function.

A few group differences are more reliably detected. They are most typically but not exclusively evident on cognitive tests that challenge the frontal brain and executive cognitive function. Also, they reach statistical significance in measurements of frontal beta electroencephalographic activity (Bauer and Hesselbrock 1993; Chabot et al. 2005; Porjesz and Rangaswamy 2007), P300 event-related electroencephalographic potentials (Almeida-Antunes et al. 2021), prefrontal cortex volume (De Bellis et al. 2005), and blood oxygen level dependent (BOLD) responses in several brain regions. Across most studies, including the cognitive and neuroimaging studies reviewed by Silveri and colleagues (Silveri et al. 2016), adolescent drinkers rarely differ from the non-drinking control group by more than one quarter of a standard deviation—a 90% overlap of the groups (Grice and Barrett 2014).
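The overlap figure can be checked with a short calculation. The sketch below assumes two normal distributions with equal variance and uses the overlapping coefficient, 2Φ(−d/2), as the definition of overlap; whether Grice and Barrett (2014) define overlap in exactly this way is an assumption, and the function names are illustrative.

```python
import math

def normal_cdf(z: float) -> float:
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def overlap_coefficient(d: float) -> float:
    """Shared area of two equal-variance normal curves whose means differ by d SDs."""
    return 2.0 * normal_cdf(-abs(d) / 2.0)

# A quarter of a standard deviation leaves roughly 90% of the two curves shared.
print(f"{overlap_coefficient(0.25):.2f}")
```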

There are several factors that may explain these weak effects. One factor is the inconsistent recognition by researchers that minor decrements in neural and cognitive function among adolescents who consume alcohol may be related to genetic (Prom-Wormley et al. 2017), familial (Cservenka 2016), or personality (Winters et al. 2021) risk factors that predate and promote early-onset use and misuse. The presence or absence of a statistical effect of alcohol exposure may therefore be determined by the relative proportion of members in the affected and unaffected groups with a family history of psychopathology or a personal history of conduct disorder or attention deficit hyperactivity disorder. Evidence in support of this assertion can be found in studies of the P300 event-related potential wherein an amplitude reduction has been found among the biological offspring of alcohol-dependent parents (Elmasian et al. 1982; Begleiter et al. 1984). Subsequent studies have indicated that the amplitude reduction is most robustly detected among children who personally exhibit externalizing features (Bauer and Hesselbrock 2001; Iacono et al. 2003; Singh and Basu 2009). The contributions of other comorbidities, such as a personal (Houston et al. 2004) or family history (Zhang et al. 2007) of depression, have not been systematically explored, which is surprising because externalizing and internalizing features are inter-correlated (Kim-Cohen et al. 2005; Willner et al. 2016; Caspi and Moffitt 2018).

Another factor contributing to weak statistical effects is the questionable sensitivity and validity of the conventional measurement approach. The issue can be understood with an analogy from electronics and an example wherein the goal is to detect and diagnose a complex (neural) circuit that is subtly disrupted and in the early stages of failure. In this example, the best indicator of the problem is not a sustained drop in the output voltage at the terminus. It is the appearance of dips, swells, and/or transients in voltage. Because these disruptions are brief, vary in phase or polarity, and may all occur in the same circuit at different points in time [similar to the variability seen in idiopathic generalized epilepsy (Seneviratne et al. 2012)], they cannot be detected by averaging the circuit’s output over time. They are more sensitively and validly indicated by the variability in the output—e.g., its standard deviation.
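The circuit analogy can be illustrated with a toy simulation. The sketch below is not drawn from the study's data: the noise level, transient size, and transient probability are arbitrary assumptions chosen only to show that brief disruptions of varying polarity cancel in the time average while inflating the standard deviation.

```python
import random
import statistics

def simulate_output(n_samples, transient_prob, rng):
    """Voltage samples: small Gaussian noise plus occasional +/-5 V transients."""
    samples = []
    for _ in range(n_samples):
        v = rng.gauss(0.0, 0.5)
        if rng.random() < transient_prob:
            # Polarity varies, so transients tend to cancel in the average.
            v += rng.choice([-5.0, 5.0])
        samples.append(v)
    return samples

rng = random.Random(0)  # fixed seed so the sketch is repeatable
healthy = simulate_output(5000, 0.0, rng)
failing = simulate_output(5000, 0.05, rng)

mean_gap = abs(statistics.mean(failing) - statistics.mean(healthy))
sd_gap = statistics.stdev(failing) - statistics.stdev(healthy)
print(f"difference in means: {mean_gap:.3f}")
print(f"difference in SDs:   {sd_gap:.3f}")
```

Under these assumptions the two circuits have nearly identical mean output, yet the failing circuit's standard deviation is clearly larger.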

Findings from our laboratory have repeatedly demonstrated the value of this alternative approach. In prior studies, we have measured variability over time in brain activation and, specifically, inter-trial variability (ITV) in P300 event-related electroencephalographic responses (ERP) (Bauer 2021). Among adults, we have shown that P300 ITV is more sensitive to group differences than mean P300 amplitude. The ITV approach differentiates groups defined by the absence versus presence of externalizing disorders (Bauer 2018b, 2021; Bauer and Covault 2020), human immunodeficiency virus infection (Bauer 2018a, b), obesity (Bauer 2018b), or opioid dependence (under review). Among children, it modestly outperforms mean P300 amplitude in resolving effects of childhood conduct disorder and borderline personality disorder features (Bauer 2020). It demonstrates very good test–retest reliability (1-year intra-class correlation coefficient = 0.7, unpublished data) and is positively correlated with various task performance measures, including reaction time and reaction time variability (Bauer and Covault 2020).

The present investigation was built upon this premise of a subtle and sporadic alteration in brain function among young, recent drinkers of whom most report no resulting problems. Its three major aims were: (1) to demonstrate greater ITV in P300 amplitude among 14–19-year-old participants reporting any level of exposure to alcohol during the preceding 6 months in comparison to age-equivalent peers reporting no use; (2) to demonstrate a positive correlation between P300 ITV and familial risk factors preceding and often confounded with adolescent alcohol use; (3) to demonstrate no significant correlation of the ITV of P300 amplitude with levels of alcohol, cannabis, and nicotine use and thereby discount a pharmacological explanation as the proximate cause of the group difference.

Methods

Participants

The participants in this investigation were 99 males and 116 females between the ages of 14 and 19 years residing in the greater Hartford, CT region. They were recruited for an examination of personal and familial risk factors for alcohol and drug abuse. Various recruitment strategies were employed, including radio and print advertisements, appeals to parents enrolled in substance abuse treatment, and presentations to organizations serving troubled youth.

The parent of an interested volunteer was invited to telephone a research assistant and arrange an in-person visit to the Health Center. During the visit, informed consent and privacy agreements were reviewed and signed by the parent and his/her biological offspring. Next, the adolescent was taken to a separate room to complete an interview and questionnaires. The interview, the adolescent version of the Semi-Structured Assessment for the Genetics of Alcoholism (Bucholz et al. 1994), reviewed major Axis I and II psychiatric disorders in the DSM-III-R (American Psychiatric Association 1987) diagnostic system. Other assessments relevant to the present hypotheses were the Family History Assessment Module (Rice et al. 1995) completed by the parent as well as the Michigan Alcoholism Screening Test (Selzer 1971), Risk Taking Instrument (Busen 1991), Shipley Institute of Living Scale (Shipley 1940), Wide Range Achievement Test (Reynolds 1986), and a survey of alcohol and drug use during the previous 6 months completed by the adolescent participant. Adolescents were excluded from further participation if they reported a history of head injury with greater than 5 min of lost consciousness, seizures, life-threatening illness, schizophrenia or bipolar disorder, major medical diseases, or uncorrected visual or auditory deficits.

Procedures

ERP Data Collection and Processing

After the interviews and questionnaires were completed, the participant was escorted to an adjacent EEG laboratory where an electrode cap (Electro-Cap International, Eaton, OH) was fitted and applied. The cap located 31 EEG electrodes across the scalp. The technician also applied single electrodes to the bridge of the nose (reference) and the middle of the forehead (ground). A pair of electrodes was placed diagonally above and below the left eye for the detection of eyeblink and eye movement artifacts. Inter-electrode impedances were maintained below 5 kΩ.

Event-related potentials were then recorded during several cognitive tasks. The tasks selected for this analysis were two so-called oddball tasks (Donchin 1981) that delivered 75 dB SPL auditory stimuli through headphones. During each task, 300 stimuli of 200 ms duration each were presented at a rate of once every 2 s. Both tasks included a frequently occurring 500 Hz pure tone and an infrequently occurring 375 Hz tone which was the instructed target for a button press response. The second task, which was modeled after an experiment reported by Knight (Knight 1984), additionally included a prerecorded dog bark edited to match the intensity and duration of the other auditory stimuli. It served as a novel distractor. The relative ratios of frequent and rare stimuli during the tasks were 9:1 (2-stimulus oddball) and 8:1:1 (3-stimulus oddball), respectively.
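The stimulus counts implied by these ratios can be made concrete. The sketch below assumes a simple unconstrained shuffle; the actual randomization rules (for example, any minimum spacing between rare stimuli) are not stated here, and the stimulus labels are illustrative.

```python
import random

def oddball_sequence(counts, rng):
    """Return a shuffled list of stimulus labels with the given counts."""
    sequence = [label for label, n in counts.items() for _ in range(n)]
    rng.shuffle(sequence)
    return sequence

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# 300 stimuli at 9:1 gives 270 frequent tones and 30 targets.
two_stim = oddball_sequence({"frequent_500Hz": 270, "target_375Hz": 30}, rng)

# 300 stimuli at 8:1:1 gives 240 frequent tones, 30 targets, and 30 novel distractors.
three_stim = oddball_sequence(
    {"frequent_500Hz": 240, "target_375Hz": 30, "novel_bark": 30}, rng
)

# At one stimulus every 2 s, each task lasts 600 s (10 min).
task_duration_s = len(two_stim) * 2
print(task_duration_s)
```

Each task therefore offers at most 30 epochs per rare stimulus type.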

During the tasks, a Grass Instrument Company Model 12 System was used to amplify the EEG (gain = 20 K) and eye movement (gain = 2 K) signals and apply a broad analogue bandpass filter (0.01–30 Hz). Along with a marker indicating stimulus and response onsets, the signals were sampled at 200 Hz by an A/D converter for 100 ms preceding and 600 ms following the onset of each rare stimulus. During off-line computations, these epochs were sorted by electrode and stimulus type.

The next step in data processing was to offset the epochs to align during the prestimulus period. Epochs with voltage deviations crossing a ±30 microvolt threshold in the Fz, Cz, Pz, or eye movement channels were removed. This editing step yielded a minimum of 11 and a maximum of 30 artifact-free epochs as well as an average number (n = 22) that did not significantly differ between the control and alcohol-using groups.
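The offset and rejection steps can be sketched as follows. This is a minimal illustration, not the original off-line program: it assumes the threshold is applied to baseline-corrected voltages in every channel, and the data layout (a dictionary of channel name to sample list) and function names are hypothetical.

```python
SAMPLE_RATE_HZ = 200
PRE_MS, POST_MS = 100, 600
N_PRE = PRE_MS * SAMPLE_RATE_HZ // 1000    # 20 prestimulus samples
N_POST = POST_MS * SAMPLE_RATE_HZ // 1000  # 120 poststimulus samples
THRESHOLD_UV = 30.0

def baseline_correct(samples):
    """Subtract the mean prestimulus voltage from every sample."""
    baseline = sum(samples[:N_PRE]) / N_PRE
    return [v - baseline for v in samples]

def is_clean(epoch):
    """True if no channel crosses the +/-30 microvolt threshold."""
    return all(abs(v) <= THRESHOLD_UV for ch in epoch.values() for v in ch)

def retain_clean_epochs(epochs):
    """Baseline-correct every channel, then drop epochs containing artifacts."""
    corrected = [{ch: baseline_correct(x) for ch, x in ep.items()} for ep in epochs]
    return [ep for ep in corrected if is_clean(ep)]

# Toy data: one flat epoch and one with a 50 microvolt deflection in the eye channel.
channels = ("Fz", "Cz", "Pz", "EOG")
flat = {ch: [10.0] * (N_PRE + N_POST) for ch in channels}
blink = {ch: list(v) for ch, v in flat.items()}
blink["EOG"][60] = 60.0
kept = retain_clean_epochs([flat, blink])
print(len(kept))
```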

To further discount the contribution of artifacts, a linear regression algorithm (Semlitsch et al. 1986) implemented in Scan version 4.3 software (Compumedics/Neuroscan, Inc., Charlotte, NC) mathematically removed eye movements and eyeblinks from each retained epoch. The artifact-reduction algorithm was followed by the application of an aggressive bandpass filter [0.1 Hz (12 dB/octave roll-off) to 8 Hz (24 dB/octave roll-off)] and a second voltage offset to realign the epochs to their average pre-stimulus voltage.

The final stage of ERP data reduction involved the calculation of the average voltage and standard deviation of voltage at each data point spanning the rare target and rare distractor epochs. Averages and standard deviations were computed separately for Fz, Cz, and Pz epochs. P300 amplitude was the average voltage computed over an epoch window of 250 to 550 ms. To discount the statistical association of the average voltage of P300 and the ITV of its voltage, P300 ITV was calculated as the residual of the standard deviation averaged over the 250–550 ms window after a linear correction for across-trial P300 average amplitude.
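The two-step calculation can be sketched as follows. The sketch assumes that the linear correction is an ordinary least-squares regression of windowed SD on windowed mean amplitude across participants, and that the 250–550 ms window includes both endpoints; neither detail is stated explicitly, and the function names are illustrative.

```python
import statistics

SAMPLE_RATE_HZ = 200
PRE_MS = 100

def ms_to_index(t_ms):
    """Sample index of a poststimulus latency in an epoch that starts at -100 ms."""
    return (t_ms + PRE_MS) * SAMPLE_RATE_HZ // 1000

WINDOW = slice(ms_to_index(250), ms_to_index(550) + 1)  # samples 70 through 130

def window_mean_and_sd(epochs):
    """Windowed P300 mean amplitude and across-trial SD for one participant.

    epochs: list of equal-length trials (lists of voltages) at one electrode.
    """
    columns = list(zip(*epochs))[WINDOW]  # one tuple of trial voltages per time point
    point_means = [statistics.mean(col) for col in columns]
    point_sds = [statistics.stdev(col) for col in columns]
    return statistics.mean(point_means), statistics.mean(point_sds)

def residualize(sd_values, mean_values):
    """Residuals from a least-squares line predicting SD from mean amplitude."""
    mx, my = statistics.mean(mean_values), statistics.mean(sd_values)
    sxx = sum((x - mx) ** 2 for x in mean_values)
    slope = sum((x - mx) * (y - my) for x, y in zip(mean_values, sd_values)) / sxx
    return [y - (my + slope * (x - mx)) for x, y in zip(mean_values, sd_values)]

# Toy values for four hypothetical participants (microvolts).
amplitudes = [4.0, 6.0, 8.0, 10.0]
sds = [3.0, 5.0, 4.0, 6.0]
itv = residualize(sds, amplitudes)
print([round(r, 2) for r in itv])
```

By construction, the residuals are uncorrelated with mean amplitude, so a group difference in ITV cannot simply restate a difference in average P300 size.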

Task Performance Data Processing

Button press responses during the tasks were monitored and summarized in the typical manner. An off-line program calculated the proportion of trials with correct responses as well as the mean reaction times for each task.

Data Analysis

The assignment of participants to groups differing in alcohol use relied upon a single survey question asking if any alcohol had been consumed during the prior 6 months. The background demographic, cognitive, psychological, and substance use characteristics of these groups were compared with simple ANOVAs for continuous variables and Pearson χ² tests for categorical variables (Table 1).

Table 1 Background demographic, cognitive, psychological, and substance use characteristics of the groups [M(SD) or %(n)]

Analyses of ERP and task performance data added sex, age cohort (< 16 vs. ≥ 16 years), and race to the list of between-subjects factors. To alleviate concerns about multiple comparisons and spurious findings, the first step in the analyses of the 18 ERP variates (Table 2) and 4 task performance variates (Table 3) involved omnibus MANOVAs to screen for overall significance (p < 0.05) of main effects or interactions. Univariate tests of main effects or interactions were performed if and only if the multivariate test for the effect was significant.

Table 2 Covariate-adjusted P300 inter-trial variability and average amplitude by group, electrode site, and task condition [M(SE)]
Table 3 Covariate-adjusted task performance outcomes sorted by group [M(SE)]

Two additional sets of analyses were performed to examine questions about dose–response and associations with known risk factors (Clark and Bukstein 1998) for adolescent alcohol use and misuse (i.e., other substance use, conduct problems, depression, and family history). These analyses used partial correlations to test associations of frontal P300 ITV, selected for its sensitivity to group differences in the prior analysis, with indicators of alcohol, cannabis, and cigarette use, risk taking, conduct problems, parental alcohol dependence, and parental major depressive disorder.
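A partial correlation of this kind can be computed with the standard recursive formula, which removes one covariate at a time. The sketch below is illustrative only: the toy vectors stand in for frontal P300 ITV, a risk indicator, and a single covariate, and the function names are hypothetical rather than taken from the statistical software actually used.

```python
import math
import statistics

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, covariates):
    """Correlation of x and y after removing linear effects of the covariates.

    Controls for the last covariate, with each ingredient correlation already
    controlled for the remaining covariates.
    """
    if not covariates:
        return pearson(x, y)
    *rest, z = covariates
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Toy example: x and y are related only through the shared covariate z.
z = [1, 1, 1, 1, -1, -1, -1, -1]
x = [2, 2, 0, 0, 0, 0, -2, -2]
y = [2, 0, 2, 0, 0, -2, 0, -2]
print(f"zero-order r = {pearson(x, y):.2f}")
print(f"partial r    = {abs(partial_corr(x, y, [z])):.2f}")
```

In this toy case the zero-order correlation disappears once the covariate is controlled, which is the pattern the partial correlations are designed to expose.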

The final set of analyses was designed to move beyond the simple comparison of groups on P300 ITV and perform a more rigorous analysis testing it as a variable that can classify individual cases into their groups. These analyses of frontal P300 ITVs in the potentials elicited by target stimuli during the 2-stimulus oddball task and the novel distractor and target stimuli during the 3-stimulus oddball task began with a logistic regression that included age group as a covariate. Predicted probabilities were retained in the output from the regression analysis and subsequently entered into a receiver-operating-characteristic (ROC) analysis. Its area-under-the-curve (AUC) statistic was the basis for tests of classification accuracy.
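The classification pipeline can be sketched in two steps: fit a logistic regression, then compute the AUC from its predicted probabilities. The sketch below uses plain gradient ascent and a pairwise AUC calculation in place of whatever statistical package was actually used; the toy data, learning rate, and function names are all assumptions made for illustration.

```python
import math

def predict(weights, row):
    """Predicted probability of group membership from a logistic model."""
    z = weights[0] + sum(w * v for w, v in zip(weights[1:], row))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, lr=0.1, n_iter=5000):
    """Gradient-ascent logistic regression; weights[0] is the intercept."""
    weights = [0.0] * (len(rows[0]) + 1)
    n = len(rows)
    for _ in range(n_iter):
        grad = [0.0] * len(weights)
        for row, y in zip(rows, labels):
            err = y - predict(weights, row)
            grad[0] += err
            for j, v in enumerate(row, start=1):
                grad[j] += err * v
        weights = [w + lr * g / n for w, g in zip(weights, grad)]
    return weights

def auc(scores, labels):
    """Probability that a random positive case outscores a random negative case."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Toy data: [frontal P300 ITV residual, age group (0 = younger, 1 = older)].
rows = [[-1.2, 0], [-0.5, 0], [0.3, 0], [-0.8, 1], [0.1, 1],
        [0.9, 1], [1.4, 1], [-0.1, 0], [0.6, 0], [1.1, 1]]
labels = [0, 0, 1, 0, 1, 1, 1, 0, 0, 1]  # 1 = recent alcohol use

weights = fit_logistic(rows, labels)
probabilities = [predict(weights, row) for row in rows]
print(f"AUC = {auc(probabilities, labels):.3f}")
```

An AUC of 0.5 indicates chance-level classification and 1.0 indicates perfect separation of the groups.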

Results

Background and ERP Data (Tables 1–2)

On average, the participants in this project were 16.5 years of age and completed an average of 10.1 years of formal education. Fifty-four percent were female. Sixty-one percent were white (Table 1).

The univariate analyses of the background characteristics of groups distinguished by the presence versus absence of recent alcohol use revealed a few significant differences. Participants with a history of recent alcohol use were 0.8 years older [M(SD): 16.8(1.5) vs. 16.0(1.4); F(1,213) = 14.4, p = 0.001] and completed 0.6 more years of education [10.4(1.5) vs. 9.8(1.5); F(1,213) = 6.4, p = 0.01] than participants reporting no use. The former group also reported an elevated number of alcohol problems [3.5(6.0) vs 1.2(2.6); F(1,213) = 10.3, p = 0.002]. However, only 11.3% (n = 14) of the members of this group reported a problematic level of use, i.e., a MAST score ≥ 5.

There were relatively more recent users of alcohol (69.5%) than non-users (49.4%) who were white (χ² = 8.8, p = 0.003). Also, the prevalence of cannabis (61.7% vs. 17.2%; χ² = 41.64, p = 0.001) and regular cigarette (40.9% vs. 12.6%; χ² = 19.91, p = 0.001) use was greater among recent alcohol users. In addition, they reported more conduct disorder symptoms [2.6(2.4) vs. 1.6(2.6); F(1,213) = 10.6, p = 0.001] and higher scores on the Risk Taking Instrument [18.9(10.2) vs. 12.1(13.1); F(1,213) = 17.1, p = 0.001].

The multivariate analyses of the ERP data revealed a significant effect of the alcohol use history factor [Pillai’s Trace = 0.14, F(18,182) = 1.70, p = 0.04]. Subsequent univariate tests of this factor detected greater ITV in P300 amplitude for most of the comparisons (Table 2) but not for comparisons on P300 mean amplitude (Figs. 1 and 2). The ERPs elicited by target stimuli showed greater P300 ITV at the Fz electrode [F(1,213) = 9.3, p = 0.003] during the 2-stimulus oddball task and at Fz [F(1,213) = 9.9, p = 0.002] and Cz [F(1,213) = 4.0, p = 0.047] electrodes during the 3-stimulus task. Analyses of the ERP elicited by the novel distractor during the 3-stimulus oddball task revealed significantly greater P300 ITV among recent users at all electrode sites [Fz: F(1,213) = 8.7, p = 0.003; Cz: F(1,213) = 4.8, p = 0.028; Pz: F(1,213) = 9.3, p = 0.003].

Fig. 1 Event-related potential waveforms spanning −100 to +600 ms from the onsets of rare target and novel distractor stimuli by electrode site. P300 is the prominent upward deflection between 250 and 550 ms

Fig. 2 P300 inter-trial variability (top) and average amplitude (bottom) plotted as a function of group. * p < 0.05

Task Performance (Table 3)

Participants performed the tasks at a generally high level of accuracy. Yet, the variance tested by the MANOVA was sufficient to reveal a significant overall statistical effect of alcohol use on response accuracy [Pillai’s Trace = 0.09, F(4,196) = 3.40, p < 0.01]. In comparison to participants reporting no use, participants reporting recent use detected fewer target stimuli during the 2-stimulus [M(SE): 0.85(0.02) vs. 0.93(0.03); F(1,213) = 5.7, p < 0.01] and 3-stimulus [M(SE): 0.85(0.02) vs. 0.92(0.02); F(1,213) = 4.2, p < 0.04] oddball tasks. There were no significant group differences in reaction time.

Correlational Analyses (Table 4)

Table 4 Partial correlations of two family history indicators with P300 inter-trial variability by task condition

Because group differences in P300 ITV were most apparent at frontal electrode sites in the primary analysis, frontal P300 ITV data were the exclusive focus of the secondary analyses. These analyses tested the associations of Fz ITV data with two factors that have a recognized contribution to adolescent drinking and drug use: parental alcohol dependence and parental major depressive disorder. The correlations were computed across the full sample of 215 adolescents. The information presented in Table 4 shows modest but statistically significant correlations of Fz ITVs with a parental history of depression but not with a parental history of alcohol dependence. The correlations controlled for age, race, and sex.

Additional correlations were computed using all participants between P300 ITV and background variables that differentiated the primary study groups: any use of cannabis during the previous 6 months, any cigarette use for greater than 1 month, the risk taking score, and conduct disorder symptoms. None of the correlations were statistically significant at an alpha level of 0.05. However, a few of the correlations approached significance. For example, cannabis use was weakly associated with target stimulus P300 ITV during the 2-stimulus (rp = 0.132, p = 0.064) and 3-stimulus oddball (rp = 0.139, p = 0.051) tasks. Also, there were weak associations of target stimulus P300 ITV during the 2-stimulus oddball task with cigarette use (rp = 0.128, p = 0.071) and the risk taking score (rp = 0.127, p = 0.075). No other correlations had corresponding p-values less than 0.1.

Another analysis evaluated the statistical impact of the frequency of drinking over the preceding six months on P300 ITV at the Fz electrode site. It omitted the 87 participants who reported no drinking and divided the drinkers into subgroups who reported alcohol use 1–4 times, 3–4 times, once per month, 2–3 times per month, and at least weekly. The ANCOVA with age, race, and sex as covariates revealed no significant association.

ROC analyses

Tests of the ability of P300 ITV to classify individual participants into groups defined by their recent use of alcohol were statistically significant. For P300 ITVs measured in response to target stimuli presented during the 2-stimulus and 3-stimulus oddball tasks, the respective AUC statistics were AUC = 0.634 (SE = 0.039, p < 0.001) and AUC = 0.651 (SE = 0.038, p < 0.001). The classification accuracy of P300 ITV measured in the response to the novel distractor during the 3-stimulus oddball task was less impressive but statistically significant: AUC = 0.589, SE = 0.040, p = 0.028.

Discussion

This article began with a brief review of studies of both adults and adolescents that associate light-to-moderate alcohol use with differences in brain structure and function. It noted that there is contradictory evidence suggesting either detrimental or beneficial effects. It asserted that explaining and resolving these inconsistencies will not be simple.

The findings of the present investigation highlight the complexity of the problem. They demonstrate that adolescents assigned to a group on the basis of a self-report of any alcohol use during the prior 6 months also reported more cannabis and cigarette use during this period. Adding another layer of complexity is the aforementioned evidence of minor decrements in neural and cognitive function associated with genetic (Prom-Wormley et al. 2017), familial (Cservenka 2016), personality (Winters et al. 2021), and psychological risk factors that predate and promote early-onset use and misuse. In support of this statement, the present study found that recent alcohol users acknowledged more risk-taking and conduct disorder behaviors and a marginal elevation (p = 0.09) in symptoms of depression (Table 1).

One of the most significant contributions of the present study to the literature is its analysis of P300 inter-trial variability. Across two separate oddball tasks, the analysis demonstrated greater P300 amplitude variability over time but an equivalent P300 average amplitude among alcohol-using youth in comparison to the control group. The finding suggests that ITV provided a level of sensitivity not apparent in conventional analyses focused on the across-trial average.

Our demonstration of the superior sensitivity of P300 ITV versus P300 average amplitude is buttressed by other findings from our laboratory. We (Bauer 2020) have recently reported that teenagers with conduct disorder or borderline personality disorder features, which are risk factors for early onset substance use, exhibit greater levels of P300 ITV than their unaffected peers. We have similarly reported that adults with high levels of impulsivity attendant to drug abuse (Bauer 2021), an obese body mass (Bauer 2018a), or HIV-1 infection (Bauer 2018a) demonstrate greater inter-trial variability in P300 and other event-related potentials.

A final notable contribution of the present study was its demonstration of a correlation between a parental history of depression and elevated P300 ITV at the frontal electrode site. This association is interesting for several reasons. First, there is a small literature demonstrating associations between a personal history of depression and an impaired ability to sustain attention indicated by increased reaction time variability (Kaiser et al. 2008) and a distorted perception of time passage (Thönes and Oberfeld 2015). The present P300 ITV finding suggests that the association may generalize to a family history of depression or, alternatively, reflect a synergism between personal and family histories of depression. Second, there is a large literature implicating the frontal brain in both the orienting response to rare stimuli and the maintenance of attention (Zamani et al. 2021). It is therefore logical that group differences in P300 ITV are more prominent at the frontal site and particularly in response to a stimulus, the novel distractor, known to evoke the frontally generated P300a subcomponent (Knight 1984). Third, the correlation of P300 amplitude ITV with familial depression but not with familial alcoholism raises the possibility that P300 amplitude variability is associated with a different category of familial risk factors than is P300 average amplitude. We have previously shown that depression and familial alcoholism affect frontal brain function differently by enhancing activity in different EEG frequency bands (Bauer and Hesselbrock 2002). Our group and other groups have repeatedly linked familial and personal histories of alcohol dependence and externalizing disorders (Polich et al. 1994; Bauer 1997; Costa et al. 2000; Patrick et al. 2006; Yoon et al. 2006) to a reduction in P300 average amplitude.

Limitations

This article cannot conclude without discussing a few limitations. Prominent on the list of concerns and limitations is the possibility that P300 amplitude was imprecisely measured on individual trials. P300 ITV may include subtle artifacts such as minor eye movements accompanying eyeblinks. We attempted to minimize these artifacts by aggressively filtering the data to separate P300 frequencies from other frequencies. We also employed both artifact rejection and artifact correction algorithms.

The second limitation pertains to the presentation of tasks that were easily performed. One could argue that a reduction in mean P300 amplitude would have emerged among recent alcohol users if the tasks were more challenging. However, the failure to detect a group difference in the mean amplitude level should not overshadow the demonstrated sensitivity of P300 ITV. It is also noteworthy that robust demonstrations of P300 amplitude reductions among people with familial or personal risk factors for alcohol misuse are largely confined to cognitively challenging tasks that present visual stimuli (Polich et al. 1994; Patrick et al. 2006). Reductions in mean P300 amplitude are rarely detected among the members of these groups when they perform tasks that present auditory stimuli.

The third limitation is our definition of alcohol exposure. Indeed, it should be acknowledged that many investigations, including the present investigation, use arbitrary operational definitions of alcohol consumption that may affect the results. Missing from the neuroimaging and electrophysiology literatures is a comprehensive assessment of the relative predictive values of different alcohol quantity and frequency indices tabulated over current, past, peak, and lifetime drinking periods as well as the presence/absence of binges and withdrawal episodes.

The final noteworthy limitation is the age of the data set. It remains possible that these adolescents, who were recruited and studied between 1994 and 1999, may not be representative of adolescents in the present era because substance use patterns, risk-taking behavior, and mental and physical health backgrounds may have changed. An obvious factor potentially contributing to this change is the recent COVID-19 pandemic.

Conclusions

The present investigation was designed to demonstrate altered brain activation among adolescents who reported any level of alcohol use during the preceding 6 months. It succeeded in this goal. However, it did not and cannot unquestionably attribute the group difference to alcohol use because our sample was complicated by coincident use of other substances as well as preceding factors such as conduct problems, risk-taking, and a marginal elevation in depression symptoms. In fact, the indicator of altered brain activation that differed across groups was significantly correlated with the presence of a preceding and potentially predisposing factor: a family history of depression. One could argue that the next goal should be to dissect out the individual contributions of each of these variables. However, meeting the goal comes with the risk of misrepresenting the complex reality. It is also a non-trivial statistical challenge to ensure that there is adequate variability in each covariate to effectively remove its contribution. A larger study would be helpful, but an increased sample size does not guarantee a valid finding.

A separate conclusion pertains to the most innovative aspect of the present study—its examination of the inter-trial variability of P300 amplitude. In the present analysis, it was more sensitive than the across-trial average of P300 amplitude to group differences. Indeed, the relatively greater sensitivity of P300 ITV than P300 average amplitude is consistent with findings in the reaction time ITV literature, wherein reaction time variability has been shown to outperform average reaction time. Reaction time variability has also outperformed average reaction time in predicting several clinically important outcomes including all-cause mortality (Batterham et al. 2014) and frailty and fall risk (Graveson et al. 2016) among others. Future studies should test the sensitivity of P300 ITV as a prospective predictor of these and other clinically significant outcomes, including progression to substance dependence as well as relapse.