Development of central auditory processes in Polish children and adolescents at the age from 7 to 16 years

There are discrepancies in the literature regarding the course of central auditory processes (CAP) maturation in typically developing children and adolescents. The purpose of the study was to provide an overview of age-related improvement in CAP in Polish primary and secondary school students aged 7–16 years. A total of 180 children and adolescents, subdivided into 9 age categories, and 20 adults (aged 18–24 years) performed the Dichotic Digit Test (DDT), Duration Pattern Test (DPT), Frequency Pattern Test (FPT), Gap Detection Test (GDT) and adaptive Speech-in-Noise (aSpN) test. The 12-year-olds were retested after a week. We found age effects only for the DDT, DPT and FPT. In the right ear DDT the 7-year-olds performed more poorly than all groups ≥12. In the left ear DDT both the 7- and 8-year-olds gave fewer correct responses than the 13-, 14- and 15-year-olds and the adults. The right ear advantage was greater in the 7-year-olds than in the 15-year-olds and the adult group. At the age of 7 the DPT and FPT scores were lower than in all participants ≥13, whereas the 8-year-olds obtained fewer correct responses in the FPT than all age categories ≥12. Almost all groups (except for the 7-year-olds) performed better in the DPT than in the FPT. The test-retest reliability for all tests was satisfactory. The study demonstrated that different CAP have their own patterns of improvement with age and that some of them are specific to the Polish population. The psychoacoustic battery may be useful in screening for CAP disorders in Poland.

Although the course of age-related changes in CAP reflects the maturation of the central auditory nervous system (CANS), which appears to be independent of the linguistic and cultural background of the person whose auditory performance is being examined, the dynamics of improvement observed in psychoacoustic tests during childhood varies from one country to another (e.g. Kelly, 2007; Mattsson et al., 2018; McDermott et al., 2016; Romero-Díaz, Peñaloza-López, García-Pedroza, Pérez, & Castro Camacho, 2011; Stollman et al., 2004). This may result both from the great diversity of procedures and stimuli used to examine a particular process (many authors developed their own tasks and batteries, e.g. Cameron et al., 2006; Cameron & Dillon, 2008; Fuente & McPherson, 2006; Mukari, Keith, Tharpe, & Johnson, 2006) and from the different home languages of the children referred for evaluation (Bao et al., 2013; Dawes & Bishop, 2007; Marriage, King, Briggs, & Lutman, 2001; Woods, Peña, & Martin, 2004). Sometimes even an accent (Dawes & Bishop, 2007; Loo, Bamiou, & Rosen, 2013) or learning a second language (Weiss & Dempsey, 2008) may affect the outcomes. Therefore, to ensure the clinical utility of psychoacoustic tests, the baseline scores should be determined for the same population from which the child at risk of a central auditory processing disorder (CAPD) comes.
The aim of the present study is twofold: 1) to determine the course of age-related changes in behavioral CAP test performance in Polish children and adolescents aged from 7 to 16 years, and 2) to provide the reference values and evaluate the clinical validity of the proposed auditory processing battery. To our knowledge, few papers have focused on the development of more than two processes (Mattsson et al., 2018; McDermott et al., 2016; Moore et al., 2011; Neijenhuis et al., 2002) and the maturation effect has rarely been investigated in children older than 11-12 years (Krizman et al., 2015; Moav et al., 2009).
The current work complements, to some extent, our previous paper (Włodarczyk, Szkiełkowska, Skarżyński, Miaśkiewicz, & Skarżyński, 2019) presenting the reference values for selected psychoacoustic tests (those that require recognition of duration and frequency patterns, dichotic listening or understanding of time-compressed speech) for children aged 7-10 years. CAP development was not the main object of interest in that paper; however, the comparisons of psychoacoustic test scores revealed some significant age effects. Specifically, the 7-year-olds performed more poorly than the 9- and 10-year-olds in the duration pattern test, whereas both the 7- and 8-year-olds were worse than the older age groups in the dichotic listening and time-compressed speech tasks. In the latter two tests the 7- and 8-year-old groups did not differ significantly from each other, nor did the 9- and 10-year-old groups. The age effect for the frequency patterns was not determined due to a high variability of the outcomes.
Since it has been repeatedly suggested that CANS maturation is not finished until the age of 12, in the present study we intended to investigate how the auditory performance of typically developing Polish children changes up to the age of 16 (Krizman et al., 2015; Ludwig et al., 2014). The psychoacoustic battery used by Włodarczyk et al. (2019) was supplemented with a test measuring temporal resolution, which should be included in screening for CAPD (DeBonis, 2015; Musiek et al., 2010) due to its sensitivity to brainstem lesions. Furthermore, the compressed speech test used in the previous study was replaced with a speech-in-noise test, because the latter has been especially recommended as a part of the CAP assessment, right after the child's history, medical examination and pure tone audiometry (Association (BSA), 2018).
We hypothesized that the performance on the dichotic listening and auditory pattern tests would increase gradually with age up to 16 years, for two reasons: 1) both types of tasks involve CANS structures (the auditory cortex and the corpus callosum) which mature into early adulthood, and 2) the outcomes of these CAP tests are strongly affected by non-auditory cognitive functions such as attention or working memory (both types of tasks require keeping sequences of acoustic elements in mind for a while before they are reported), which have their own developmental course until the age of 16 (Cowan, 2016; Gomes, Molholm, Christodoulou, Ritter, & Cowan, 2000; Karns, Isbell, Giuliano, & Neville, 2015).
Naturally, the development of dichotic listening and auditory pattern recognition in children has already been demonstrated in many countries (Dekerle & Meunier, 2018; Kelly, 2007; Mattsson et al., 2018; McDermott et al., 2016; Neijenhuis et al., 2002; Schochat & Musiek, 2006; Stollman et al., 2004). However, the home language of the children and adolescents who participated in our study is different from the languages examined in all of those studies, which leads us to believe that the dynamics of age-related changes in these tests and the moments when the scores reach the adult-like level are also unique to the Polish population. This prediction refers to the specificity of the material used to investigate dichotic listening. For example, in contrast to the English version of the test, where all digits are mono-syllabic, in Polish more than half of the digits from 0 to 9 are bi-syllabic, which requires the listener to report longer sequences with a higher cognitive effort. Furthermore, both dichotic digit pairs and nonverbal auditory patterns have to be reported orally. In the dichotic listening task the way of responding is probably not as important as in the auditory pattern tests, where the listener is asked to assign labels to each type of sequence (e.g. "long-short-long" or "high-low-high"). Young children especially may find it difficult to learn to name the tones as "high" or "low" (according to our clinical experience the "thin" or "thick" labels are much more natural and easier to remember). Therefore, we hypothesized that both the specificity of the verbal stimuli and the differences in the way of formulating responses in these tests would significantly affect the course of age-related changes in dichotic listening and auditory pattern recognition test performance across the age range considered.
In the dichotic listening task the right ear advantage (REA), which represents the left hemispheric dominance for language (Hugdahl, Andersson, Asbjørnsen, & Dalen, 1990; Kimura, 1961a, 1961b, 1967), is most likely to be observed, especially in younger listeners (Kimura, 1961; Mattsson et al., 2018; Rosenberg, 2011). Inconsistent results are reported in the literature concerning how the REA changes with age (Mattsson et al., 2018; Moncrieff, 2011; Mukari et al., 2006; Rosenberg, 2011; Westerhausen et al., 2011). Westerhausen et al. (2011) found parallel differences in the REA for the dichotic syllables task and in the thickness of the isthmus (a posterior part of the corpus callosum) between 6- and 8-year-olds. Moncrieff (2011) documented the presence of the REA in about 60% of children aged 5-7 years, 75% of 8- to 10-year-olds and 70% of 10- to 12-year-olds. Mukari et al. (2006) showed a significant increase of the REA (for the dichotic digit test in the Malay language) only from the age of 6 to 7 (3% to 7%), whereas in all older groups (up to 11 years) these values were comparable. A decrease of the REA from about 12-13% in children aged 6-7 years to about 6% in 11-12-year-olds was also found (Mattsson et al., 2018; Rosenberg, 2011). These discrepancies in the course of age-related changes in the REA encouraged us to investigate this effect in the Polish population.
Previous findings by Włodarczyk et al. (2019) also made us wonder whether Polish children indeed recognize sequences of sounds that differ in duration better than frequency patterns (which was neither reported nor discussed in that paper). This observation is not exactly congruent with other studies, in which children either performed better on the frequency than on the duration patterns (McDermott et al., 2016) or the results of both tests were comparable (Mattsson et al., 2018; Neijenhuis et al., 2002; Romero-Díaz et al., 2011; Stollman et al., 2004). Musiek (1994) suggested that the frequency and duration pattern tests measure different abilities. The DPT uses only one frequency, which makes it relatively unaffected by cochlear damage; instead, it is thought to be more sensitive to CANS lesions than the FPT (Musiek, Baran, & Pinheiro, 1990; Scharlock, Neff, & Strominger, 1965). Although these tests are very similar, it is reasonable to expect that they would follow different courses of improvement with age in children.
We hypothesized that for Polish children the frequency patterns would be more difficult to recognize than those formed by tones of different durations, possibly due to all the following factors combined: 1) specific intonation patterns in the Polish language: we do not have the "high-low-high" or "low-high-low" nuclear tones, which may result in a greater difficulty in recognizing the frequency compared with the duration patterns; 2) inadequate music education in Polish schools; and 3) a tendency, observed especially in younger children, to use labels other than "high" and "low" for the tones while reporting the sound sequences.
Existing evidence on age-related changes in gap detection performance in children and adolescents is inconsistent. The vast majority of studies have revealed that the adult-like level in these tests is reached by the age of 7 years (do Amaral et al., 2013; Ismaail, Shalaby, & Ibraheem, 2019; Kelly, 2007; Mattsson et al., 2018; McDermott et al., 2016; Shinn, Chermak, & Musiek, 2009) independently of the procedure applied (Chermak & Lee, 2005). However, an improvement in detecting gaps in noise up to 10-11 years (Buss, Porter, Hall, & Grose, 2017; Irwin et al., 1985), or in the 7-10 age range with a greater variability of the results in younger than in older children, has also been demonstrated (Lister et al., 2011). In the present study we predicted that our version of the gap detection test would produce relatively stable scores over the age range considered. However, we did not completely rule out the possibility that there would be some improvement among children older than 7 years: first, because the ability to detect gaps in noise relies on auditory brainstem function which, according to some findings, changes even until early adulthood (Skoe et al., 2015) and, second, because impaired performance on this test has been linked to dysfunction of the auditory cortex (Efron et al., 1985), which may not be completely mature even in adolescents (Paus et al., 1999).
We also expected that there would be no considerable age-related improvement between 7 and 16 years in the speech-in-noise intelligibility test. Our prediction was based on previous studies demonstrating stable results in this test from the age of 7 (Keith, 2000; Mattsson et al., 2018; McDermott et al., 2016; Neijenhuis et al., 2002; Stollman et al., 2004; Wilson, Farmer, Gandhi, Shelburne, & Weaver, 2010). As was the case with the auditory gap detection, we developed our own version of the speech-in-noise test, with Polish mono-syllabic words presented against a multitalker babble. Since unique verbal material was used here, reference values for particular age groups of children and adolescents are needed.
It is also thought that poor temporal resolution contributes to reduced speech understanding in noise (Stuart et al., 2006). When speech is presented against background sounds, the rapid temporal changes of the background mix with those inherent in the speech signal. As a result, temporal cues in speech may be distorted (Cooke, 2006; Lutman, 1991). Therefore, we expected that in our study there would be either a parallel improvement in the speech-in-noise and gap detection tests, or stable values in both of them across the age range considered.
To verify the above hypotheses, children in the 7-16 age range were divided into 9 age categories and their outcomes in particular CAP tests were compared with each other and also with the results achieved by young adults (to determine when each auditory process reaches the adult level). The reference values for the Polish battery were provided, as well as the reliability measures of the psychoacoustic tests in the group of 12-year-olds.

Method
Subjects
A group of 180 normally developing children and adolescents (92 girls and 88 boys, mean age = 11.43 ± 2.62 years, age range: 7-15.9 years), recruited from 3 primary and 2 secondary schools in Toruń (< 200,000 inhabitants), Warsaw (< 2,000,000) and the rural areas in the vicinity of Warsaw, as well as 20 young adults (11 women and 9 men, mean age = 20.5 ± 1.8 years, age range: 18-24 years) who responded to an advertisement in the local press, participated in the study. All participants were native speakers of Polish.
Children were classified into nine age categories (see: Table 1). All subjects had normal hearing in both ears, with pure-tone thresholds ≤15 dB HL at the octave frequencies from 250 to 8000 Hz and type A tympanograms. They were all right-handed (Oldfield, 1971) and had intellectual abilities within the normal range. Other information was provided by parents/caregivers and teachers of the children and adolescents as well as by the adults themselves. None of the subjects had a history of neuropsychiatric disorders or head trauma or took any medications affecting the central nervous system. The children and adolescents varied in terms of socioeconomic status, attended school regularly, were in good health, and did not have any recognized developmental disorder (e.g. dyslexia, SLI, ADD/ADHD, ASD) or a risk of one. Participants who had received formal music education, which might have affected the CAP test performance, were excluded from the study.

Ethics Statement
Prior to testing, the parents/caregivers of the children/adolescents and the adult subjects provided written informed consent to participate in the project. The study was approved by the ethics committee of the Institute of Physiology and Pathology of Hearing, Warsaw/Kajetany, Poland, and is in accordance with the Declaration of Helsinki for research on humans.
The authors were not provided with the data enabling the identification of the subjects. The study was conducted by 3 persons, trained in CAP evaluation with the use of the psychoacoustic tests in children, and under appropriate acoustic conditions.

Procedures
Children and adolescents were tested individually in quiet rooms in their school buildings, whereas the adults were tested in the laboratories of the Institute of Physiology and Pathology of Hearing, Warsaw/Kajetany, Poland. Three experimenters, trained in the area of CAP assessment, collected the data and were instructed to administer and score the tests in the same manner. The CAP tests applied in this study were developed as a result of scientific cooperation between the World Hearing Center, Institute of Physiology and Pathology of Hearing (Poland) and the Department of Communication Disorders, Brigham Young University (USA). The tests were administered using an HP ProBook 4510s notebook computer running Microsoft Windows XP Professional. Auditory stimuli were presented via headphones (Sennheiser HDA 200) using a Creative SB1-100 sound card. Each of the tests was preceded by a training session to familiarize subjects with the procedures.
A set of tests referring to the following aspects of audition were administered: dichotic listening (Dichotic Digit Test, DDT), temporal processing (Duration Pattern Test, DPT, Frequency Pattern Test, FPT and Gap Detection Test, GDT) as well as performance with degraded speech (adaptive Speech-in-Noise, aSpN). The pure tone audiometry and CAP evaluation were performed during a single ca. 1-h session (with short breaks). The order of CAP tests was counterbalanced across subjects.
In order to provide some measures of the test-retest reliability of our CAP battery, the group of 12-year-olds (n = 20) was tested twice, about a week apart (mean = 7.25 ± 1.86 days).

Dichotic Listening
During the DDT (Musiek, 1983) children were presented with a sequence of two different digits (from 1 through 10) in the left ear at the same time as they were given a sequence of two different digits in the right ear, and they were asked to repeat the digits from both ears. There were 40 digit pairs (20 pairs per ear). The stimuli were presented at 60 dB HL. The percentages of correctly reported digits, separately for the left and the right ear, as well as the difference in performance between the right and the left ear in the DDT (the right ear advantage, REA), were calculated.
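The DDT scoring described above can be sketched as follows. This is a minimal illustration, not the scoring software used in the study: the function name `ddt_scores` is hypothetical, the default of 40 scored digits per ear is an assumption derived from "20 pairs per ear", and the REA is taken as the simple right-minus-left difference in percentage points.

```python
def ddt_scores(right_correct, left_correct, digits_per_ear=40):
    """Return percent-correct per ear and the right ear advantage (REA).

    digits_per_ear=40 is an assumption (20 two-digit sequences per ear);
    adjust it if the test is scored differently.
    """
    for count in (right_correct, left_correct):
        if not 0 <= count <= digits_per_ear:
            raise ValueError("correct counts must lie between 0 and digits_per_ear")
    right_pct = 100.0 * right_correct / digits_per_ear
    left_pct = 100.0 * left_correct / digits_per_ear
    # REA: right-ear score minus left-ear score, in percentage points
    return {"right": right_pct, "left": left_pct, "rea": right_pct - left_pct}


if __name__ == "__main__":
    # A listener who repeats 30 right-ear and 24 left-ear digits correctly
    print(ddt_scores(30, 24))
```

A positive `rea` value indicates better right-ear performance; a value near zero indicates symmetric performance.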

Temporal Processing
The FPT (Pinheiro & Ptacek, 1971) consisted of 40 triplets of 180-ms sine wave tones (rise/decay time of 10 ms) of either low (880 Hz) or high (1122 Hz) frequency presented binaurally at 60 dB HL. Each triplet consisted of one, two or three tones of low or high frequency with an inter-tone interval (ITI) of 200 ms. The task was to orally report the order of the tones, e.g. "high-low-high". The sequences within a test were presented in pseudo-random order. The percentage of correct responses was analyzed.
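For illustration, an FPT-like tone triplet with the parameters given above could be synthesized as in the sketch below. Several details are assumptions not stated in the text: the 44.1 kHz sampling rate, the linear shape of the 10-ms rise/decay ramps, and the reading of the ITI as the silence between the offset of one tone and the onset of the next. The names `tone` and `triplet` are hypothetical.

```python
import math

SAMPLE_RATE = 44100  # Hz; assumed, not reported in the text
FREQ_HZ = {"L": 880.0, "H": 1122.0}  # low and high tones of the FPT


def tone(freq_hz, dur_ms=180, ramp_ms=10, rate=SAMPLE_RATE):
    """Sine tone with linear onset/offset ramps (ramp shape is an assumption)."""
    n = rate * dur_ms // 1000
    n_ramp = rate * ramp_ms // 1000
    samples = []
    for i in range(n):
        # Envelope rises from 0 to 1 over the ramp and falls symmetrically
        env = min(1.0, i / n_ramp, (n - 1 - i) / n_ramp)
        samples.append(env * math.sin(2 * math.pi * freq_hz * i / rate))
    return samples


def triplet(pattern, iti_ms=200, rate=SAMPLE_RATE):
    """Build a sequence such as 'HLH' with silent gaps between the tones."""
    silence = [0.0] * (rate * iti_ms // 1000)
    out = []
    for k, label in enumerate(pattern):
        if k > 0:
            out.extend(silence)
        out.extend(tone(FREQ_HZ[label], rate=rate))
    return out


if __name__ == "__main__":
    samples = triplet("HLH")
    print(len(samples) / SAMPLE_RATE, "seconds")
```

Calibration to 60 dB HL depends on the headphones and sound card and is outside the scope of this sketch.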
The DPT (Musiek et al., 1990) used 40 binaural 3-element sequences of 1000-Hz sine wave tones (rise/decay time of 10 ms) differing in duration and presented with an ITI of 300 ms. The tones were either short (250 ms) or long (500 ms) and were presented at 60 dB HL. Subjects were asked to repeat the order of the tones within a sequence (e.g. "short-long-long"). The percentage of correct responses was again analyzed.

Table 1. Participants by age category and sex.
Age (years;months)    Females    Males    Total
7-7;11                11         9        20
8-8;11                10         10       20
9-9;11                10         10       20
10-10;11              9          11       20
11-11;11              10         10       20
12-12;11              10         10       20
13-13;11              12         8        20
14-14;11              10         10       20
15-15;11              10         10       20
Total 7-15;11         92         88       180
18-23;11              11         9        20
Total                 103        97       200

The GDT measured the shortest silent gap embedded in white noise that the subject could perceive and report. The stimulus was 500-ms white noise presented to both ears at 50 dB HL. An adaptive procedure was applied to search for the gap length at which the noise with a gap was detected 50% of the time (Leek, 2001). The task was to press a button in response to a gap embedded in the noise. The minimal gap duration was determined in a two-stage procedure. In the first stage, stimuli with varying gap durations were presented. The initial gap duration was 10 ms and it decreased or increased by half of its length depending on the correctness of the subject's response. This part of the test was continued until the subject failed three times to detect a gap of the same duration. This gap duration was then applied in the test proper and adjusted according to the individual subject's performance, i.e. it increased by 2 ms following a false alarm (a button press in the absence of a noise with a gap) or a miss (no reaction to a gap stimulus) and decreased by 2 ms after a hit (a correct gap detection). The test was terminated after 7 reversals. A reversal was defined as a hit followed by a miss (or a false alarm), or a miss (or false alarm) followed by a hit. The average of the 5 most difficult reversals gave the minimum gap duration.
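The second stage of the GDT (the test proper) can be sketched as a simple 2-ms up/down staircase. The code below is a simplified illustration under stated assumptions: it models only gap trials (no catch trials, so false alarms are not simulated), records the gap duration on the trial at which the response direction changes, and interprets "the 5 most difficult reversals" as the five shortest reversal gaps. The function name, the `detects` callback and the `max_trials` guard are hypothetical.

```python
def staircase_threshold(start_gap_ms, detects, step_ms=2.0,
                        n_reversals=7, n_hardest=5, max_trials=500):
    """Estimate a gap threshold with a 1-up/1-down staircase.

    detects(gap_ms) -> bool stands in for the listener's response.
    """
    gap = start_gap_ms
    previous_hit = None
    reversal_gaps = []
    for _ in range(max_trials):
        hit = detects(gap)
        # A reversal: this response differs from the previous one
        if previous_hit is not None and hit != previous_hit:
            reversal_gaps.append(gap)
            if len(reversal_gaps) == n_reversals:
                break
        previous_hit = hit
        # Hit -> shorter (harder) gap; miss -> longer (easier) gap.
        # The floor of one step is an assumption to keep the gap positive.
        gap = max(step_ms, gap - step_ms) if hit else gap + step_ms
    if len(reversal_gaps) < n_reversals:
        raise RuntimeError("staircase did not reach the required reversals")
    hardest = sorted(reversal_gaps)[:n_hardest]  # shortest gaps = most difficult
    return sum(hardest) / len(hardest)


if __name__ == "__main__":
    # Idealized listener who detects every gap of at least 5 ms
    print(staircase_threshold(10.0, lambda gap_ms: gap_ms >= 5.0))
```

With an idealized listener the track oscillates between the two step values bracketing the true threshold; real listeners respond probabilistically, which is why several reversals are averaged.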

Speech-in-Noise
In the aSpN single-syllable Polish words (Harris, Nielson, McPherson, Skarzynski, & Eggett, 2004) were successively presented against the background of 16-talker babble speech. The task was to repeat the words. The words were delivered to both ears using different signal-to-noise ratios (SNRs). The initial (maximum) SNR was 9 dB and the minimum was −15 dB. Negative SNRs indicate that the background noise is louder than the target word and positive values correspond to a situation where a target word is louder than the background noise. An adaptive procedure was applied in which, initially, the SNR decreased by 4 dB after each correct response. From the moment a subject did not respond correctly for the first time, the SNR increased by 2 dB following each incorrect word and decreased by 2 dB following each correctly repeated word. aSpN measures the minimum SNR required to correctly recognize words 50% of the time. The calculations were performed in line with the Wilson and McArdle approach (Wilson & Burks, 2005) and were based on the 5 most difficult reversals, i.e. correctly repeated words followed by an incorrect one (or lack of a response) or incorrect answers followed by a correct one. The test concluded after 7 reversals.
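The SNR stepping rule described above (4-dB descending steps until the first error, then 2-dB steps up or down) can be illustrated with the following sketch. It only reproduces the SNR track for a given series of responses; the threshold calculation is not shown. The function name `aspn_track` is hypothetical, and clamping the SNR to the stated range of -15 to 9 dB is an assumption about how the limits were enforced.

```python
def aspn_track(responses, start_snr=9, min_snr=-15, max_snr=9):
    """Return the SNR (dB) presented on each trial.

    responses: iterable of booleans, True = word repeated correctly.
    """
    snr = start_snr
    first_error_seen = False
    track = []
    for correct in responses:
        track.append(snr)
        if correct:
            # Large 4-dB steps only before the first incorrect response
            step = 2 if first_error_seen else 4
            snr = max(min_snr, snr - step)
        else:
            first_error_seen = True
            snr = min(max_snr, snr + 2)
    return track


if __name__ == "__main__":
    # Three correct words, then an error, a correct word and another error
    print(aspn_track([True, True, True, False, True, False]))
```

In this example the track descends quickly from 9 dB to -3 dB and then oscillates in 2-dB steps around the listener's limit.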

Statistical Analysis
The statistical analyses were performed with SPSS 27.0 software (SPSS Inc., Chicago, Illinois) and R (R Core Team, 2020). To check the normality of the distributions of the variables in each group, the Shapiro-Wilk test was conducted, and Levene's test was used to verify whether the variances of the age groups were homogeneous. The results of the GDT and aSpN, as well as the REA index, were not normally distributed, which might have been caused by too many participants obtaining similar scores. For these variables the non-parametric Dunn test and the Kruskal-Wallis H test were used for comparisons between two or more age categories, respectively. The results were corrected for multiple comparisons (Bonferroni).
For the remaining variables (the DDT, DPT and FPT scores) the distributions were normal but the variances of most variables in particular age categories remained heterogeneous. We therefore used the Welch one-way ANOVA, which is recommended when the assumption of equal variances between the compared groups is not valid. The post-hoc comparisons were performed with the Games-Howell test, which also does not assume equal variances between the compared groups. Finally, bootstrapped paired t tests (n = 1000) were applied to determine the differences in the DDT scores between the right and left ears as well as between the DPT and FPT performance in each age category.
Test-retest reliability for each of the CAP tests was examined by the intra-class correlation coefficient (ICC) (McGraw & Wong, 1996) for a group of 12-year-olds retested after ca. 7 days (mean = 7.25 ± 1.86 days). ICC values and their 95% confidence intervals were based on a mean-rating (k = 2), consistency, 2-way random-effects model. For a categorical description of the level of reliability, we followed the suggestions by Koo and Li (2016): ICC > 0.9 indicates "excellent" reliability, ICC between 0.75 and 0.9 "good" reliability, ICC between 0.5 and 0.75 "moderate" reliability, and values < 0.5 correspond to "poor" reliability. Additionally, since the distributions in all tests in this group did not significantly deviate from normality, a series of bootstrapped paired t tests (n = 1000) was used to check for learning effects.
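As an illustration of the Welch one-way ANOVA mentioned above, the sketch below computes the Welch F statistic and its degrees of freedom from the standard textbook formulas, using only the Python standard library. It is not the SPSS or R routine used in the study; the function name `welch_anova` is hypothetical, and the p value (which requires the F distribution) is deliberately left out.

```python
from statistics import mean, variance


def welch_anova(groups):
    """Welch's F test for k independent groups with unequal variances.

    groups: list of lists of scores; each group needs at least two
    observations and a non-zero variance.
    Returns (F, df1, df2).
    """
    k = len(groups)
    if k < 2:
        raise ValueError("at least two groups are required")
    sizes = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    # Each group is weighted by n_i / s_i^2 (inverse variance of its mean)
    weights = [n / variance(g) for n, g in zip(sizes, groups)]
    total_w = sum(weights)
    grand = sum(w * m for w, m in zip(weights, means)) / total_w
    between = sum(w * (m - grand) ** 2 for w, m in zip(weights, means)) / (k - 1)
    lam = sum((1 - w / total_w) ** 2 / (n - 1) for w, n in zip(weights, sizes))
    f_stat = between / (1 + 2 * (k - 2) / (k ** 2 - 1) * lam)
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)
    return f_stat, df1, df2


if __name__ == "__main__":
    # Toy data, not taken from the study
    print(welch_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]]))
```

Because the denominator degrees of freedom are estimated from the data, `df2` is generally not an integer.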
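The reliability analysis can be sketched as follows: a mean-rating, consistency ICC computed from two-way ANOVA mean squares, in the notation of McGraw and Wong (1996) ICC(C,k) = (MS_rows - MS_error) / MS_rows, followed by the categorical labels of Koo and Li (2016). This is a minimal illustration without confidence intervals; the function names are hypothetical, and the treatment of values lying exactly on a category boundary is an assumption.

```python
def icc_consistency_mean(scores):
    """ICC(C,k): consistency, mean of k measurements, two-way model.

    scores: list of rows, one row per subject, one column per session.
    """
    n = len(scores)
    k = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / ms_rows


def reliability_label(icc):
    """Categories after Koo & Li (2016); boundary handling is assumed."""
    if icc > 0.9:
        return "excellent"
    if icc >= 0.75:
        return "good"
    if icc >= 0.5:
        return "moderate"
    return "poor"


if __name__ == "__main__":
    # Toy test-retest data: every subject improves by exactly 2 points
    toy = [[10, 12], [20, 22], [30, 32]]
    icc = icc_consistency_mean(toy)
    print(icc, reliability_label(icc))
```

Because a consistency ICC ignores systematic shifts between sessions, a uniform learning effect leaves it unchanged, which is why the paired comparisons between sessions are reported separately.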

Results
The age-related changes in particular CAP tests performance are shown in Fig. 1. Table 2 contains the descriptive statistics for each age group whereas the detailed results of post-hoc comparisons are shown in Tables S1-S5.

Ear Effect in the DDT
There was a significantly higher percentage of correct responses for the right ear compared to the left ear in all analyzed age categories (Fig. 1, Table 3).

Duration Vs. Frequency Patterns
In almost all age categories, except for the 7-year-olds, the duration patterns were recognized more accurately than the frequency patterns (see: Fig. 1 and Table 3).

Test-Retest Reliability
The intra-class correlation coefficients (ICC) for particular psychoacoustic tests, as well as the test-retest differences between the first and second CAP evaluation, are presented in Table 4. The reliability of individual performance between the first and second test session for the 12-year-old children was considered "good" for the DPT and DDT (left ear) and "moderate" for the remaining CAP tests. Only the GDT performance improved significantly (p = 0.03) after about a week.

Discussion
In the present study we sought to determine the dynamics of changes in performance on a psychoacoustic battery in typically developing Polish-speaking children and adolescents aged from 7 to about 16 years. Since we used our own versions of the tests, we considered it important to know how their outcomes change over time. The current study complements, to some extent, our previous findings (Włodarczyk et al., 2019) with: 1) the results of two psychoacoustic tests (measuring temporal resolution and speech-in-noise understanding) recommended for CAPD evaluation (Association (ASHA), 2005; DeBonis, 2015; Musiek et al., 2010), 2) the reference values for the age groups over 10 years and 3) the measures of CAP test-retest reliability.

Summary Results
We found significant age effects for the DDT, DPT and FPT but not for the aSpN and GDT performance (Fig. 1). The comparisons between particular age categories indicate that the 7-, 8-, 9-, 10- and 11-year-old children were indistinguishable in terms of the right ear DDT results and only the 7-year-olds performed worse in this test compared with all groups above 11 (Table S1). The DDT scores for the left ear were lower in both the 7- and 8-year-olds than in the age categories above 10 (or above 12, after taking into account the correction for multiple comparisons). The 9-, 10-, 11- and 12-year-olds showed significantly lower scores than the adults but, with the Bonferroni adjustment, only children at the age of 9 showed a trend towards fewer correct responses in this test compared with the adult group (Table S2). All age categories showed the right ear advantage (REA), which was higher in the 7-8 age range than in both the 15-year-olds and the adults (Table S3). In the DPT the 7-year-olds performed more poorly than the 13-, 14- and 15-year-olds and the adults, but there was also a trend towards lower scores at the age of 7 compared with the age of 10-11 (Table S4). Relative to all age groups above 10, the FPT performance in both the 7- and 8-year-olds was decreased; however, after adjusting for multiple comparisons, only the differences between the age of 7 and the groups older than 12, as well as between the 8-year-olds and the age categories above 11, reached the significance level (Table S5). The percentage of correct responses in the DPT was significantly higher than in the FPT in all age groups except for the 7-year-olds, whose performance was comparable in both these tests (Table 3). The test-retest reliability, assessed by calculating the intra-class correlation coefficients (ICC) based on two CAP evaluations in the 12-year-olds, could be considered "good" for the DPT and DDT (left ear) and "moderate" for the DDT (right ear), FPT, GDT and aSpN (Table 4; Koo & Li, 2016).

Dichotic Listening Improves in Polish Children Aged 7-13 Years
In the present study the DDT outcomes for the right ear became adult-like already at the age of 10 (Fig. 1, Table S1) and even the 7-year-olds performed very well in this test (about 75% correctness level, Table 2). Therefore, the slight improvement across the age range considered may reflect accidental fluctuations of attention/working memory rather than the CANS maturation process. The lack of significant differences in the right ear DDT outcomes between the age categories above 7 years was probably due to ceiling effects. The left ear DDT performance across the 7-12 age range was still lower compared with the adults, and the 13-year-olds achieved adult-like scores (Fig. 1, Tables 2, S2). These findings indicate that the development of dichotic listening in Polish children is completed by 13 years of age. Basically, our outcomes are consistent with previous reports (Kelly, 2007; Mattsson et al., 2018; McDermott et al., 2016; Neijenhuis et al., 2002; Pedersen, Dahl-Hansen, Christensen-Dalsgaard, & Brandt, 2017; Stollman et al., 2004); however, most authors investigated the improvement in dichotic digit listening in children younger than 12 years, arguing that CAP continue to develop until this age and that the results of older groups are relatively stable and comparable to those obtained by adults (Keith, 2000; Bellis, 2003; Kelly, 2007; Mattsson et al., 2018; McDermott et al., 2016; Schochat & Musiek, 2006). Our DDT results are comparable with those provided by Neijenhuis et al. (2002) in children (aged 7-12 years), adolescents (14-16 years) and adults (in both the right and the left ear the values in the latter two groups were almost the same but the 12-year-olds still performed worse compared to them).
The course of the left ear dichotic listening development up to 13 may result from both continuous maturation process of the auditory cortex/corpus callosum (Moncrieff, 2011) and developmental improvement in non-auditory cognitive skills (Stavrinos et al., 2018;Tomlin, Dillon, Sharma, & Rance, 2015).
Although our DDT results in children aged 7-10 years (Table 2) were comparable with those previously reported in the Polish population (Włodarczyk et al., 2019), in contrast to that study we failed to demonstrate significantly lower left ear scores in the 7-8-year-olds relative to the 9- and 10-year-olds, or better right ear performance in the 10-year-olds than at the age of 8 years. These discrepancies may result from the different statistical methods used to analyze the data.
We observed the greatest improvement in the left ear DDT with age, i.e. from 60% to 69% of correct responses, between 8 and 9 years (Fig. 1, Table 2). A similar difference of about 10% between 8- and 9-year-olds was also reported by other authors (Mattsson et al., 2018; Pedersen et al., 2017; Włodarczyk et al., 2019). The course of age-related improvement in the DDT may also reflect developmental changes in divided attention, since the measures of both these functions were found to be inter-correlated (Stavrinos et al., 2018). It is possible, then, that the DDT captures the moment of the highest [...] (Cameron et al., 2016; Kelly, 2007; McDermott et al., 2016). Interestingly, the youngest age groups in our study (7- and 8-year-olds) appear to achieve fewer correct responses in the left ear DDT compared with their peers from other countries (Cameron et al., 2016; Kelly, 2007; McDermott et al., 2016; Pedersen et al., 2017). This observation could be explained by the fact that, unlike in the test versions used in the aforementioned studies, the digit sequences to be repeated in the Polish task were longer (not only mono- but also bi-syllabic words were included). Therefore, our task might have higher cognitive demands, which could affect the results, especially in younger listeners. Perhaps it would be advisable to include in the Polish psychoacoustic battery a dichotic test containing verbal material other than numbers.

REA Effect
In our study the DDT scores in the right ear were significantly better than those obtained in the left ear in each age group, including young adults (Table 3). The large REA for dichotic verbal material (e.g. digits) has been relatively well documented in normally developing children (e.g. Kelly, 2007; Kimura, 1961a, 1961b; Mattsson et al., 2018; McDermott et al., 2016; Weihing et al., 2015), especially in young listeners (Hugdahl et al., 1990; Kimura, 1961a, 1961b), and is considered indicative of the immaturity of the CANS (less myelination) or greater suppression from the dominant ear (Moncrieff, 2011; Musiek & Weihing, 2011).
Considering the course of changes in the REA with age, the right ear dominance in our study was reduced from about 20% in the 7-year-olds to about 5% in the 13-year-olds and even less in adults (Fig. 1, Table 2). Our data are comparable with those found in Norwegian children (Mattsson et al., 2018), who showed an REA decline from about 20% (7-year-olds) to 6% in the 11-12-year-olds. In English-speaking children in the USA (McDermott et al., 2016; Weihing et al., 2015) or New Zealand (Kelly, 2007), for whom a decrease of the REA from 3 to 10% in the 7-8-year-olds to the ceiling effect in children older than 11 was observed, the right ear dominance for dichotic digits appears to be smaller than in Polish children in each age category between 7 and 12 years. Since larger REA indexes have been reported for higher-level linguistic material (Hugdahl et al., 1990; Kimura, 1961a, 1961b), it is possible that digits in Polish are more complex stimuli than those in English (in the Polish version of the test 6 of 10 digits are bi-syllabic words) and that this leads to higher REAs. Furthermore, in contrast to, e.g., Mattsson et al. (2018), where the handedness of participants was not controlled, in the current work only the results from right-handed persons were reported, which may also partially explain the increased REA values in our study.
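The REA values discussed above can be illustrated with a short calculation. The following is only a minimal sketch: it assumes that the REA is expressed as the simple difference between right and left ear percent-correct scores, which may differ from the index defined in the Methods, and it adds a normalized laterality index as a commonly used alternative. The function names and the example scores are hypothetical and are not data from this study.

```python
def right_ear_advantage(right_pct, left_pct):
    """REA as a difference in percent correct (right ear minus left ear).

    Assumption: simple difference score; the study may define REA differently.
    """
    return right_pct - left_pct


def laterality_index(right_pct, left_pct):
    """Normalized laterality index, 100 * (R - L) / (R + L)."""
    total = right_pct + left_pct
    if total == 0:
        raise ValueError("both ear scores are zero; index is undefined")
    return 100.0 * (right_pct - left_pct) / total


# Hypothetical percent-correct scores (right ear, left ear), for illustration only.
hypothetical_scores = {
    "younger group": (80.0, 60.0),
    "older group": (98.0, 94.0),
}

for group, (right, left) in hypothetical_scores.items():
    rea = right_ear_advantage(right, left)
    li = laterality_index(right, left)
    print(f"{group}: REA = {rea:.1f} points, laterality index = {li:.1f}")
```

The difference score is easy to read against the percentages reported in the text, whereas the normalized index also takes the overall performance level into account.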

Duration and Frequency Pattern Recognition Improve at Different Rates in Children Aged 7-13 Years
Similar to most previous reports (Dekerle & Meunier, 2018; Kelly, 2007; Mattsson et al., 2018; McDermott et al., 2016; Neijenhuis et al., 2002; Schochat & Musiek, 2006; Stollman et al., 2004; Weihing et al., 2015), we found a significant age effect in auditory pattern performance (Fig. 1, Tables S4-S5). As in previous studies (e.g. McDermott et al., 2016; Pedersen et al., 2017), the youngest groups showed large standard deviations for both DPT and FPT scores, which systematically decreased with age. Since the developmental variability of cortical responses is thought to be indicative of sustained attention (Strait, Slater, Abecassis, & Kraus, 2014) and, for example, FPT performance was thought to correlate with academic skills and real-life listening difficulty (Tomlin et al., 2015), the age-related improvement in acoustic pattern recognition in children (both a higher correctness level and lower variability of the scores) may simply reflect the development of general cognitive functions (attention, working memory). Furthermore, the ability to recognize temporal patterns is needed to extract and use prosodic aspects of speech such as rhythm, stress and intonation (Fletcher, 2010). Therefore, the changes in FPT and DPT performance reported here may also reflect the development of linguistic skills. The DPT and FPT results in our study were comparable with those provided by Włodarczyk et al. (2019) but, as was the case with the DDT, unlike that paper we did not find any significant differences between the 7-year-olds and the 9-10-year-olds in the DPT (probably due to the different statistical analysis). In the previous paper the age effect in the FPT was not explored because of the huge variability of the scores.
In the present study we also found large standard deviations in the outcomes of this test, especially in the youngest groups. However, these values were very similar to those found by other authors (Kelly, 2007; Mattsson et al., 2018; McDermott et al., 2016; Neijenhuis et al., 2002) and not high enough to stop us from analyzing the developmental changes in frequency pattern recognition.
In line with our expectations, the FPT turned out to be more challenging than the DPT for all age categories (see Table 3 for the results of direct comparisons between DPT and FPT scores in each group), which is not quite consistent with the results of other studies (Mattsson et al., 2018; McDermott et al., 2016; Neijenhuis et al., 2002; Romero-Díaz et al., 2011; Stollman et al., 2004). The DPT scores improved the most in the 7-10 age range, from 60% to 81% (Table 2), whereas FPT performance increased from ca. 53% of correct responses in 7-year-olds to ca. 80% in the 11- and 12-year-olds (Fig. 1, Table 2). The greatest progress in the FPT was observed between 7 and 9 years of age (from 53% of correct responses to 63%) and also between the ages of 10 and 11 (from a correctness level of 61% to ca. 72%). Since in both these tests the progress between subsequent age categories was relatively small, significant differences were found only between the 7-year-olds and the groups above 12 (DPT) and, in the case of the FPT, between the ages of 7-8 and the participants above 11-12. DPT performance became adult-like much earlier than FPT performance (at the ages of 10 and 13, respectively).
The highest discrepancy between the duration and frequency pattern scores was found in the 10-year-olds, i.e. a lower correctness level and increased variability of test results in the FPT (Fig. 1, Table 2). As mentioned in the Introduction, Polish children in the present study, mostly the younger ones, clearly preferred to verbalize the tones differently, e.g. as thin and thick, even though they were encouraged to respond using the high and low labels. In the DPT, on the other hand, there is only one way to report the order of the sounds, i.e. using the short or long labels. Since psychological studies clearly demonstrate that metaphors in language can shape people's nonlinguistic space-pitch mental representations (Dolscheid, Shayan, Majid, & Casasanto, 2013) and children only start to understand and make sense of metaphors around the age of 10 years (Vosniadou, 1987), it is possible that the lower and more variable FPT performance reflects the uneven development of the ability to comprehend and produce metaphorical language, observed especially at the beginning of this process. Therefore, the reason why Polish children performed better in the DPT than in the FPT might be a difficulty in understanding and verbalizing responses in the latter test.
Another possible explanation of this discrepancy is that the lower correctness level in the FPT in Polish children results from poorer music education (most public schools in Poland do not pay much attention to the development of the child's musical skills). This is even more likely when we consider that all participants who attended additional music classes were excluded from our study. FPT results comparable with ours were found in Portuguese-speaking children in Brazil (Schochat & Musiek, 2006), where there is no music education system. Thus, we cannot rule out the possibility that Polish-speaking children, adolescents and adults did not deal so well with the frequency patterns because of insufficient music training.
The observed differences between the duration and frequency pattern results could also be explained by a specific feature of the Polish language. One distinctive feature of Polish is consonant length, which is crucial for the intelligibility and expressiveness of the language (Nau et al., 2016). Since Polish children are trained in duration discrimination while listening to and using Polish every day, they might also achieve better scores in the DPT.
Children in our study were less accurate in the FPT than their English-speaking peers, who obtained ca. 65-70% of correct responses at the age of 7 years and ca. 88-91% at the age of 11-12 years (Bellis, 2003; Kelly, 2007; McDermott et al., 2016; Weihing et al., 2015). However, in contrast to the aforementioned studies, where 3-element auditory sequences were delivered monaurally (albeit with no significant between-ear differences), we used bilateral tone triplets. Therefore, the different stimulus presentation mode could partly account for the discrepancy in the scores achieved in this test by English- and Polish-speaking children. This effect may also arise from the differences in intonation between these languages: some structures of nuclear pitch patterns are present in English but absent in Polish, e.g. "low-high-low" or "high-low-high" patterns are not present in Polish (Demenko, 1999), and the perception of auditory patterns composed of high and low tones might therefore not be natural for Polish children.

Gaps- and Speech-in-Noise Detection Are Already Adult-like in 7-Year-Old Children
Gaps-in-noise detection is a commonly used procedure to investigate temporal resolution; it is relatively little affected by the attention or working memory of the listeners and has proven clinical utility (Efron et al., 1985; Musiek et al., 2005). In the present study we used the GDT with an adaptive algorithm to determine gap detection thresholds. Therefore, prior to using this test at the clinic, it is highly recommended to determine how its results change with age in normally developing children and adolescents, especially as it has been suggested that the threshold values may be affected by the stimulus parameters and procedure (Chermak & Lee, 2005; Phillips, Comeau, & Andrus, 2010). The GDT results in our study (Fig. 1, Table 3) were relatively stable within the age range considered and comparable with those of other authors (do Amaral et al., 2013; Bellis, 2003; Irwin et al., 1985; Ismaail et al., 2019; Mattsson et al., 2018; Shinn et al., 2009). This suggests that the mechanism responsible for encoding temporal aspects of auditory information is well developed even in the youngest school-age children. In our study, the minimum gap length needed to correctly detect a gap embedded in white noise ranged from approximately 3.9 ms in 7-year-olds to about 3.8 ms in 13-year-olds and 3 ms in young adults (Table 2), revealing a slight but inconsistent improvement in the GDT with age. In most studies in normal populations the auditory gap detection thresholds were about 3-4 ms (e.g. Irwin et al., 1985; Ismaail et al., 2019; Mattsson et al., 2018), with higher values being indicative of temporal lobe lesions (Musiek et al., 2005) that are not supposed to be found in normally developing children.
Our procedure and way of calculating the results are most similar to those of the test developed by Lister, Roberts, and Lister (2011), i.e. the Adaptive Tests of Temporal Resolution (ATTR), where the gap detection thresholds are determined with the use of an adaptive algorithm. Unlike in our study, they found considerable improvement in this test within the age range from 7 to 12 years. The discrepancy between the ATTR results and ours may arise from the different procedures applied (in the ATTR the listeners were exposed to pairs of noise bursts and had to decide which of them contained a larger gap, whereas in the GDT we asked the listeners to attend to a series of white-noise stimuli and press the button when one with an embedded gap occurred). Furthermore, our study differs from Lister et al. (2011) with respect to the properties of the stimuli applied and the way of calculating the results. Other well-known tools are the Random Gap Detection Test (Keith, 2000), which actually measures the auditory fusion threshold (the interval at which tone pairs separated by a silent gap are perceived as two, averaged with the interval at which the tone pairs are perceived as one), and the Gaps-in-Noise (GiN) test (Musiek et al., 2005), which consists in counting and reporting orally the number of gaps heard in white-noise segments. Although both these tests differ considerably from the task in our study, the outcomes and dynamics of their improvement in children up to the age of 12 years are comparable to those that we observed. This is in accordance with the conclusion of Chermak and Lee (2005) that different versions of the gap detection test produce comparable results.
In line with our prediction, both auditory gap detection and speech-in-noise test scores remained unchanged in the age range considered. Basically, our outcomes are congruent with other studies in which discrimination abilities for speech sounds appeared to be developed at relatively young ages (up to about 8 years of age) and older children did not differ from young adults in this ability (e.g. Keith, 2000; Neijenhuis et al., 2002; Wilson et al., 2010). We found the largest, but not statistically significant, decline of the SNR value between the 10- and 11-year-olds. We attribute this effect to accidental fluctuations of attention rather than to developmental changes of the auditory brainstem, where the ability to hear speech in the presence of background noise is represented (Song, Skoe, Banai, & Kraus, 2011). We did not expect that our relatively easy test, with single words presented against background noise, would be sensitive enough to capture these subtle developmental changes. Furthermore, since there is no evidence of a relationship between speech-in-noise performance and cognitive abilities in typically developing children aged 5.5-13 years (von Koss Torkildsen, Hitchins, Myhrum, & Wie, 2019), a slight improvement in the aSpN in our study resulting from causes other than brainstem maturation may also be excluded.
In the present study an adaptive algorithm was applied to determine the minimum SNR needed to repeat words correctly. In other versions of the task, words with the same SNR are presented to the child and the target stimulus is of greater intensity than the background noise (Keith, 1995). In this case, the number of correct responses for a particular SNR is calculated. In our opinion, however, determining the minimum SNR value with an adaptive measurement procedure can provide more precise information about an individual's speech perception ability and be more useful at the clinic. Despite the different versions of the speech-in-noise test used worldwide, with different verbal material, and consistently with most previous findings, we found no significant improvement of speech-in-noise intelligibility with age from 7 to 16 years. Therefore, the aSpN may be considered as a part of CAP evaluation.
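To make the idea of an adaptive threshold procedure more concrete, the sketch below shows a generic 1-up/1-down staircase: the level (e.g. SNR in dB, or gap length in ms) is made harder after a correct response and easier after an error, and the threshold is estimated as the mean level at the reversal points. This is not the algorithm used in the GDT or aSpN, whose rules are not described here. The function names, step size, stopping rule and the simulated listener are all assumptions made for illustration.

```python
from statistics import mean


def run_staircase(respond, start_level, step, max_reversals=8, max_trials=200):
    """Generic 1-up/1-down staircase (illustrative, not the study's algorithm).

    respond(level) must return True for a correct response at that level.
    Lower levels are assumed to be harder (e.g. lower SNR, shorter gap).
    """
    level = start_level
    last_direction = 0  # -1: level was lowered, +1: level was raised
    reversals = []
    for _ in range(max_trials):
        correct = respond(level)
        direction = -1 if correct else 1
        # A reversal is a change in the direction of the track.
        if last_direction != 0 and direction != last_direction:
            reversals.append(level)
            if len(reversals) == max_reversals:
                break
        last_direction = direction
        level += direction * step
    # Threshold estimate: mean of the levels at which the track reversed.
    return mean(reversals) if reversals else level


def make_ideal_listener(true_threshold):
    """Hypothetical deterministic listener: correct whenever level >= threshold."""
    return lambda level: level >= true_threshold


# Simulated run with a hypothetical "true" threshold of -3 dB SNR.
estimate = run_staircase(make_ideal_listener(-3.0), start_level=10.0, step=2.0)
print(f"estimated threshold: {estimate:.1f} dB SNR")
```

With a fixed-SNR procedure the outcome is a percentage of correct responses at one level, whereas a track of this kind converges on the level at which performance changes, which is the kind of threshold value discussed above.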

Test-Retest Reliability
As shown in Table 4, the ICCs for all CAP tests appear to be satisfactory. The lowest values were found for the adaptive procedures, i.e. the GDT and aSpN. Furthermore, a considerable learning effect was found only in the gap detection task. These results may suggest that the adaptive algorithms used in the GDT and aSpN to determine the threshold values should be modified to provide more consistent scores. However, since re-testing was performed for only one age group (12-year-olds), we refrain from drawing any general conclusions.
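For readers less familiar with the ICC, the following sketch shows how a test-retest coefficient can be computed from a participants-by-sessions table. It assumes the two-way random-effects, absolute-agreement, single-measure form, often written ICC(2,1); the form actually used for Table 4 is not stated in this section, so this choice, the function name and the example scores are all hypothetical.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    scores: list of rows, one per participant, each holding k session scores.
    Assumption: this ICC form is chosen for illustration only.
    """
    n = len(scores)
    k = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for participants, sessions and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-participants mean square
    msc = ss_cols / (k - 1)               # between-sessions mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical test and retest scores for three participants.
identical_retest = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
shifted_retest = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]  # everyone improves by 1

print(icc_2_1(identical_retest))
print(icc_2_1(shifted_retest))
```

In the second example every participant keeps the same rank but scores one point higher at retest. An absolute-agreement ICC penalizes such a systematic shift, which is one way in which a learning effect like the one noted for the gap detection task can lower the coefficient.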

Limitations and Further Directions
We are aware that the age groups studied here could have been larger to ensure greater reliability of the reported effects.
When interpreting the outcomes one should take into account that, because test outcomes were more variable in younger than in older participants, some differences between age groups (when as small as in our study) may not reach the significance level. Evaluating the same subjects at successive ages (e.g. every year from 7 to adulthood), as Stollman et al. (2004) did, might provide more thorough insight into the course of CAP performance improvement with age. Furthermore, all the tests applied in the present study have already been recommended as possible screening tools for CAPD in clinical practice (Iliadou et al., 2017; Jerger & Musiek, 2000). To make them useful at the clinic in Poland it would be advisable to determine age-appropriate normative data representative of the Polish-speaking population. It would also be desirable to take into account how the cognitive development of the child (e.g. attention and language) affects the improvement of particular CAP performance with age in children and adolescents.

Conclusions
Our study demonstrated an improvement in dichotic listening and in the recognition of frequency and duration patterns across the age range from 7 to 13 years. DPT performance was adult-like much earlier (already in 10-year-olds) than the DDT and FPT outcomes, where adult levels were not achieved until the age of 13. Gap detection and speech-in-noise intelligibility were adult-like and relatively stable between 7 and 16 years. Thus, particular auditory processes evolve at different rates, consistent with previous findings.
In comparison with other studies, the development of dichotic listening and of the ability to recognize frequency patterns appears to be slightly delayed in Polish children. The course of age-related improvement in the tests measuring these auditory processes may depend on the linguistic/cultural background of the person being examined and on the rate of development of his/her non-auditory cognitive abilities (attention, language). Since linguistically loaded psychoacoustic tests (dichotic listening, speech-in-noise comprehension) do not measure "pure" central auditory processes, their results should be interpreted with caution. It is recommended to consider using more accurate measures of auditory processing even when dealing with nonverbal sounds. Unifying the response method (e.g. humming) would allow frequency pattern performance to be compared between Polish and non-Polish-speaking children.
Unlike in some other populations, the frequency patterns were more difficult to recognize for Polish children and adolescents than the sequences of sounds differing in duration. This effect may be explained in terms of a specific feature of the Polish language (the lack of nuclear accent types that are present, e.g., in English), insufficient musical education, or a rather misleading response format in the FPT (the labels for low and high tones required in this test do not come naturally to Polish children).
Our study also provided reference values for the psychoacoustic tests in both children and adolescents aged from 7 to 16 years, complementing those previously found for Polish children up to 10 years of age (Włodarczyk et al., 2019). We also added to the Polish psychoacoustic battery two tasks measuring important aspects of auditory processing, i.e. temporal resolution and speech-in-noise intelligibility. Since all the tests achieved satisfactory reliability, they could be recommended for CAPD screening.
Acknowledgements This study was partially supported by the grant of the Polish National Science Centre no. UMO-2013/08/W/HS6/00333, entitled: "NeuroPerKog: development of phonematic hearing and working memory in infants and children".
Authors' Contributions M.L. designed the study, analyzed the data and wrote the manuscript in consultation with R.M. M.G., J.D. and E.W. collected the data and helped shape the manuscript. H.S. supervised the study.
Data Availability The data being reported here are available from the corresponding author on reasonable request.
Code Availability Not applicable.

Declarations
Conflicts of Interest/Competing Interests On behalf of all authors, the corresponding author states that there is no conflict of interest. The authors alone are responsible for the content and writing of the paper.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.