Background

Obesity is a public health problem with significant comorbidities. Its rising prevalence is troubling, especially in underdeveloped and developing countries, where the rate has increased by 40% in the last 30 years. Serious comorbidities accompany obesity, including musculoskeletal, metabolic, cardiovascular, endocrine, and psycho-emotional diseases [1].

Voice production involves air exhaled from the lungs, which is modulated at the vocal folds and then modified in the resonating cavities (the pharynx and the oral and nasal cavities) and by structures such as the lips, tongue, and palate. The attributes of a normal voice are [2]: loudness (subjective sensation of intensity) adequate to the surroundings; pitch (subjective sensation of frequency) appropriate to age and gender; pleasant quality (pertaining to resonance and glottic noise); freedom from random noise; flexibility, with variations in emphasis; and significance and subtleties (arising from interactions of the previous attributes). Dysphonia can be defined as a symptom characterized by “a difficulty in emitting voice with its natural characteristics” [3]. Perceptual–auditory assessment is the traditional, routine clinical evaluation of vocal quality [4, 5].

Because obesity produces physical and pathological modifications that compromise virtually all of the body’s systems, the vocal behavior of obese patients has become a subject of interest [6].

The value of computerized acoustic vocal analysis is widely recognized because, besides providing qualitative data, it allows quantitative analysis of vocal parameters [7]. Correlations between acoustic and perceptual assessments are usually moderate. This is to be expected, since listeners’ perceptual ratings are influenced by individual, cultural, and social factors, by the perceived relevance of various aspects of the signal, and by the limitations of the human hearing system. However, since voice and voice quality are perceptual by nature, perceptual voice characteristics have greater intuitive meaning than many instrumental measures [8].

Acoustic analysis of voice is quantitative, non-invasive, and cost- and time-efficient. It can distinguish normal from abnormal voice and may even suggest the underlying pathology. Commonly used acoustic parameters are fundamental frequency, jitter, shimmer, and noise-to-harmonic ratio. Fundamental frequency (F0) is the number of vibratory cycles the vocal folds complete per second, measured in hertz (Hz) [9]. Jitter is the cycle-to-cycle variation of frequency, and shimmer is the cycle-to-cycle variation of the amplitude of the sound wave [10]. The noise-to-harmonic ratio rests on the fact that the spectrogram of a normal voice shows well-developed harmonics (integer multiples of the fundamental frequency), whereas a dysphonic voice fails to show strong harmonics. The ratio therefore compares the acoustic energy of noise with that of the stable harmonics, and it rises as the harmonic structure weakens [11].
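The perturbation measures defined above can be illustrated with a minimal sketch. It assumes one common formulation (the mean absolute difference between consecutive cycles, expressed as a percentage of the mean), which is not necessarily the definition used by the analysis software in this study. The function name and the cycle values are hypothetical and serve only as an illustration.

```python
from statistics import mean

def relative_perturbation(values):
    """Mean absolute cycle-to-cycle difference as a percentage of the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * mean(diffs) / mean(values)

# Hypothetical cycle-by-cycle measurements from a sustained /a/ (not study data).
periods_ms = [3.80, 3.84, 3.78, 3.82, 3.79, 3.83]   # duration of each glottal cycle
amplitudes = [0.61, 0.63, 0.60, 0.62, 0.64, 0.61]   # peak amplitude of each cycle

f0_hz = 1000.0 / mean(periods_ms)                   # cycles per second from mean period
jitter_pct = relative_perturbation(periods_ms)      # perturbation of cycle duration
shimmer_pct = relative_perturbation(amplitudes)     # perturbation of cycle amplitude

print(f"F0 = {f0_hz:.1f} Hz, jitter = {jitter_pct:.2f} %, shimmer = {shimmer_pct:.2f} %")
```

A perfectly periodic signal gives zero for both measures; larger values indicate less stable vocal fold vibration.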

According to hypotheses relating body weight to voice production [12, 13], obesity affects the voice through the effect of excessive body weight on abdominal breath support. In extreme cases, obesity can also affect resonance by significantly reducing the pharyngeal lumen [13]. The hypothesis of this study was that obesity might affect the voice, so that early detection and management would be warranted.

The objective of the study was to analyze the impact of obesity in children on selected voice parameters, objectively by acoustic voice analysis (fundamental frequency, jitter, shimmer, and noise-to-harmonic ratio) and subjectively by auditory perceptual assessment; to compare these results with those of age- and gender-matched controls; and to determine whether age played a role in the voice analysis results. This analysis may help in developing strategies for the prophylaxis and management of voice disorders, if needed.

Methods

Subjects

This cross-sectional study included 30 pre-pubertal children aged 4 to 12 years (mean 7.63 years; Table 1) with nutritional obesity (BMI > 3 SDS). Exclusion criteria were: other causes of obesity, such as endocrine or syndromic obesity; associated chronic conditions, such as diabetes or allergic diseases; a history of prolonged medication use, such as steroids; symptoms of gastro-esophageal reflux disease (GERD); any signs of puberty; and a history of voice disorder or voice abuse. These strict exclusion criteria were intended to eliminate confounding factors (other than weight) that could affect voice production. Patients were recruited from the obesity clinic of the Diabetes, Endocrine and Metabolism Pediatric Unit (DEMPU) between May and December 2017. Thirty age- and sex-matched, healthy, normal-weight children were included as a control group; they were recruited from the general outpatient clinic and had no conditions that affect voice, such as respiratory, allergic, or otolaryngological diseases.

Table 1 Descriptive analysis of age regarding each group and between-group comparisons—N = 60, n controls = 30, n cases = 30

Methodology

All children were subjected to the following protocol of assessment.

Elementary diagnostic procedures

  1. Patient interview: thorough history taking and clinical examination, with emphasis on the patient’s age, gender, history of chronic and allergic diseases, chronic medications such as steroids, history of voice disorder or voice abuse, and any symptoms of reflux disease. All children underwent a complete general physical examination. Written consent to participate in the study was obtained from each child’s guardian.

  2. Auditory perceptual assessment [5], performed by two experts.

figure a

Clinical diagnostic aids

Flexible laryngoscopy was performed in cases with dysphonia.

Additional instrumental measures

Acoustic analysis

Acoustic analysis was performed using the Computerized Speech Lab (CSL) model 4500 (Kay Elemetrics). The system consists of an external module with high-speed, dual-channel 16-bit A/D and D/A conversion; a plug-in board with two high-speed digital signal processing integrated circuits; a high-quality Shure microphone; studio-quality speakers; and software that provides the tools needed to acquire, analyze, and play back speech signals with high fidelity. The CSL acoustic analysis software imports and analyzes signal data at any sampling rate, and the digital and analog anti-aliasing on input and output supports a wide range of sampling rates and frequencies. Each subject was asked to produce a sustained vowel /a/ at a comfortable frequency and amplitude, and the signal was picked up by a dynamic microphone placed 20 cm in front of the mouth. All parameters were assessed in an acoustically treated room. The following values were obtained:

  1. Fundamental frequency (F0), in Hz

  2. Jitter

  3. Shimmer

  4. Noise-to-harmonic ratio (N/H)

Statistical analysis

Descriptive analyses

  (a) Continuous variables (i.e., age and voice parameters) were described in terms of mean, median, standard deviation (SD), and range for each group.

  (b) One categorical variable (i.e., gender) was described in terms of frequencies and percentages.
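The descriptive summaries listed above can be sketched with the Python standard library. This is only an illustration: the helper names are hypothetical, the ages are made-up values rather than study data, and the use of the sample (n − 1) standard deviation is an assumption. The gender counts are those reported for the obese group.

```python
from collections import Counter
from statistics import mean, median, stdev

def describe(values):
    """Mean, median, sample SD, and range of a continuous variable."""
    return {
        "mean": mean(values),
        "median": median(values),
        "sd": stdev(values),              # sample SD (n - 1 denominator), an assumption
        "range": (min(values), max(values)),
    }

def frequencies(labels):
    """Count and percentage for each level of a categorical variable."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {level: (n, round(100.0 * n / total, 2)) for level, n in counts.items()}

ages_years = [4.0, 6.5, 7.0, 8.5, 12.0]          # hypothetical ages, not study data
sex_cases = ["male"] * 13 + ["female"] * 17      # counts reported for the obese group

print(describe(ages_years))
print(frequencies(sex_cases))
```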

Testing for normality

The Shapiro-Wilk test was applied to assess the normality of the data and thereby choose the appropriate comparative and correlation tests.

Comparative analysis

  (a) For normally distributed data, the parametric Welch’s t test was applied to assess between-group differences in numerical data (i.e., age).

  (b) The non-parametric Mann-Whitney U test was applied to data that were not normally distributed (i.e., voice parameters).

  (c) Pearson’s chi-squared test was used to assess differences in categorical data (i.e., gender).
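To show what the rank-based comparison in (b) computes, the following is a minimal sketch of the Mann-Whitney U statistic, with tied values given the average of their ranks. It returns only the two U values and omits the p value, which statistical software would derive from them. The function names and sample values are hypothetical, not study data.

```python
def midranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1             # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(x, y):
    """Return (U for x, U for y) computed from the pooled ranks."""
    ranks = midranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:len(x)])
    u_x = rank_sum_x - len(x) * (len(x) + 1) / 2
    u_y = len(x) * len(y) - u_x
    return u_x, u_y

# Hypothetical jitter values for two small groups (not study data).
controls = [0.6, 0.8, 0.9, 1.1]
cases = [1.2, 1.9, 2.4, 3.0]
print(mann_whitney_u(controls, cases))
```

When every value in one group lies below every value in the other, that group's U is 0, the most extreme separation possible.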

Correlation analysis

  (a) To assess the relationship between age and normally distributed voice parameters, Pearson’s correlation coefficient was applied.

  (b) To assess the relationship between age and voice parameters that were not normally distributed, Spearman’s correlation coefficient was applied.
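The two coefficients differ only in what they are computed on: Pearson's uses the raw values, while Spearman's is Pearson's coefficient applied to the ranks. A minimal sketch under that definition follows; the function names and the age and F0 values are hypothetical, not study data.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def midranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's coefficient: Pearson's r computed on the ranks."""
    return pearson_r(midranks(x), midranks(y))

# Hypothetical ages (years) and fundamental frequencies (Hz), not study data.
ages = [4, 6, 8, 10, 12]
f0 = [300, 280, 270, 250, 240]
print(pearson_r(ages, f0), spearman_rho(ages, f0))
```

A negative coefficient corresponds to an inverse relationship, such as fundamental frequency falling as age increases.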

Results

The study included 30 obese children: 13 males (43.33%) and 17 females (56.67%), with a mean age of 7.63 ± 2.51 years (range 4 to 12 years). The control group included 30 non-obese children: 14 males (46.67%) and 16 females (53.33%), with a mean age of 8.12 ± 1.68 years (range 4.08 to 11 years). The two groups were age- and sex-matched (p value 0.39 and 1.0, respectively).

Demographic characteristics

Table 1 shows the descriptive analysis of age for each group and the between-group comparison.

  • Age: the mean age of the controls is 8.12 (± 1.68) years, which is higher than that of the cases (7.63 ± 2.51 years); this difference is not statistically significant (p value = 0.386).
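As an illustration of the age comparison, Welch's t statistic and its approximate degrees of freedom (Welch-Satterthwaite) can be computed directly from the reported means, SDs, and group sizes. This is a sketch with a hypothetical function name; it does not compute the p value, and small differences from the reported result are expected because the inputs are rounded.

```python
from math import sqrt

def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1                    # squared standard error, group 1
    v2 = sd2 ** 2 / n2                    # squared standard error, group 2
    t = (mean1 - mean2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Reported summary statistics: controls 8.12 ± 1.68, cases 7.63 ± 2.51, n = 30 each.
t, df = welch_t_from_summary(8.12, 1.68, 30, 7.63, 2.51, 30)
print(f"t = {t:.2f}, df = {df:.1f}")
```

With these inputs the statistic is roughly 0.89 on about 51 degrees of freedom, a small value in line with the non-significant difference reported.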

Table 2 shows descriptive analysis of gender regarding each group and between-group comparisons.

  • Gender: 47% of the control group are males and 53% are females, while in the cases group 43% are males and 57% are females. The difference between the groups is not statistically significant (p value = 1).
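The gender comparison can be sketched as a chi-squared test on the 2 × 2 table of reported counts (controls: 14 males, 16 females; cases: 13 males, 17 females). The source does not state whether a continuity correction was applied; the sketch assumes Yates' correction, under which these counts give a p value of 1. The function name is hypothetical.

```python
from math import erfc, sqrt

def chi2_2x2(a, b, c, d, yates=True):
    """Chi-squared statistic and p value (1 df) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            diff = abs(observed[i][j] - expected)
            if yates:
                diff = max(0.0, diff - 0.5)   # continuity correction (assumed)
            chi2 += diff ** 2 / expected
    p = erfc(sqrt(chi2 / 2))                  # upper tail of chi-squared with 1 df
    return chi2, p

# Rows: controls, cases. Columns: males, females.
print(chi2_2x2(14, 16, 13, 17))               # with Yates' correction
print(chi2_2x2(14, 16, 13, 17, yates=False))  # without correction
```

Without the correction the statistic is small but non-zero, so the p value is below 1 while remaining far from significance.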

Table 2 Descriptive analysis of gender regarding each group and between-group comparisons

Table 3 shows Shapiro-Wilk test for normality for variables in both treatment groups.

Table 3 Shapiro-Wilk test for normality for variables in both treatment groups

Auditory perceptual assessment results

Auditory perceptual assessment revealed dysphonia in 18 of the 30 obese children (60%): grade 1 in 11 and grade 2 in 7. The dysphonia was strained, low-pitched, and of decreased loudness; associated laryngeal functions were not affected.

Flexible laryngoscopic examination

Flexible laryngoscopic examination of the patients with dysphonia revealed a phonatory gap in three patients despite the absence of a history of voice abuse.

Voice parameters by CSL

Table 4 shows comparison of the numerical predictors between cases (N = 30) and controls (N = 30)

Table 4 Comparison of the numerical predictors between cases (N = 30) and controls (N = 30)

Fundamental frequency

  • Mean fundamental frequency in the controls is 275.5 (± 65.59) Hz, which is higher than that of the cases (259.6 ± 43.43 Hz); this difference is not statistically significant (p value = 0.959).

Mean fundamental frequency

  • The mean of the mean fundamental frequency in the controls is 263.91 (± 55.61) Hz, which is higher than that of the cases (252.7 ± 39.14 Hz); this difference is not statistically significant (p value = 0.728).

Jitter

  • Mean Jitter in the controls is 0.84 (± 0.41) Hz which is lower than that of the cases (2.04 ± 1.28 Hz); this difference is statistically significant (p value < 0.001).

Shimmer

  • Mean shimmer in the controls is 0.68 (± 0.48) dB, which is higher than that of the cases (0.52 ± 0.27 dB); this difference is not statistically significant (p value = 0.105).

Noise to harmonic ratio

  • Mean noise to harmonic ratio in the controls is 0.23 (± 0.37) which is lower than that of the cases (0.51 ± 0.45); this difference is statistically significant (p value < 0.001).

Correlations

Table 5 shows correlation analysis between the age and the numerical predictors in the study group.

Table 5 Correlation analysis between the age and the numerical predictors in the study group

Age with:

  • Fundamental frequency: a significant inverse correlation appeared in the control group. (As age increases, fundamental frequency decreases.)

  • Mean fundamental frequency: a significant inverse correlation appeared in the control group.

  • Jitter: A significant direct correlation appeared in both control and case group. (As age increases, jitter increases.)

  • Shimmer: Insignificant correlations in both control and case group.

  • Noise to harmonics: insignificant correlations in both control and case group.

Discussion

Auditory perceptual assessment showed dysphonia in 18 cases: grade 1 in 11 cases and grade 2 in 7 cases. Flexible laryngoscopy in the patients with dysphonia revealed a phonatory gap in 3 patients despite the absence of a history of voice abuse.

Studies that compared auditory perceptual analyses of voice between obese and control groups reported worse scores in the obese group [14, 15]. These findings can be explained by difficulty in producing and maintaining airflow under adverse anatomical conditions, such as adipose accumulation around the abdomen and neck, and by flaccidity of the pharyngeal tissues, which reduces the glottic cycle [14, 16].

The absence of structural laryngeal pathology, apart from a phonatory gap in three cases, despite significant changes on acoustic analysis (Table 4) and dysphonia in 60% of cases, could be explained by dysphonia preceding the structural pathology visible on laryngoscopy, as described by Welham and Maclagan (2003) [17].

Jitter and noise-to-harmonic ratio differed significantly between cases and controls, with higher values in cases, indicating a voice problem (Table 4); these findings could be explained by the effect of obesity on the configuration of the laryngeal tract. A similar finding was reported in a study showing that obese individuals have an altered voice pattern [18]. Another study attributed voice changes in obese patients to abnormal fat deposition in the abdominal region and upper airways, especially in the posterior and lateral walls of the pharynx, the soft palate, the uvula, and the posterior region of the tongue, which alters vocal performance [14]. A modified respiratory pattern is highly prevalent in obese people for several reasons, including the increased mass of the upper airway. This modified respiratory pattern, together with changes in pharyngeal size and configuration, may therefore affect the acoustic parameters in obese individuals [19].

Correlation of voice parameters with age revealed a positive correlation with jitter and a negative correlation with fundamental frequency (F0). The lack of homogeneity in the results of voice analysis in children shows the importance of additional studies with larger samples.

Conclusion

The voice of children with morbid obesity shows significant modifications in vocal characteristics compared with that of non-obese children. Further studies are needed to identify the factors that explain why obese persons have characteristically different voices from non-obese persons.