1 Introduction

The ability of humans to communicate is an essential characteristic that distinguishes us from other living beings and has played a crucial role in the evolution of the dominant species. People suffering from voice disorders have a reduced quality of life and today also poorer chances of professional success [4, 26, 42]. However, voice problems do not only affect the individual person, but are also of high social and economic relevance. A study in 2000 revealed that the annual losses within the Gross National Product of the USA are up to 186 billions dollars since approximately 10 percent of the entire US population are affected by communication disturbances [29].

The principal mechanism of voice generation is the self-sustained oscillation of the vocal folds (VFs) caused by the airstream coming from the lungs (see Fig. 1). Due to the vibrating vocal folds, the airstream is interrupted periodically generating acoustic waves dominated by the oscillation (fundamental) frequency \(f_{0}\). The arising primary voice source is subsequently modulated by the shape of the downstream oral and nasal airways referred as vocal tract.

Fig. 1.
figure 1

Physiology of human voice production (Color figure online)

In general, the purpose of numerical models of the human voice is either to gain a deeper insight into voice generation or a clinical application. Thereby, simVoice become a valuable clinical tool in phoniatrics by

  • contributing to a better understanding of pathological and physiological voice production,

  • helping to identify new treatment approaches,

  • predicting the outcome of conservative and surgical voice treatment, and

  • supporting training of medical stuff in phoniatrics.

For a patient specific application in a clinical environment, special focus in the development of the model simVoice must be put on computational efficiency, which is achieved by (1) applying the hybrid aeroacoustic approach based on the perturbed convective wave equation (PCWE) and (2) by circumventing fluid-structure interaction (FSI) modeling by prescribing the vocal fold motion (forward coupling).

Figure 2 illustrates the application of simVoice in a clinical environment as a tool supporting decision making, as well as for finding new treatment approaches. In the first step, the vocal fold motion is measured by high-speed imaging, and the voice of the patient is recorded. Furthermore, the geometry of the larynx and the vocal tract are obtained by magnetic resonance imaging (MRT) and the geometry and VF motion of the phonation model simVoice is adapted to the patient specific physiology. Subsequently, the geometry of the numerical model is modified according to the chosen treatment approach (e.g. surgical treatment), which is thought to yield an improvement of voice quality. In the next step, the voice of the modified geometry is computed by the numerical model. If the resulting voice quality is satisfying, the selected treatment can be performed. Otherwise, the geometry is modified again until a satisfying result is obtained. Thus, an efficient numerical model capturing the crucial physical aspects of phonation will contribute to a more effective treatment of voice disorders.

Fig. 2.
figure 2

Workflow of a prospective clinical application

The rest of the paper is organized as follows. In Sect. 2, the state of art and a classification of numerical phonation models is provided and in Sect. 3, the methodology of simVoice is presented according to the simulation workflow. The model of the larynx and the vocal fold motion is illustrated in Sect. 3.1 and the hybrid workflow is explained in Sect. 3.2 including the derivation of the governing equation for flow-induced sound generation and propagation. Subsequently, the steps of the hybrid workflow, namely the computational fluid dynamics simulation, aeroacoustic source term computation and interpolation, and computation of the acoustic field are presented within Sect. 3.2, before analyzing the computational efficiency of the phonation model (Sect. 4.1) and showing validation results (Sect. 4). Finally, a conclusion including an outlook are provided (Sect. 5).

2 Background

The interaction of fluid dynamics (airflow), structural mechanics of the VFs (tissue), and acoustics (air), which is the principal mechanism of voice production, can be described in different levels of complexity. On a fundamental level, models of human phonation can be distinguished between lumped mass models and models based on continuum mechanics with the respective governing partial differential equation (PDE) according to Fig. 3. For the latter approach, the governing PDEs are typically solved by a finite volume method (FVM), finite element method (FEM), or finite difference method (FDM). PDE-based models can be further classified by the motion of the vocal folds, which can be static, externally driven (prescribed motion) or self-excited due to fluid-structure-acoustic interaction (FSAI). Another crucial model property is dimensionality which has a substantial impact on the degrees of freedom (DOFs) and hence computation time. Thereby, it has to be noted that only three dimensional (3D) models are capable of reflecting the real physics since flow turbulence, which is a dominating phenomenon of phonation, requires three dimensions for a correct physical representation(vorticity distribution) [25].

Fig. 3.
figure 3

Classification of numerical models of human voice production

Lumped mass models

While lumped mass models were used in the past for principal investigations of the self-excited oscillation of the vocal folds [8, 12], they are still applied nowadays but in a three-dimensional (3D) arrangement (multi-mass models) to investigate material properties of the vocal folds [44, 45]. Due to the strong simplifications of this modeling approach, not all physical aspects can be captured. Therefore, PDE-based tools, which aim to resolve all physical details, are more promising for principal research as well as for clinical applications, which can be classified by the modeling approach of the vocal fold motion.

Self excited (FSI) PDE-based models

The most general approach is to model the interaction of the airflow and the structural mechanics of the vocal folds, which yields the self-excited oscillation of the vocal folds for certain flow and mechanic parameters. A first model considering fluid-structure interaction was introduced by Alipour et al. [1]. Thereby, the domain alteration was realized by mesh deformation, which induces distorted flow cells within the glottis. Thus, the immersed boundary method (IBM) was later applied [5, 13, 4648], which does not have this drawback. Besides ensuring the boundary conditions of the moving domain boundaries, a material model of the anisotropic tissue of the vocal folds has to be established. These two tasks require complex numeric procedures and high computational effort, which makes them not yet applicable in a clinical environment.

PDE-based models with static vocal folds

This modeling approach assumes a quasi-steady condition at specific instances in the oscillation cycle and aims to investigate the glottal aerodynamics but does not allow a thorough acoustic analysis [11, 24, 32].

Externally driven PDE-based models

Kaltenbacher et al. established a 2D FSAI-model which takes fluid-acoustic coupling into account by deploying acoustic perturbation equations within a hybrid aeroacoustic approach [17]. This model was simplified by Zörner et al., who replaced the fully-coupled interface condition by a one-way coupling [49]. Therewith, the costly coupled simulation was reduced to a pure computational fluid dynamic (CFD) simulation with a prescribed movement of the vocal folds and appropriate boundary conditions. In general, the presented results showed that a pure CFD-simulation with a prescribed structural movement can substitute the fully-coupled approach [49]. In simVoice, this modeling approach is applied in combination with a hybrid aeroacoustic approach based on the perturbed convective wave equation. The detailed description of the model with focus on the numerical efficiency is provided in [37], whereas a thorough description of the CFD incompressible flow simulation can be found in [31] and [30]. Furthermore, an extensive source term analysis for a validated synthetic setup is provided in [36] and the application to typical vocal cord dysfunctions is presented in [7].

3 Methods

Figure 4 illustrates the workflow of simVoice which is based on the hybrid aeroacoustic approach (Sect. 3.2) consisting of (1) a flow simulation, (2) an acoustic source term computation and interpolation and (3) an acoustic propagation simulation. Thereby, the acoustic source term excites the acoustic field and is defined by the right hand side of a the acoustic wave equation governing the propagation simulation and hence features the excitation of the acoustic field. In the acoustic propagation simulation, the VFs are fixed, whereas in the flow simulation they are externally driven. Therefore, the VF motion is obtained by fitting a vocal fold model (Sect. 3.1) to real physiological VF oscillations captured by glottal endoscopy.

Fig. 4.
figure 4

Workflow based on the hybrid aeroacoustic approach (Color figure online)

3.1 Geometry and vocal fold model

For modeling the VF-motion, the commonly used M5 model with two degrees of freedom (DOFs) and a sinusoidal glottal orifice is chosen (see Fig. 5) [33]. Thereby, each vocal fold has a degree of freedom (DOF) of translation \(y\) and rotation-DOF \(\phi \). Due to the rotation-DOF, the model is able to capture the characteristic convergent-to-divergent shape change of the glottal duct during an oscillation cycle [21, 22]. Furthermore, typical abnormalities and pathologies of vocal fold vibration, such as glottal insufficiency (incomplete VF closure), aperiodic or asymmetric oscillation can be reproduced. Figure 7 illustrates the VF model and provides the main dimensions, of the larynx. The false vocal folds (fVFs), also known as ventricular folds are modeled rigid since they are not excited by the glottal airstream. The temporal evolutions of the DOFs \(y\) and \(\phi \) are fitted to reproduce the measured glottal area waveform (GAW, see Fig. 6), which describes the temporal evolution of the area of the glottal orifice and can be obtained by high-speed videoendoscopy measurements during phonation in excised larynges (ex-vivo), as well as in living individuals (in-vivo). The vocal fold shape in the closed and the opened condition is shown in Fig. 7. Due to numerical issues of the flow solver, a gap of \(h_{\mathrm{{min}}}=0.2\) mm is remaining in the closed condition. A more detailed description of the applied vocal fold model and larynx geometry including the shape of the false vocal folds (ventricular folds) can be found in [30, 31].

Fig. 5.
figure 5

Geometry and principal measures of the synthetic larynx in coronal section at \(z=L/2\) (left) and axial section at \(x=0\) (right) with VF shape according to the M5 vocal fold model (all dimensions in mm)

Fig. 6.
figure 6

Glottal area waveform and evolvement of \(\phi (t)\) and \(y(t)\) with indication of the divergent and convergent phase of the synthetic glottal oscillation cycle according to [33]

Fig. 7.
figure 7

Synthetic vocal fold geometry

In the current version of simVoice, a simplified straight vocal tract with circular cross-sections according to [28] is used (see Fig. 8). Compared to a realistic physiological shape, the model complexity is reduced significantly while preserving the acoustic properties. A more detailed description of the simplification procedure can be found in [2, 28].

Fig. 8.
figure 8

Visualization the flow field by means of streamlines at \(t=0.5\, T\) (Color figure online)

3.2 Hybrid aeroacoustic approach

simVoice is based on the hybrid aeroacoustic approach which decomposes the task of sound computation into a flow computation and an acoustic propagation simulation [35]. For the latter part, the aeroacoustic source terms depending on the partial differential equation (PDE) governing the acoustic field (such as Lighthill acoustic analogy, acoustic perturbation equations, etc.) have to be computed from the flow field obtained in the first step [35]. The hybrid approach is commonly applied in computational aeroacoustics (CAA) [3, 16, 18, 27, 3941] due to several advantages:

  • Adjustment of simulation features to the specific requirements of the respective physical field. Since the characteristics and dominating phenomena of the flow and the acoustic field differ substantially, the numerical computation of these fields has their specific requirements:

    • Spatial discretization. The element size can be chosen individually for the flow and acoustic field (i.e. handle the disparity of length scales). Whereas in fluid dynamics boundary layers and turbulent eddies need to be resolved, in acoustics the acoustic waves have to be resolved. Furthermore, the domain of flow and acoustic simulation can differ. Fluid domain without an impact on the flow and containing no acoustic sources, such as acoustic propagation regions, can be neglected for the flow simulation.

    • Temporal resolution. While the minimum time step size of acoustics is defined by the maximum resolved frequency of the acoustic wave, the minimum time step size of CFD, in general, depends on turbulent phenomena that must be resolved.

  • Assumption of incompressible flow within the CFD simulation. For low Mach number flows (\(Ma<0.3\)), compressibility can be neglected and solving the incompressible Navier Stokes equations is sufficient for the subsequent acoustic source term computation. Thus, the computational demanding compressible flow simulation can be avoided.

In many applications, the reduction of computational effort due to valid simplification of the flow simulation outweighs the additional effort due to the acoustic simulation and the acoustic source term computation and treatment. Furthermore, when the source term computation is performed on the fly within the CFD-export, the additional effort is decreased.

For the present model, the perturbed convective wave equation (PCWE) is chosen to model the sound generation and propagation inside the acoustic domain. An advantage of this wave equation is its numerical efficiency, since it is a PDE with only one scalar unknown being the acoustic velocity potential \(\psi ^{\mathrm{a}}\). The derivation of so-called acoustic perturbation equations (APEs) [6] is based on the perturbation ansatz, which is applied to the quantities of the flow field

$$\begin{aligned} p =& \bar{p} + p^{\mathrm{ic}} + p^{\mathrm{c}} = \bar{p} + p^{\mathrm{ic}} + p^{\mathrm{a}} \end{aligned}$$
(1)
$$\begin{aligned} \boldsymbol{v} =& \bar {\boldsymbol{v}} + \boldsymbol{v}^{\mathrm{ic}} + \boldsymbol{v}^{\mathrm{c}} = \bar {\boldsymbol{v}} + \boldsymbol{v}^{\mathrm{ic}} + \boldsymbol{v}^{\mathrm{a}} \end{aligned}$$
(2)
$$\begin{aligned} \rho =& \bar{\rho}+ \rho _{1} + \rho ^{\mathrm{a}} \end{aligned}$$
(3)

according to [9]. Therewith, the pressure \(p\), the flow velocity \(\boldsymbol{v}\), and the density \(\rho \) are decomposed into a mean part (\(\bar{p}\), \(\bar {\boldsymbol{v}}\), \(\bar{\rho}\)) and a fluctuating part consisting of acoustic (\(p^{\mathrm{a}}\), \(\boldsymbol{v}^{\mathrm{a}}\), \(\rho ^{\mathrm{a}}\)) and incompressible flow components (\(p^{\mathrm{ic}}\), \(\boldsymbol{v}^{\mathrm{ic}}\)). Additionally, a density correction \(\rho _{1}\) is introduced.

By applying the splitting approach to the compressible flow equations, the APE-2 system can be derived [6], which was reformulated by Hüppe and Kaltenbacher [10, 15] to

$$ \frac{1}{c^{2}} \, \, \frac{D^{2}\psi ^{\mathrm{a}}}{D t^{2}} - \Delta \psi ^{\mathrm{a}}= - \frac{1}{\bar{\rho}c^{2}} \frac{D p^{\mathrm{ic}}}{D t} \, , $$
(4)

by introducing the acoustic velocity potential \(\psi ^{\mathrm{a}}\) defined as

$$ \boldsymbol{v}^{\mathrm{a}}= -\nabla \psi ^{\mathrm{a}}\, . $$
(5)

Thereby, the material parameters \(\bar{\rho}\) (incompressible fluid density) and \(c\) (speed of sound) depend on the condition of the airstream. For in-vivo phonation, the material parameters are set to \(\rho =1.145~\mbox{kg}\,\mbox{m}^{-3}\) and \(c=355~\mbox{m}\,\mbox{s}^{-1}\) corresponding to the condition of expiration air (\(T\approx 35\ {}^{\circ}\mbox{C}\), 95% relative humidity) whereas ex-vivo investigations ambient conditions have to be considered. The PDE describes aeroacoustic sources generated by incompressible flow structures and wave propagation through moving media. Since the solution quantity is a scalar unknown, this form has an enhanced numerical efficiency compared to the APE-2 system. The differential operator (material derivative) in Eq. (4) is defined by

$$ \frac{D}{Dt} = \frac{\partial }{\partial t} + \overline{ \boldsymbol{v}} \cdot \nabla $$
(6)

consisting of the partial time derivative and a convective term, where \(\overline{ \boldsymbol{v}}\) is the mean velocity. The right-hand side (RHS) of Eq. (4) defines the aeroacoustic source term

$$ \begin{aligned} f &= - \frac{1}{\bar{\rho}c^{2}} \, \frac{D p^{\mathrm{ic}}}{Dt} = - \frac{1}{\bar{\rho}c^{2}}\, \left ( \frac{\partial p^{\mathrm{ic}}}{\partial t} + \underbrace{\overline{ \boldsymbol{v}} \cdot \nabla p^{\mathrm{ic}}}_{ \text{convective source term}} \right ) \\ &\approx - \frac{1}{\bar{\rho}c^{2}}\, \frac{\partial p^{\mathrm{ic}}}{\partial t} \,, \end{aligned} $$
(7)

wherein the convective term can be neglected for low flow velocities \(\overline{ \boldsymbol{v}}\) due to the dominating partial time derivative especially at high frequencies. Since low flow velocity applies to human phonation, the simplification is exploited.

When the mean flow \(\overline{ \boldsymbol{v}}\) is also neglected in the wave operator on the left-hand side of Eq. (4), the weak formulation of the PCWE (4) reads as

$$\begin{aligned} &\frac{1}{c^{2}}\, \int \limits _{\Omega }\, \varphi \, \frac{\partial ^{2} \psi ^{\mathrm{a}}}{\partial t^{2}} \, {\,\mathrm{d}}\boldsymbol{x} + \int \limits _{\Omega }\, \nabla \varphi \, \cdot \nabla \psi ^{\mathrm{a}} \, {\,\mathrm{d}}\boldsymbol{x} = \\ &\int \limits _{\Gamma }\, \varphi \, \nabla \psi ^{\mathrm{a}}\cdot \boldsymbol{n} { \,\mathrm{d}}s - \frac{1}{\bar{\rho}c^{2}}\, \int \limits _{\Omega }\, \varphi \,f {\,\mathrm{d}}x \end{aligned}$$
(8)

with \(\varphi \) denoting the test function and the integrals being defined in the computational domain \(\Omega \) (volume term) and its boundary \(\Gamma \) (boundary term). This weak form of the PCWE is solved by the FE-based multi-physics solver openCFS for the solution quantity \(\psi ^{\mathrm{a}}\) [43]. The relation \(p^{\mathrm{a}}= \bar{\rho}\frac{D\psi ^{\mathrm{a}}}{D t}\) allows the calculation of the acoustic pressure \(p^{\mathrm{a}}\).

1) Computational fluid dynamics (CFD) simulation

To provide the flow field for the subsequent acoustic source term computation, the unsteady incompressible Navier-Stokes equations are solved by the CFD software Star-CCM+ (Siemens PLM Software, Plano, TX/USA). Compressible effects can be neglected due to the low flow velocity (mean convective Mach number inside the glottis \({\overline{ \mathrm{Ma}}} < 0.1\)). The computational domain comprises the larynx and the downstream vocal tract (Fig. 5). The overset mesh algorithm of Star-CCM+ is used to realize the vocal fold motion, which is based on the above introduced M5-model. This procedure does not allow full closure of the vocal folds. Thus, a gap of 0.2 mm is remaining in the closed condition, which causes a negligible leakage flow. A large eddy simulation (LES) in combination with the wall-adapting local eddy-viscosity (WALE) subgrid-scale model is used to model the turbulent flow. The computational grid consists of about 1.3 million hexahedral cells with two levels of refinement within the glottal region. Further details about the flow simulation can be found in [30, 31]. In Fig. 8 the resulting flow field at \(t=0.5\, T\) is visualized by means of streamlines.

2) Aeroacoustic source term computation and interpolation

Having determined the evolution of the flow field, the simplified acoustic source term of Eq. (7) can be computed on the CFD grid. The resulting source term at the time instance \(t=0.5\, T\) is displayed in Fig. 9. In a next step, the source term has to be interpolated from the CFD grid to the generally coarser acoustic grid (see Fig. 10). This is realized by a conservative interpolation scheme, which considers the intersection volumes of flow and acoustic cells and hence conserves the energy globally, as well as locally [34, 37, 38].

Fig. 9.
figure 9

Iso-Surfaces of the aeroacoustic source term \(f=\frac{\partial p}{\partial t}\) at \(t=0.5\, T\) with contour at \(\frac{\partial p^{\mathrm{ic}}}{\partial t} =\pm 6 \cdot 10^{5}\) Pa/s (Color figure online)

Fig. 10.
figure 10

Comparison of CFD (blue) and acoustic (red) mesh (Color figure online)

3) Computational acoustics simulation

Figure 11 shows the acoustic simulation setup. The static computational domain covers the CFD domain at the maximum opening of the vocal folds in order to capture all acoustic sources. The vocal fold motion can be neglected for the acoustic simulation because the glottal gap is acoustically compact since

$$ \mathrm{{He}}=\frac{f_{\mathrm{{max}}} \, h_{\mathrm{{max}}}}{c}=0.058 \ll 1 , $$
(9)

where \(h_{\mathrm{{max}}}=4~\mbox{mm}\) denotes the maximum glottal gap and \(f_{\mathrm{{max}}}=5~\mbox{kHz}\) the maximum resolved frequency. Thus, the changing acoustic domain has no impact on the wave propagation within the considered frequency range. In addition to the larynx and vocal tract, a propagation region is added, which represents the ambient air, where the voice is emitted. The adjacent perfectly matched layer (PML) region ensures free field radiation at the boundary [14], whereas at the larynx inlet wave reflections are avoided by an absorbing boundary condition (ABC) [14] at \(\Gamma _{\mathrm{{inlet}}}\). Compared to a PML, the ABC does not introduce additional DOFs but provides full absorption just for plane waves, which applies to the present model because

$$ f_{\mathrm{{max}}} \leq \frac{c}{2H}= 9861~\mbox{Hz} \, , $$
(10)

with \(H\) being the larger dimension of the duct cross-section (see Fig. 5) The remaining bounding surface \(\Gamma _{\mathrm{{wall}}}\) is modeled sound-hard by setting the acoustic particle velocity normal to the wall \(v_{n}=\nabla \psi ^{\mathrm{a}}\cdot \boldsymbol{n}\) to zero (natural boundary condition). In order to preserve element quality and meshing flexibility a non-conforming interface of type Nitsche is applied between the vocal tract and the propagation region, which is also implemented in openCFS [43]. The shape of the vocal tract in Fig. 11 is simplified from a realistic MRI-based human vocal tract (corresponding to the vowel ‘a’) to a straight rotational symmetric shape while preserving the acoustic filter characteristics according to the procedure presented in [2].

Fig. 11.
figure 11

Acoustic simulation setup with clipped propagation and PML region (left) and detail (right) of the clipped propagation and PML region (Color figure online)

The simulation setup is the result of a convergence study presented in [37]. Therein, the element size, the number of elements within the PML, and further aspects were successively modified to obtain an efficient acoustic simulation setup. An overview of cell sizes and element numbers is provided in Table 1. While the propagation region with adjacent PML region is meshed by linear hexahedral elements, linear tetrahedral cells are used for the larynx and the vocal tract to ensure an efficient mesh generation of complex structures. In total, the acoustic grid of the displayed configuration consists of \(93\, 200\) elements.

Table 1. Details of the computational grid

4 Results

In order to validate the numerical model simVoice, a measurement on a test rig with artificial vocal folds (M5-shaped) out of silicone and a simple cuboid-shaped vocal tract out of aluminum was performed [19, 20, 22, 23]. Thereby, the shape of the larynx and the vocal tract of the measurement and the simulation agreed. Figure 12 shows the resulting acoustic spectra of the measurement of 20 oscillation cycles of the VFs and the simulation, where the parts of the acoustic source term in Eq. (7) were considered separately as well as combined. The plot proofs that neglecting the convective source term is valid since its consideration only changes the spectrum insignificantly. Above 300 Hz, the spectrum of the measurement is well reproduced by the simulation. Reflections of the acoustic waves at the walls of the anechoic room in the measurement are thought to be the reason for the underestimation of the amplitude by the simulation underneath 300 Hz since this is the cut-off frequency of the anechoic room. A detailed description of the experimental setup is published in [19, 20, 22, 23].

Fig. 12.
figure 12

Resulting acoustic spectrum in the microphone point of the simulation and the measurement (Color figure online)

4.1 Computational efficiency

Since the goal of the present research work is a clinical application, a special focus is put on computational efficiency which is a crucial factor for the applicability in a clinical environment for patient specific treatment approaches. Table 2 summarizes the simulation details of the principal computational tasks of the hybrid workflow. It clearly shows that the bottleneck of the workflow is the CFD simulation demanding 94.5% of the total computation time. However, this emphasizes the potential of the hybrid approach in terms of computational efficiency, since the effort due to tasks introduced by the hybrid approach, is only about 5.5%. Therefore, the reduction of computational effort due to simplifications on the flow simulation (incompressible flow, reduction of computational domain) clearly outweighs the additional effort owing to the hybrid approach (source term computation and interpolation and acoustic propagation simulation). Moreover, circumventing FSI of the airflow and the vocal folds yields a massive reduction of computation time since the tissue has not to be discretized (no additional DOFs), no material model of the tissue is required and complex algorithms treating the moving boundary (VF surface) are avoided.

Table 2. Comparison of simulation details for voice computation of 15 VF oscillation cycles. For the source term computation and interpolation (ST) the cell number and time step size decisive for the computational effort is shown

5 Conclusion

A numerical model called simVoice, which is capable of predicting the voice based on a measured vocal fold motion and the geometry of the larynx and vocal tract was presented. The hybrid aeroacoustic approach based on the PCWE and the prescribed vocal fold motion, which circumvents modeling fluid-structure interaction, contributes to the high numerical efficiency of the model. However, the CFD simulation is still too computationally challenging to be performed on a conventional computer. Thus, special effort will be put into an efficiency-optimized flow simulation and source term computation. Moreover, the application of stochastic methods and artificial intelligence for the multiplication of flow data of a reduced number of oscillation cycles will be investigated. Nevertheless, numerical models capturing the crucial physical aspects of phonation will develop in the future and have the potential to support clinical work and to contribute to a deeper understanding of human voice production.