Amplitude

This chapter describes Ville Pulkki’s famous vector-base amplitude panning (VBAP) as the most robust and generic amplitude-panning algorithm, one that works on nearly any surrounding loudspeaker layout. VBAP activates the smallest possible number of loudspeakers, which gives a directionally robust auditory event localization for virtual sound sources, but it can also cause fluctuations in width and coloration for moving sources. Multiple-direction amplitude panning (MDAP), proposed by Pulkki, is a modification that increases the number of activated loudspeakers. In this way, more direction-independence is achieved at the cost of an increased perceived source width and reduced localization accuracy at off-center positions. As vector-base panning methods rely on convex hull triangulation, irregular loudspeaker layouts yielding degenerate vector bases can become a problem. Imaginary loudspeaker insertion and downmix is shown to be a robust method that improves the behavior, in particular for smaller surround-with-height loudspeaker layouts. The chapter concludes with some practical examples using free software tools that accomplish amplitude panning on vector bases.


Vector-Base Amplitude Panning (VBAP)
Assuming the $\boldsymbol{r}_\mathrm{V}$ model to predict the perceived direction, an intended auditory event at a panning direction $\boldsymbol{\theta}$, which we call the virtual source, can theoretically be controlled by the criterion according to V. Pulkki [2]

$$\boldsymbol{\theta} = \sum_{l=1}^{L} \tilde{g}_l\, \boldsymbol{\theta}_l. \qquad (3.1)$$

Here, $\boldsymbol{\theta}_l$ are the direction vectors of the loudspeakers involved, and the amplitude weights $\tilde{g}_l$ need to be normalized for constant loudness,

$$g_l = \frac{\tilde{g}_l}{\sqrt{\sum_{i=1}^{L} \tilde{g}_i^2}}. \qquad (3.2)$$

Moreover, the weights $g_l$ should always stay positive to avoid in-head localization or other irritating listening experiences. For loudspeaker rings around the horizon, always 1 or 2 loudspeakers will contribute to the auditory event. For loudspeakers arranged on a surrounding sphere, always 1 up to 3 loudspeakers will be used, whose directions must enclose the direction of the desired auditory event, the virtual source.
For the directional stability of the auditory event, the angle enclosed between the loudspeakers should stay smaller than 90°. The system of equations for VBAP [2] uses 3 loudspeaker directions and gains to model the panning direction $\boldsymbol{\theta}$,

$$\boldsymbol{\theta} = [\boldsymbol{\theta}_1, \boldsymbol{\theta}_2, \boldsymbol{\theta}_3] \begin{bmatrix} \tilde{g}_1 \\ \tilde{g}_2 \\ \tilde{g}_3 \end{bmatrix} = \boldsymbol{L}\, \tilde{\boldsymbol{g}} \;\Rightarrow\; \tilde{\boldsymbol{g}} = \boldsymbol{L}^{-1} \boldsymbol{\theta}, \qquad \boldsymbol{g} = \frac{\tilde{\boldsymbol{g}}}{\|\tilde{\boldsymbol{g}}\|}. \qquad (3.3)$$

The selection of the activated loudspeaker triplet is preceded by forming all triplets of the convex hull spanned by all the given playback loudspeakers. To find the loudspeaker triplet that needs to be activated, the list of all triplets is searched for the one with all-positive weights, $g_1 \geq 0$, $g_2 \geq 0$, $g_3 \geq 0$.

Figure 3.1 shows the localization curve for VBAP between a loudspeaker at 0° and one at 45° for a centrally seated listener and for one shifted to the left. The experiment is described in [4]; results were gathered using a 1.8 m circle of 8 loudspeakers, and listeners indicated the perceived direction by naming numbers from a 5° scale mounted on the loudspeaker setup. Black whiskers of the results (95% confidence intervals and medians) for the centrally seated listener indicate a mismatch in slope between the perceived angles and VBAP; the ideal curve is represented by the dashed line, and the mismatch can be understood by a better match of other exponents $\gamma$ in Fig. 2.6. The directional spread is quite narrow. For an off-center, left-shifted listening position, the perceived directions are shown in terms of a 5° histogram (gray bubbles) in Fig. 3.1. The mapping seems to be monotonic with the panning angle, and the perceived direction stays within the loudspeaker pair, which is a robust result, at least.

Figure 3.2 shows responses from [5], in which the panning angle on amplitude-panned lateral loudspeaker pairs was adjusted to match reference loudspeakers set up in steps of 15°. With VBAP, the adjusted angles match the reference directions fairly well. The $\boldsymbol{r}_\mathrm{E}$ vector model (black curve) delivers a better match, with only one exception at 105°.
This motivates vector-base intensity panning (VBIP) as an alternative strategy, Eq. (3.4).
This formulation appears more contemporary because the $\boldsymbol{r}_\mathrm{E}$ model predicts experimental results so well, as shown earlier.
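The triplet search described for VBAP, and a VBIP variant, can be sketched as follows. This is a minimal illustration, not the reference implementation: the helper names (`sph2cart`, `hull_triplets`, `vbap_gains`), the octahedral test layout, and the brute-force hull construction are all assumptions made for this example. The VBIP branch assumes that the vector-base system is solved for squared gains, so that the $\boldsymbol{r}_\mathrm{E}$ direction rather than the $\boldsymbol{r}_\mathrm{V}$ direction is controlled.

```python
from itertools import combinations
import numpy as np

def sph2cart(azi_deg, ele_deg):
    """Unit direction vector(s) from azimuth/elevation in degrees."""
    azi, ele = np.radians(azi_deg), np.radians(ele_deg)
    return np.stack([np.cos(azi) * np.cos(ele),
                     np.sin(azi) * np.cos(ele),
                     np.sin(ele)], axis=-1)

def hull_triplets(speakers, tol=1e-9):
    """Brute-force convex-hull faces (hypothetical helper for small layouts).
    A triplet is kept if all other loudspeakers lie on one side of its plane.
    Coplanar quadrilaterals yield overlapping triangles, i.e. the ambiguity
    is not resolved here."""
    faces = []
    for tri in combinations(range(len(speakers)), 3):
        a, b, c = speakers[list(tri)]
        normal = np.cross(b - a, c - a)
        if np.linalg.norm(normal) < tol:   # collinear points span no plane
            continue
        side = (speakers - a) @ normal
        if np.all(side <= tol) or np.all(side >= -tol):
            faces.append(tri)
    return faces

def vbap_gains(theta, speakers, triplets, intensity=False, tol=1e-9):
    """Search the triplets for all-non-negative weights, then normalize.
    intensity=True treats the solved weights as squared gains (VBIP, assumed)."""
    gains = np.zeros(len(speakers))
    for tri in triplets:
        idx = list(tri)
        L_mat = speakers[idx].T            # columns: loudspeaker directions
        try:
            g_tilde = np.linalg.solve(L_mat, theta)
        except np.linalg.LinAlgError:      # degenerate vector base
            continue
        if np.all(g_tilde >= -tol):
            g_tilde = np.clip(g_tilde, 0.0, None)
            if intensity:
                g_tilde = np.sqrt(g_tilde)
            gains[idx] = g_tilde / np.linalg.norm(g_tilde)
            return gains
    raise ValueError("no enclosing loudspeaker triplet found")

# Assumed test layout: 4 horizontal loudspeakers plus zenith and nadir.
speakers = sph2cart([0, 90, 180, 270, 0, 0], [0, 0, 0, 0, 90, -90])
triplets = hull_triplets(speakers)
theta = sph2cart(30, 20)
g_vbap = vbap_gains(theta, speakers, triplets)
g_vbip = vbap_gains(theta, speakers, triplets, intensity=True)
print(np.round(g_vbap, 3), np.round(g_vbip, 3))
```

In this sketch both gain vectors have unit energy. With VBAP the gain-weighted sum of loudspeaker directions points towards the panning direction, whereas with the VBIP variant the energy-weighted sum does.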

Non-smooth VBAP/VBIP width.
If one of the loudspeakers is exactly aligned with the virtual source for either VBAP or VBIP, e.g. $\boldsymbol{\theta}_1 = \boldsymbol{\theta}$, the resulting gains are $g_{1,2,3} = (1, 0, 0)$, and therefore only 1 loudspeaker will be activated. For a virtual source midway between 2 loudspeakers, e.g. $\boldsymbol{\theta}_1 + \boldsymbol{\theta}_2 \propto \boldsymbol{\theta}$, we obtain $g_{1,2,3} = (1, 1, 0)/\sqrt{2}$, and only 2 loudspeakers will be active. This change in the number of active loudspeakers yields audible variation in the perceived width and coloration. For virtual source movements that cross a common edge of neighboring loudspeaker triplets, there will often be unexpectedly pronounced jumps.

Multiple-Direction Amplitude Panning (MDAP)
In order to adjust the $\boldsymbol{r}_\mathrm{E}$ or $\boldsymbol{r}_\mathrm{V}$ vector not only in direction but also in length, and thus to control the number of active loudspeakers for moving sound objects, Pulkki extended VBAP to multiple-direction amplitude panning (MDAP [3]). In this way, not only the perceived width but also the coloration can be held constant.

Direction spread in MDAP.
MDAP employs more than one virtual source distributed around the panning direction as a directional spreading strategy. For horizontal loudspeaker rings, MDAP can consist of a pair of virtual VBAP sources at the angles $\varphi_\mathrm{s} \pm \alpha$ around the panning direction $\varphi_\mathrm{s}$. In a ring of $L$ loudspeakers with uniform angular spacing of $360^\circ/L$, the angle $\alpha = 90\% \cdot 180^\circ/L$ yields optimally flat width for all panning directions, as shown for $L = 6$ in the comparison between MDAP and VBAP in Fig. 3.4. Moreover, MDAP seems to align the direction of the $\boldsymbol{r}_\mathrm{E}$ measure with the direction of the $\boldsymbol{r}_\mathrm{V}$ measure, which is the one controlled by VBAP and MDAP.
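The flattening effect of the direction spread can be checked numerically. The following sketch assumes a uniform horizontal ring with $L = 6$, pairwise 2D VBAP, and MDAP as the energy-normalized sum of two VBAP gain vectors at $\varphi_\mathrm{s} \pm \alpha$. The function names and the use of the $\boldsymbol{r}_\mathrm{E}$ vector length as a width indicator are choices made for this illustration.

```python
import numpy as np

def ring_vbap(phi_deg, spk_deg, tol=1e-9):
    """Pairwise 2D VBAP gains on a horizontal ring (angles in degrees)."""
    spk = np.radians(np.asarray(spk_deg, dtype=float))
    phi = np.radians(phi_deg)
    theta = np.array([np.cos(phi), np.sin(phi)])
    order = np.argsort(np.mod(spk, 2 * np.pi))   # neighbors along the ring
    gains = np.zeros(len(spk))
    for i in range(len(order)):
        a, b = order[i], order[(i + 1) % len(order)]
        L_mat = np.array([[np.cos(spk[a]), np.cos(spk[b])],
                          [np.sin(spk[a]), np.sin(spk[b])]])
        g_tilde = np.linalg.solve(L_mat, theta)
        if np.all(g_tilde >= -tol):              # enclosing pair found
            g_tilde = np.clip(g_tilde, 0.0, None)
            gains[[a, b]] = g_tilde / np.linalg.norm(g_tilde)
            return gains
    raise ValueError("no enclosing loudspeaker pair found")

def ring_mdap(phi_deg, spk_deg, alpha_deg):
    """MDAP as two VBAP sources at phi +/- alpha, re-normalized in energy."""
    g = (ring_vbap(phi_deg - alpha_deg, spk_deg)
         + ring_vbap(phi_deg + alpha_deg, spk_deg))
    return g / np.linalg.norm(g)

def r_E_length(gains, spk_deg):
    """Length of the energy-weighted direction vector r_E."""
    spk = np.radians(np.asarray(spk_deg, dtype=float))
    w = gains**2
    r_E = np.array([w @ np.cos(spk), w @ np.sin(spk)]) / w.sum()
    return np.linalg.norm(r_E)

L = 6
spk_deg = np.arange(L) * 360.0 / L
alpha = 0.9 * 180.0 / L                          # 90% of half the spacing
phis = np.arange(0.0, 360.0 / L, 1.0)            # one loudspeaker interval
len_vbap = np.array([r_E_length(ring_vbap(p, spk_deg), spk_deg) for p in phis])
len_mdap = np.array([r_E_length(ring_mdap(p, spk_deg, alpha), spk_deg)
                     for p in phis])
print("VBAP |r_E| fluctuation:", len_vbap.max() - len_vbap.min())
print("MDAP |r_E| fluctuation:", len_mdap.max() - len_mdap.min())
```

Under these assumptions, the $\boldsymbol{r}_\mathrm{E}$ length of VBAP varies between 1 on a loudspeaker and a clearly smaller value midway between two loudspeakers, while the MDAP value stays nearly constant over the panning angle.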
Listening experiment results.
Experiments from [4] in Fig. 3.5 investigate the perceived width for two possible horizontal loudspeaker ring layouts, both with 45° spacing, one starting at 0° ("0") and the other at 22.5° ("1/2"). Widths of MDAP with a direction spread of α = 22.5° are perceived as significantly similar on both ring layouts, while VBAP yields significantly narrower results for panning onto the frontal loudspeaker in the "0" layout, which activates only a single loudspeaker. Note that VBAP1/2 and MDAP1/2 are identical with α = 22.5° and were treated as one condition. Moreover, a more constant width measure also reflects a more constant number of activated loudspeakers while panning.

Figure 3.6 shows that listeners can hear the difference in coloration changes when pink noise is panned in rotation at constant speed. The figure shows that coloration fluctuations of MDAP are always clearly smaller than with VBAP on similar loudspeaker rings. Moreover, coloration changes are more pronounced on rings of 16 loudspeakers than on rings of 8 loudspeakers, which is explained by their faster fluctuation.

Figure 3.7 shows the results from [6] for a central and a left-shifted off-center listening position when using MDAP on an 8-channel ring of loudspeakers. At the central listening position, the perceived directional spread around the loudspeaker positions 0° and 45° increases as expected, as indicated by the whiskers (95% confidence intervals and medians). Moreover, the spread of MDAP seems to slightly decrease the slope mismatch between the underlying VBAP algorithm and the perceptual curve around the 22.5° direction.
Although MDAP enforces a larger number of active loudspeakers, its localization is still similarly robust as that of VBAP, also at off-center listening positions. The perceived direction can be assumed to stay at least confined within a strictly directionally limited activation of loudspeakers. Correspondingly, the gray 5°-histogram bubbles of Fig. 3.7 indicate the perceived directions when the listener is located left-shifted off-center. While localization is slightly attracted by the closer loudspeaker at 0°, the larger spread causes a more monotonic outcome that is less split than with VBAP in Fig. 3.1.
For a more exhaustive study, Frank used 6 loudspeakers on the horizon and asked his listeners to align an MDAP-panned pink-noise source with acoustic references (harmonic complex) placed every 15° by adjusting the panning direction [5]. The results in Fig. 3.8 contain 24 answers from 6 subjects responding four times.

At least mathematically, however, MDAP requires post-optimization of the amplitudes and angles of the virtual sources in order to accurately match the desired $\boldsymbol{r}_\mathrm{V}$ or $\boldsymbol{r}_\mathrm{E}$ vector in direction and length on irregular loudspeaker arrangements, cf. [7]. Non-uniform $\boldsymbol{r}_\mathrm{V}$ vector lengths of the individual virtual sources involved cause a distorted resultant vector. In particular, their superposition is distorted towards those of the multiple virtual source directions with the longest $\boldsymbol{r}_\mathrm{V}$ vectors. Epain's article [7] proposes an optimization that retrieves the optimal orientation and weighting of the multiple virtual sources for every panning direction.

Challenges in 3D Triangulation: Imaginary Loudspeaker Insertion and Downmix
Surrounding loudspeaker hemispheres typically exhibit the following two problems:
• Loudspeaker rectangles at the sides of standard setups with height (ITU-R BS.2051-0 [8]) can be decomposed ambiguously into triangles at the sides, back, and top. This can yield noticeable ranges within loudspeaker quadrilaterals in which auditory events are unexpectedly created by just two of the loudspeakers.
• Signals of virtual sources below the horizon usually get lost.
The problem of unfavorable or ambiguous triangulations into loudspeaker triplets appears subtle; however, it can cause clearly audible deficiencies, especially when ambiguous triangulation yields asymmetric behavior between left and right, e.g., for the top, rear, and lateral directions, where we would manually define loudspeaker quadrilaterals instead of triangles, see [9].
As surrounding loudspeaker hemispheres are typically open by 180° towards below, VBAP/VBIP/MDAP is numerically unstable and theoretically useless for any panning direction below. Although the absence of loudspeakers below renders downward amplitude panning theoretically infeasible, it is still reasonable to preserve signals of virtual sources that are meant for playback on spherically surrounding setups.
In the case of the asymmetric loudspeaker rectangles, see Fig. 3.10, and of a missing lower hemisphere of surrounding loudspeakers, the insertion of one or more imaginary loudspeakers in the vertical direction (nadir) or in the middle of the rectangle (the average direction vector) has proven to be a useful strategy, e.g. in [10]. Any imaginary loudspeaker aims either at extending the admissible triangulation towards open parts of the surround loudspeaker setup, or at covering parts with potential asymmetry, see [9]. The signal of the imaginary loudspeaker can be dealt with in two ways:
• it can be dismissed; e.g., for an imaginary loudspeaker below at nadir, this would still yield a signal near the closest horizontal pair of loudspeakers for virtual sources panned to below-horizontal directions, unless they are panned exactly to nadir;
• it can be down-mixed to the neighboring $M$ loudspeakers by a factor of $1/\sqrt{M}$, or less as in Fig. 3.10; alternatively, for control yielding perfectly flat $E$ measures, the resulting down-mixed gain vector can be re-normalized by Eq. (3.2).
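The two ways of handling the imaginary loudspeaker signal can be sketched as a post-processing step on a gain vector that still contains the imaginary loudspeaker. The function name `downmix_imaginary`, its parameters, and the example layout (4 real loudspeakers plus one imaginary loudspeaker at nadir) are hypothetical and serve only to illustrate the bookkeeping.

```python
import numpy as np

def downmix_imaginary(gains, imag_idx, neighbor_idx, factor=None,
                      renormalize=True):
    """Remove an imaginary loudspeaker from a gain vector.

    gains        : gains including the imaginary loudspeaker
    imag_idx     : index of the imaginary loudspeaker in `gains`
    neighbor_idx : indices of the M real neighbors receiving the downmix
    factor       : downmix factor; None uses 1/sqrt(M), 0.0 dismisses the signal
    renormalize  : re-normalize the result to unit energy if it is non-zero
    """
    gains = np.asarray(gains, dtype=float)
    M = len(neighbor_idx)
    if factor is None:
        factor = 1.0 / np.sqrt(M)
    out = gains.copy()
    out[neighbor_idx] += factor * gains[imag_idx]   # distribute to neighbors
    out = np.delete(out, imag_idx)                  # keep real loudspeakers only
    energy = np.linalg.norm(out)
    if renormalize and energy > 0:
        out = out / energy
    return out

# Assumed example: real loudspeakers 0..3 on the horizon, imaginary one at index 4.
neighbors = [0, 1, 2, 3]
at_nadir = [0.0, 0.0, 0.0, 0.0, 1.0]    # virtual source panned exactly to nadir
below = [0.6, 0.0, 0.0, 0.0, 0.8]       # virtual source below the horizon

print(downmix_imaginary(at_nadir, 4, neighbors, factor=0.0, renormalize=False))
print(downmix_imaginary(at_nadir, 4, neighbors, renormalize=False))
print(downmix_imaginary(below, 4, neighbors))
```

With the signal dismissed, a source panned exactly to nadir disappears. With the factor $1/\sqrt{M}$ it is distributed to the $M$ neighbors with unchanged total energy, and the final re-normalization restores unit energy for directions in between.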

VBAP/MDAP Object for Pd
There is a classic VBAP/MDAP implementation by Ville Pulkki that is available as an external in Pure Data (Pd). The example in Fig. 3.11 illustrates its use together with some other useful externals in Pd. Software requirements are:
• pure-data (free, http://puredata.info/downloads/pure-data)
• iemmatrix (free download within pure-data)
• zexy (free download within pure-data)
• vbap (free download within pure-data).

SPARTA Panner Plugin
The SPARTA Panner, available at http://research.spa.aalto.fi/projects/sparta_vsts/plugins.html, provides an interface for vector-base amplitude panning (VBAP) and multiple-direction amplitude panning (MDAP), see Fig. 3.12, with frequency-dependent loudness normalization by $\sqrt[p]{\sum_{l=1}^{L} g_l^p}$ that is adjustable to the listening conditions, see Laitinen [11]. The parameter DTT can be varied between 0 (standard, frequency-independent VBAP normalization, i.e. diffuse-field normalization), 0.5 for typical listening environments, and 1 for the anechoic chamber. The plugin allows the user either to enter the azimuth and elevation angles of multiple panning directions (if more than one input signal is used) and of the playback loudspeakers manually, or to import/export them from/to preset files. Of course, all panning directions can be time-varying and can be moved by mouse, automation, or controls.
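The effect of the exponent $p$ in such a normalization can be illustrated with a few lines. This sketch assumes non-negative gains and an exponent chosen by the caller; how the plugin maps the DTT parameter and frequency to $p$ is not modeled here, and the function name `normalize_p` is hypothetical.

```python
import numpy as np

def normalize_p(gains, p=2.0):
    """Divide non-negative gains by their p-norm, (sum_l g_l^p)^(1/p)."""
    gains = np.asarray(gains, dtype=float)
    return gains / np.sum(gains**p) ** (1.0 / p)

g = [1.0, 1.0, 0.0]                     # virtual source midway between two loudspeakers
g_energy = normalize_p(g, p=2.0)        # constant sum of squared gains
g_amplitude = normalize_p(g, p=1.0)     # constant sum of gains
print(g_energy, g_amplitude)
```

For the two equally driven loudspeakers in this example, $p = 2$ gives each a gain of $1/\sqrt{2}$, while $p = 1$ gives each a gain of $1/2$.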