Amplitude Panning Using Vector Bases

Zotter, Franz; Frank, Matthias

doi:10.1007/978-3-030-17207-7_3


Part of the book series: Springer Topics in Signal Processing ((STSP,volume 19))


Abstract

This chapter describes Ville Pulkki’s famous vector-base amplitude panning (VBAP) as the most robust and generic algorithm of amplitude panning that works on nearly any surrounding loudspeaker layout. VBAP activates the smallest-possible number of loudspeakers, which gives a directionally robust auditory event localization for virtual sound sources, but it can also cause fluctuations in width and coloration for moving sources. Multiple-direction amplitude panning (MDAP) proposed by Pulkki is a modification that increases the number of activated loudspeakers. In this way, more direction-independence is achieved at the cost of an increased perceived source width and reduced localization accuracy at off-center positions. As vector-base panning methods rely on convex hull triangulation, irregular loudspeaker layouts yielding degenerate vector bases can become a problem. Imaginary loudspeaker insertion and downmix is shown as robust method improving the behavior, in particular for smaller surround-with-height loudspeaker layouts. The chapter concludes with some practical examples using free software tools that accomplish amplitude panning on vector bases.

The method is straightforward and can be used on many occasions successfully.

Ville Pulkki [1], Ph.D. Thesis, 2001.


Vector-base amplitude panning (VBAP) was extensively described and investigated in [2], along with the stabilization of moving sources by adding spread with multiple-direction amplitude panning (MDAP) [3]. Since then, VBAP and MDAP have become the most common and popular amplitude panning techniques; they are particularly robust and can automatically adapt to specific playback layouts.

3.1 Vector-Base Amplitude Panning (VBAP)

Assuming that the $\varvec{r}_\mathrm {V}$ model predicts the perceived direction, an intended auditory event at a panning direction $\varvec{\theta }$, which we call the virtual source, can theoretically be controlled by the criterion of V. Pulkki [2]

$$\begin{aligned} \varvec{\theta }&=\sum _{l=1}^\mathrm {L}\tilde{g}_l\,\varvec{\uptheta }_l. \end{aligned}$$

(3.1)

Here, $\varvec{\uptheta }_l$ are the direction vectors of the loudspeakers involved and the amplitude weights $\tilde{g}_l$ need to be normalized for constant loudness

$$\begin{aligned} g_l&=\frac{\tilde{g}_l}{\sqrt{\sum _{l=1}^\mathrm {L}\tilde{g}_l^2}}. \end{aligned}$$

(3.2)

Moreover, the weights $g_l$ should always stay positive to avoid in-head localization or other irritating listening experiences. For loudspeaker rings around the horizon, 1 or 2 loudspeakers always contribute to the auditory event; for loudspeakers arranged on a surrounding sphere, 1 to 3 loudspeakers are used, whose directions must enclose the direction of the desired auditory event, the virtual source. For the directional stability of the auditory event, the angle enclosed between the loudspeakers should stay smaller than $90^\circ $.

The system of equations for VBAP [2] uses 3 loudspeaker directions and gains to model the panning direction $\varvec{\theta }$

$$\begin{aligned} \varvec{\theta }&=[\varvec{\uptheta }_1,\,\varvec{\uptheta }_2,\,\varvec{\uptheta }_3]\begin{bmatrix} \tilde{g}_1\\ \tilde{g}_2\\ \tilde{g}_3 \end{bmatrix}=\mathbf {L}\cdot \varvec{\tilde{g}}&\Rightarrow \varvec{\tilde{g}}&=\mathbf {L}^{-1}\,\varvec{\theta },&\varvec{g}&=\frac{\varvec{\tilde{g}}}{\Vert \varvec{\tilde{g}}\Vert }. \end{aligned}$$

(3.3)
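As a minimal sketch of Eq. (3.3), the following Python code solves the $3\times 3$ system by Cramer's rule and normalizes the result according to Eq. (3.2). The function names `det3` and `vbap_gains` are illustrative only and do not stem from any particular VBAP implementation; the loudspeaker triplet is assumed to be given as three unit vectors.

```python
import math

def det3(a, b, c):
    """Determinant of the 3x3 matrix whose columns are a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - b[0] * (a[1] * c[2] - a[2] * c[1])
            + c[0] * (a[1] * b[2] - a[2] * b[1]))

def vbap_gains(triplet, theta):
    """Solve L g~ = theta for one loudspeaker triplet, then normalize.

    triplet: three loudspeaker direction vectors (columns of L)
    theta:   panning direction (unit vector)
    Returns (g_tilde, g): unnormalized and unit-energy gains.
    """
    a, b, c = triplet
    d = det3(a, b, c)
    if abs(d) < 1e-12:
        raise ValueError("degenerate vector base: directions are coplanar")
    # Cramer's rule: replace one column at a time by the right-hand side.
    g_tilde = [det3(theta, b, c) / d,
               det3(a, theta, c) / d,
               det3(a, b, theta) / d]
    norm = math.sqrt(sum(x * x for x in g_tilde))
    return g_tilde, [x / norm for x in g_tilde]

# Example: one face of an octahedral layout, source between speakers 1 and 2.
face = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
s = 1.0 / math.sqrt(2.0)
g_tilde, g = vbap_gains(face, [s, s, 0.0])
print(g)
```

In this example only two of the three gains are non-zero, because the panning direction lies on the edge between the first two loudspeakers.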

Before a loudspeaker triplet can be selected, all triplets of the convex hull spanned by the given playback loudspeakers are formed. To find the triplet that needs to be activated, the list of all triplets is searched for the one with all non-negative weights, $g_1\ge 0$, $g_2\ge 0$, $g_3\ge 0$.
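The two steps, forming the hull triplets and searching for the triplet with non-negative weights, can be sketched as follows. This is a brute-force illustration under stated assumptions, not an efficient or official implementation: `hull_triplets` tests every triplet and keeps it if all remaining loudspeakers lie strictly on one side of its plane, so it assumes that no four hull loudspeakers are coplanar (rectangular faces would be skipped). All names are hypothetical.

```python
import math
from itertools import combinations

def det3(a, b, c):
    """Determinant of the 3x3 matrix whose columns are a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - b[0] * (a[1] * c[2] - a[2] * c[1])
            + c[0] * (a[1] * b[2] - a[2] * b[1]))

def sub(p, q):
    return [p[n] - q[n] for n in range(3)]

def hull_triplets(speakers, eps=1e-9):
    """Brute-force convex hull faces (assumes no coplanar quadruples)."""
    faces = []
    for i, j, k in combinations(range(len(speakers)), 3):
        a, b, c = speakers[i], speakers[j], speakers[k]
        # Signed volumes tell on which side of the plane each other point is.
        sides = [det3(sub(b, a), sub(c, a), sub(p, a))
                 for m, p in enumerate(speakers) if m not in (i, j, k)]
        if all(v > eps for v in sides) or all(v < -eps for v in sides):
            faces.append((i, j, k))
    return faces

def find_triplet(speakers, faces, theta, eps=1e-9):
    """Return (face, unit-energy gains) of the first all-non-negative triplet."""
    for face in faces:
        a, b, c = (speakers[n] for n in face)
        d = det3(a, b, c)
        if abs(d) < eps:
            continue  # degenerate vector base, cannot be inverted
        g = [det3(theta, b, c) / d, det3(a, theta, c) / d, det3(a, b, theta) / d]
        if all(x >= -eps for x in g):
            g = [max(x, 0.0) for x in g]  # clip tiny negative rounding errors
            norm = math.sqrt(sum(x * x for x in g))
            return face, [x / norm for x in g]
    return None, None

# Octahedral layout: loudspeakers at +-x, +-y, +-z.
speakers = [[1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, -1, 0], [0, 0, 1], [0, 0, -1]]
faces = hull_triplets(speakers)
n = math.sqrt(14.0)
face, gains = find_triplet(speakers, faces, [1 / n, 2 / n, 3 / n])
print(len(faces), face, gains)
```

For larger layouts, a dedicated convex hull routine would be preferable to the exhaustive test used here.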

Figure 3.1 shows the localization curve for VBAP between loudspeakers at $0^\circ $ and $45^\circ $ for a centrally seated listener and one shifted to the left. The experiment is described in [4]; results were gathered on a 1.8 m circle of 8 loudspeakers, and listeners indicated the perceived direction by naming numbers from a $5^\circ $ scale mounted on the loudspeaker setup. Black whiskers of the results ($95\%$ confidence intervals and medians) for the centrally seated listener indicate a mismatch between the slope of the perceived angles and the ideal curve of VBAP, which is represented by the dashed line; the mismatch can be understood by a better match of other exponents $\gamma $ in Fig. 2.6. The directional spread is quite narrow. For an off-center left-shifted listening position, the perceived directions are shown in terms of a $5^\circ $ histogram (gray bubbles) in Fig. 3.1. For this off-center position, it becomes clear that the closest loudspeaker dominates localization within a third of the panning directions. Still, the directional mapping seems to be monotonic with the panning angle, and the perceived direction stays within the loudspeaker pair, which is at least a robust result.

In Fig. 3.2 we see that responses from [5], in which the panning angle was adjusted to match reference loudspeakers set up in steps of $15^\circ $ on amplitude-panned lateral loudspeaker pairs, fairly match the reference directions using VBAP. The $\varvec{ r}_\mathrm {E}$ vector model (black curve) delivers a better match with only one exception at $105^\circ $. This motivates VBIP as an alternative strategy.

Vector-Base Intensity Panning (VBIP). With nearly the same set of equations, but improving the perceptual mapping by the squares of the weights, the auditory event can be controlled corresponding to the direction of the $\varvec{ r}_\mathrm {E}$ vector

$$\begin{aligned} \varvec{\theta }&=[\varvec{\uptheta }_1,\,\varvec{\uptheta }_2,\,\varvec{\uptheta }_3]\begin{bmatrix} \tilde{g}_1^2\\ \tilde{g}_2^2\\ \tilde{g}_3^2 \end{bmatrix}=\mathbf {L}\cdot \varvec{\tilde{g}}_\mathrm {sq}&\Rightarrow \varvec{\tilde{g}}_\mathrm {sq}&=\mathbf {L}^{-1}\,\varvec{\theta },&\varvec{\tilde{g}}&=\begin{bmatrix} \sqrt{\tilde{g}_{\mathrm {sq}1}}\\ \sqrt{\tilde{g}_{\mathrm {sq}2}}\\ \sqrt{\tilde{g}_{\mathrm {sq}3}} \end{bmatrix},&\varvec{ g}&=\frac{\varvec{\tilde{g}}}{\Vert \varvec{\tilde{g}}\Vert }. \end{aligned}$$

(3.4)

This formulation appears more contemporary due to the excellent match of the $\varvec{ r}_\mathrm {E}$ model to predict experimental results, as shown earlier.
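A minimal sketch of Eq. (3.4) differs from VBAP only in that the linear system is solved for the squared weights, whose square roots are then normalized. The function names are illustrative, and the check at the end assumes the usual definition of the $\varvec{ r}_\mathrm {E}$ vector as the $g_l^2$-weighted average of the loudspeaker directions.

```python
import math

def det3(a, b, c):
    """Determinant of the 3x3 matrix whose columns are a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - b[0] * (a[1] * c[2] - a[2] * c[1])
            + c[0] * (a[1] * b[2] - a[2] * b[1]))

def vbip_gains(triplet, theta, eps=1e-9):
    """Unit-energy VBIP gains for one loudspeaker triplet."""
    a, b, c = triplet
    d = det3(a, b, c)
    if abs(d) < 1e-12:
        raise ValueError("degenerate vector base: directions are coplanar")
    # Solve L g_sq = theta for the squared weights (Cramer's rule).
    g_sq = [det3(theta, b, c) / d, det3(a, theta, c) / d, det3(a, b, theta) / d]
    if min(g_sq) < -eps:
        raise ValueError("theta lies outside this triplet")
    g_tilde = [math.sqrt(max(x, 0.0)) for x in g_sq]
    norm = math.sqrt(sum(x * x for x in g_tilde))
    return [x / norm for x in g_tilde]

# Source closer to loudspeaker 1 than to loudspeaker 2.
face = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
n = math.sqrt(5.0)
theta = [2.0 / n, 1.0 / n, 0.0]
g = vbip_gains(face, theta)

# r_E direction (assumed definition: energy-weighted mean of the directions).
r_e = [sum(g[l] ** 2 * face[l][k] for l in range(3)) for k in range(3)]
print(g, r_e)
```

In this example the squared gains are in the ratio 2:1, so the energy-weighted direction points exactly along the panning direction.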

Non-smooth VBAP/VBIP width. If one of the loudspeakers is exactly aligned with the virtual source for either VBAP or VBIP, e.g. $\varvec{\uptheta }_1=\varvec{\theta }$, the resulting gains are $g_{1,2,3}=(1,0,0)$, and therefore only 1 loudspeaker will be activated. For a virtual source midway between 2 loudspeakers, e.g. $\varvec{\uptheta }_1+\varvec{\uptheta }_2\propto \varvec{ \theta }$, we obtain $g_{1,2,3}=(1,1,0)/\sqrt{2}$, and only 2 loudspeakers will be active. This behavior yields audible variation in the perceived width and coloration. For virtual source movements that cross a common edge of neighboring loudspeaker triplets, there are often unexpectedly pronounced jumps.

Figure 3.3 illustrates the variation of the perceived width with VBAP on an octahedral arrangement of loudspeakers in the directions $\varvec{\uptheta }_l^\mathrm {T}\in \{[\pm 1,0,0],[0,\pm 1,0], [0,0,\pm 1]\}$.
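The width variation on the octahedral layout can be made tangible with a small numerical sketch. It assumes the usual definition of the $\varvec{ r}_\mathrm {E}$ vector as the $g_l^2$-weighted average of the loudspeaker directions, whose length shrinks as the energy spreads over more loudspeakers; the function names are illustrative.

```python
import math

def r_e_length(gains, speakers):
    """Length of the r_E vector (assumed: energy-weighted mean direction)."""
    energy = sum(g * g for g in gains)
    r_e = [sum(g * g * s[n] for g, s in zip(gains, speakers)) / energy
           for n in range(3)]
    return math.sqrt(sum(x * x for x in r_e))

# One triangular face of the octahedron: loudspeakers at +x, +y, +z.
face = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

cases = {
    "on a loudspeaker": [1.0, 0.0, 0.0],
    "on an edge": [1.0 / math.sqrt(2.0)] * 2 + [0.0],
    "face center": [1.0 / math.sqrt(3.0)] * 3,
}
for name, gains in cases.items():
    print(f"{name}: {r_e_length(gains, face):.4f}")
# on a loudspeaker: 1.0000
# on an edge: 0.7071
# face center: 0.5774
```

A shorter $\varvec{ r}_\mathrm {E}$ vector indicates a wider energy spread, so the three cases correspond to clearly different widths along a single panning trajectory.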

3.2 Multiple-Direction Amplitude Panning (MDAP)

In order to adjust the $\varvec{ r}_\mathrm {E}$ or $\varvec{ r}_\mathrm {V}$ vector not only directionally but also in length, and thus to control the number of active loudspeakers for moving sound objects, Pulkki extended VBAP to multiple-direction amplitude panning (MDAP [3]). Hereby not only the perceived width but also the coloration can be held constant.

Direction spread in MDAP. MDAP employs more than one virtual source distributed around the panning direction as a directional spreading strategy. For horizontal loudspeaker rings, MDAP can consist of a pair of virtual VBAP sources at the angle $\pm \alpha $ around the panning direction $\varphi _\mathrm {s}\pm \alpha $. In a ring of $\mathrm {L}$ loudspeakers with uniform angular spacing of $\frac{360^\circ }{\mathrm {L}}$, the angle $\alpha =90\%\frac{180^\circ }{\mathrm {L}}$ yields optimally flat width for all panning directions, as shown for $\mathrm {L}=6$ in comparison between MDAP and VBAP in Fig. 3.4. Moreover, MDAP seems to equalize the aiming of the $\varvec{ r}_\mathrm {E}$ measure to the aiming of the $\varvec{ r}_\mathrm {V}$ measure, which is the one controlled by VBAP and MDAP.
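The horizontal case can be sketched numerically. The code below assumes that MDAP sums the unit-energy VBAP gains of the two virtual sources at $\varphi _\mathrm {s}\pm \alpha $ and re-normalizes the sum by Eq. (3.2); it also assumes the usual energy-weighted definition of the $\varvec{ r}_\mathrm {E}$ vector. Function names are illustrative, and neighboring loudspeakers are assumed to be less than $180^\circ $ apart.

```python
import math

def unit(deg):
    r = math.radians(deg)
    return math.cos(r), math.sin(r)

def vbap_2d(phi_deg, ring_deg, eps=1e-9):
    """Unit-energy VBAP gains on a horizontal ring (angles in degrees)."""
    order = sorted(range(len(ring_deg)), key=lambda n: ring_deg[n] % 360.0)
    tx, ty = unit(phi_deg)
    gains = [0.0] * len(ring_deg)
    for m in range(len(order)):
        i, j = order[m], order[(m + 1) % len(order)]
        (ax, ay), (bx, by) = unit(ring_deg[i]), unit(ring_deg[j])
        det = ax * by - bx * ay
        if abs(det) < eps:
            continue
        gi = (tx * by - bx * ty) / det  # Cramer's rule for the 2x2 system
        gj = (ax * ty - tx * ay) / det
        if gi >= -eps and gj >= -eps:
            gi, gj = max(gi, 0.0), max(gj, 0.0)
            norm = math.hypot(gi, gj)
            gains[i], gains[j] = gi / norm, gj / norm
            return gains
    raise ValueError("no enclosing loudspeaker pair found")

def mdap_2d(phi_deg, ring_deg, alpha_deg):
    """Sum of two VBAP sources at phi +- alpha, re-normalized to unit energy."""
    lo = vbap_2d(phi_deg - alpha_deg, ring_deg)
    hi = vbap_2d(phi_deg + alpha_deg, ring_deg)
    g = [a + b for a, b in zip(lo, hi)]
    norm = math.sqrt(sum(x * x for x in g))
    return [x / norm for x in g]

def r_e_length(gains, ring_deg):
    energy = sum(g * g for g in gains)
    x = sum(g * g * unit(d)[0] for g, d in zip(gains, ring_deg)) / energy
    y = sum(g * g * unit(d)[1] for g, d in zip(gains, ring_deg)) / energy
    return math.hypot(x, y)

ring = [60.0 * n for n in range(6)]   # L = 6, uniform spacing
alpha = 0.9 * 180.0 / len(ring)       # 27 degrees
for phi in (0, 15, 30):
    print(phi,
          round(r_e_length(vbap_2d(phi, ring), ring), 3),
          round(r_e_length(mdap_2d(phi, ring, alpha), ring), 3))
```

With these assumptions, the $\varvec{ r}_\mathrm {E}$ length of VBAP varies between 1 on a loudspeaker and about 0.87 midway between two loudspeakers, whereas the MDAP values stay close to 0.87 for all three panning directions.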

Listening experiment results. Experiments from [4] in Fig. 3.5 investigate the perceived width for two possible horizontal loudspeaker ring layouts, both with $45^\circ $ spacings, but one starting at $0^\circ $ (“0”) and the other at $22.5^\circ $ (“1/2”). Widths of MDAP with a direction spread of $\alpha =22.5^\circ $ are perceived as similar on both ring layouts, while VBAP yields significantly narrower results for panning onto the frontal loudspeaker in the “0” layout, which activates only a single loudspeaker. Note that VBAP1/2 and MDAP1/2 are identical with $\alpha =22.5^\circ $ and were treated as one condition.

Moreover, a more constant width measure also describes a more constant number of activated loudspeakers while panning. Figure 3.6 shows that listeners can hear the difference in coloration changes with rotatory panning using pink noise and a constant speed. The figure shows that coloration fluctuations of MDAP are always clearly smaller than with VBAP on similar loudspeaker rings. Moreover, coloration changes are more pronounced on rings of 16 loudspeakers than with 8 loudspeakers, which is explained by their faster fluctuation.

Figure 3.7 shows the results from [6] for a central and a left-shifted off-center listening position when using MDAP on an 8-channel ring of loudspeakers. At the central listening position, the perceived directional spread around the loudspeaker positions $0^\circ $ and $45^\circ $ increases as expected, as indicated by the whiskers ($95\%$ confidence intervals and medians). Moreover, the spread of MDAP seems to slightly decrease the slope mismatch between the underlying VBAP algorithm and the perceptual curve around the $22.5^\circ $ direction.

Although MDAP enforces a larger number of active loudspeakers, its localization is still similarly robust as that of VBAP, also at off-center listening positions. The perceived direction can be assumed to stay at least confined within a strictly directionally limited activation of loudspeakers. Correspondingly, the gray $5^\circ $-histogram bubbles of Fig. 3.7 indicate the perceived directions when the listener is located left-shifted off-center. While localization is slightly attracted by the closer loudspeaker at $0^\circ $, the larger spread causes a more monotonic outcome that is less split than with VBAP in Fig. 3.1.

For a more exhaustive study, Frank used 6 loudspeakers on the horizon and gave the task to his listeners to align an MDAP pink-noise direction to match acoustical references every $15^\circ $ (harmonic complex) by adjusting the panning direction [5]. The results in Fig. 3.8 contain 24 answers from 6 subjects responding four times (by repetition and symmetrization). The black line shows directions indicated by the $\varvec{ r}_\mathrm {E}$ vector model for the tested conditions. Obviously, the confidence intervals of the adjusted MDAP angles match quite well both the reference directions and predictions by the $\varvec{ r}_\mathrm {E}$ vector model, in particular for angles between $0^\circ $ and $90^\circ $ (except $75^\circ $) for the ring starting at $0^\circ $, and from $0^\circ $ to $120^\circ $ for the $30^\circ $-rotated ring. The mismatch is much less than $4^\circ $ for panning angles $\le 90^\circ $.

MDAP with 3D loudspeaker layouts. For more arbitrary 3D loudspeaker arrangements, the multiple directions could be arranged ring-like, see Fig. 3.9. This arrangement uses 8 additional virtual sources inclined by $45^\circ $ with respect to the main virtual source.
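One possible way to construct such a ring of additional directions is sketched below. It is only an assumption about the geometry described above, namely 8 unit vectors evenly distributed on a cone of $45^\circ $ half-angle around the main direction; the exact arrangement of Fig. 3.9 is not reproduced, and the function name and the choice of helper axis are illustrative.

```python
import math

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def spread_directions(theta, incline_deg=45.0, count=8):
    """Unit vectors on a cone of half-angle incline_deg around theta."""
    theta = normalize(theta)
    # Pick a helper axis that is safely non-parallel to theta.
    helper = [0.0, 0.0, 1.0] if abs(theta[2]) < 0.9 else [1.0, 0.0, 0.0]
    u = normalize(cross(helper, theta))   # first axis orthogonal to theta
    v = cross(theta, u)                   # second axis, completes the basis
    c = math.cos(math.radians(incline_deg))
    s = math.sin(math.radians(incline_deg))
    dirs = []
    for k in range(count):
        beta = 2.0 * math.pi * k / count
        dirs.append([c * theta[n]
                     + s * (math.cos(beta) * u[n] + math.sin(beta) * v[n])
                     for n in range(3)])
    return dirs

main = [0.0, 0.0, 1.0]
ring = spread_directions(main)
print(len(ring), ring[0])
```

Each of these directions, together with the main direction, could then be panned by VBAP, with the summed gains re-normalized by Eq. (3.2).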

At least mathematically, however, the amplitudes and angles of the virtual sources need to be post-optimized in order to accurately match the desired $\varvec{ r}_\mathrm {V}$ or $\varvec{ r}_\mathrm {E}$ vector in direction and length on irregular loudspeaker arrangements, cf. [7]. Non-uniform $\varvec{ r}_\mathrm {V}$ vector lengths of the individual virtual sources involved cause a distorted resultant vector. In particular, their superposition is distorted towards those of the multiple virtual source directions with the longest $\varvec{ r}_\mathrm {V}$ vectors. Epain’s article [7] proposes an optimization that retrieves the optimal orientation and weighting of the multiple virtual sources for every panning direction.

3.3 Challenges in 3D Triangulation: Imaginary Loudspeaker Insertion and Downmix

Surrounding loudspeaker hemispheres typically exhibit the following two problems:

Loudspeaker rectangles at the sides of standard setups with height (ITU-R BS.2051-0 [8]) can be decomposed ambiguously into triangles at the sides, back, and top. This can yield noticeable ranges within loudspeaker quadrilaterals, in which auditory events are unexpectedly created by just two of the loudspeakers.
Signals of virtual sources below the horizon usually get lost.

The problem of unfavorable or ambiguous triangulations into loudspeaker triplets appears subtle; however, it can cause clearly audible deficiencies, especially when ambiguous triangulation yields asymmetric behavior between left and right, e.g., for the top, rear, and lateral directions, where we would manually define loudspeaker quadrilaterals instead of triangles, see [9].

As surrounding loudspeaker hemispheres are typically open by $180^\circ $ towards below, VBAP/VBIP/MDAP is numerically unstable and theoretically useless for any panning direction below the horizon. Although the absence of loudspeakers below renders downwards amplitude panning theoretically infeasible, it is still reasonable to preserve signals of virtual sources that are meant for playback on spherically surrounding setups.

In the case of the asymmetric loudspeaker rectangles, see Fig. 3.10, and a missing lower hemisphere of surrounding loudspeakers, the insertion of one or more imaginary loudspeakers in the vertical direction (nadir) or in the middle of the rectangle (the average direction vector) has proven to be a useful strategy, e.g. in [10]. Any imaginary loudspeaker aims at either extending the admissible triangulation towards open parts of the surround loudspeaker setup, or covering parts with potential asymmetry, see [9].

The signal of the imaginary loudspeaker can be dealt with in two ways:

it can be dismissed; e.g., for an imaginary loudspeaker below at nadir, this would still yield a signal near the closest horizontal pair of loudspeakers for virtual sources panned to below-horizontal directions, unless panned exactly to nadir;
it can be down-mixed to the neighboring $\mathrm {M}$ loudspeakers by a factor of $\frac{1}{\sqrt{\mathrm {M}}}$, or less as in Fig. 3.10; alternatively, for control yielding perfectly flat E measures, the resulting down-mixed gain vector can be re-normalized by Eq. (3.2).
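The down-mix option can be sketched in a few lines. The function below is a hypothetical helper, not taken from any of the cited tools: it assumes that the gain vector contains one entry for the imaginary loudspeaker, distributes that entry to the given neighbors with the factor $\frac{1}{\sqrt{\mathrm {M}}}$, and optionally re-normalizes according to Eq. (3.2).

```python
import math

def downmix_imaginary(gains, imaginary, neighbors, renormalize=True):
    """Distribute the imaginary loudspeaker's gain to its M real neighbors.

    gains:     gain vector including the imaginary loudspeaker
    imaginary: index of the imaginary loudspeaker in gains
    neighbors: indices of the M real loudspeakers that receive its signal
    Returns a gain vector of the same length with a zero at `imaginary`.
    """
    out = list(gains)
    share = out[imaginary] / math.sqrt(len(neighbors))  # factor 1/sqrt(M)
    for n in neighbors:
        out[n] += share
    out[imaginary] = 0.0
    if renormalize:
        norm = math.sqrt(sum(g * g for g in out))
        if norm > 0.0:
            out = [g / norm for g in out]  # Eq. (3.2)
    return out

# Four horizontal loudspeakers (indices 0..3) and an imaginary one at nadir (4).
print(downmix_imaginary([0.0, 0.0, 0.0, 0.0, 1.0], 4, [0, 1, 2, 3],
                        renormalize=False))
print(downmix_imaginary([0.6, 0.0, 0.0, 0.0, 0.8], 4, [0, 1, 2, 3]))
```

For a source panned exactly to the imaginary loudspeaker, the factor $\frac{1}{\sqrt{\mathrm {M}}}$ alone already preserves unit energy; for other directions the down-mixed gains add to those already present, which is where the re-normalization matters.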

3.4 Practical Free-Software Examples

3.4.1 VBAP/MDAP Object for Pd

There is a classic VBAP/MDAP implementation by Ville Pulkki that is available as an external in Pure Data (Pd). The example in Fig. 3.11 illustrates its use together with some other useful externals in Pd. Software requirements are:

pure-data (free, http://puredata.info/downloads/pure-data)
iemmatrix (free download within pure-data)
zexy (free download within pure-data)
vbap (free download within pure-data).

3.4.2 SPARTA Panner Plugin

The SPARTA Panner under http://research.spa.aalto.fi/projects/sparta_vsts/plugins.html provides an interface for vector-base amplitude panning (VBAP) and multiple-direction amplitude panning (MDAP), see Fig. 3.12, with frequency-dependent loudness normalization by $\root p \of {\sum _{l=1}^\mathrm {L}g^p_l}$ adjustable to the listening conditions, see Laitinen [11].

The parameter DTT can be varied between 0 (standard, frequency-independent VBAP normalization, i.e. diffuse-field normalization), 0.5 for typical listening environments, and 1 for the anechoic chamber. The plugin allows the azimuth and elevation angles of multiple panning directions (if more than one input signal is used) and of the playback loudspeakers to be either entered manually or imported from and exported to preset files. Of course, all panning directions can be time-varying and can be moved by mouse, automation, or controls.
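The $p$-norm normalization mentioned above can be illustrated with a generic sketch. It is not the plugin's code: the exponent `p` is passed in directly, and no mapping from the DTT parameter or from frequency to `p` is assumed here. With $p=2$ the expression reduces to the energy normalization of Eq. (3.2), and with $p=1$ the gains sum to one.

```python
import math

def normalize_p(gains, p):
    """Divide the gains by their p-norm, (sum_l g_l^p)^(1/p)."""
    norm = sum(abs(g) ** p for g in gains) ** (1.0 / p)
    return [g / norm for g in gains]

raw = [0.5, 0.3, 0.2]
g2 = normalize_p(raw, 2.0)   # unit energy, as in Eq. (3.2)
g1 = normalize_p(raw, 1.0)   # unit amplitude sum

print(sum(g * g for g in g2))
print(sum(g1))
```

Intermediate exponents between 1 and 2 yield gains between these two extremes, which is what makes the exponent a convenient handle for adapting the normalization to the listening conditions.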

References

[1] V. Pulkki, Spatial sound generation and perception by amplitude panning techniques, Ph.D. dissertation, Helsinki University of Technology (2001)
[2] V. Pulkki, Virtual sound source positioning using vector base amplitude panning. J. Audio Eng. Soc. 45(6), 456–466 (1997)
[3] V. Pulkki, Uniform spreading of amplitude panned virtual sources, in Proceedings of the Applications of Signal Processing to Audio and Acoustics (WASPAA), IEEE (New Paltz, NY, 1999)
[4] M. Frank, Phantom sources using multiple loudspeakers in the horizontal plane, Ph.D. Thesis, Kunstuni Graz (2013)
[5] M. Frank, F. Zotter, Extension of the generalized tangent law for multiple loudspeakers, in Fortschritte der Akustik - DAGA (Kiel, 2017)
[6] M. Frank, Source width of frontal phantom sources: perception, measurement, and modeling. Arch. Acoust. 38(3), 311–319 (2013)
[7] N. Epain, C.T. Jin, F. Zotter, Ambisonic decoding with constant angular spread. Acta Acust. United Acust. 100(5) (2014)
[8] ITU, Recommendation BS.2051: Advanced sound system for programme production (ITU, 2018)
[9] M. Romanov, M. Frank, F. Zotter, T. Nixon, Manipulations improving amplitude panning on small standard loudspeaker arrangements for surround with height, in Tonmeistertagung (2016)
[10] F. Zotter, M. Frank, All-round ambisonic panning and decoding. J. Audio Eng. Soc. (2012)
[11] M.-V. Laitinen, J. Vilkamo, K. Jussila, A. Politis, V. Pulkki, Gain normalization in amplitude panning as a function of frequency and room reverberance, in Proceedings 55th AES Conference (Helsinki, 2014)

Author information

Authors and Affiliations

Institute of Electronic Music and Acoustics, University of Music and Performing Arts, Graz, Austria
Franz Zotter & Matthias Frank


Corresponding author

Correspondence to Franz Zotter .

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.


Cite this chapter

Zotter, F., Frank, M. (2019). Amplitude Panning Using Vector Bases. In: Ambisonics. Springer Topics in Signal Processing, vol 19. Springer, Cham. https://doi.org/10.1007/978-3-030-17207-7_3


DOI: https://doi.org/10.1007/978-3-030-17207-7_3
Published: 01 May 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-17206-0
Online ISBN: 978-3-030-17207-7
eBook Packages: Engineering, Engineering (R0)
