The basic Clavinet string model was presented first in  and described in . It consists of a digital waveguide loop structure  in which a fractional delay filter , a loss filter , a ripple filter , and a dispersion filter  are cascaded. This structure is fed by an attack excitation signal, generated on-line by a signal model dependent on an estimate of the virtual tangent velocity. Furthermore, the note decay is modeled by increasing the length of the delay line and increasing losses, i.e., decreasing loop gain. The string model is completed by several beating equalizers  modulating the gain of the first partials.
More details of this model will now be described.
3.1 String model
The Clavinet pitch is very stable during the sustain phase of the tone, and thus, there is no need to change the overall DWG delay during sustain. Partial decay time analysis of Clavinet tones reveals a rippled $T_{60}$ pattern, which is also seen in microphone-recorded tones. This can be easily reproduced by the use of a so-called ripple filter, which has been used for the emulation of other instruments as well, such as the harpsichord  and the piano .
The ripple filter adds a feedforward path with unity gain (which can be incorporated into the delay line) and adds a small amount of the direct signal to it with gain r. The analytic expression is the following:
$$H_r(z) = r + z^{-R},$$
where r is a small coefficient and R is the length of the delayed path introduced by this filter. The effect of the ripple filter is shown in Figure 8 compared to the $T_{60}$ of a real tone. The gains at different partials or, conversely, the $T_{60}$ values differ from one another, enabling the emulation of the real tone behavior seen in Figure 8. Although from a visual inspection of the figures the fit between real and synthesized data may not seem close, from a perceptual standpoint, it must be noted that differences of several seconds in the $T_{60}$ times, i.e., of several decibels in the magnitude response for a given partial, do not result in a perceivable change, as they fall beneath audibility thresholds, as shown in  for the magnitude response of a loss filter in a DWG model.
By increasing or decreasing the r coefficient, the ripple effect is increased or decreased; by changing R, the width of the ripples is changed. R is in turn calculated from the parameter $R_{\mathrm{rate}}$ as follows:
$$R = \mathrm{round}(R_{\mathrm{rate}} L),$$
and thus, the total delay line of length L is now split into two sections of length R and L−R. To maintain closed loop stability, the overall gain must be kept below unity, i.e., g+|r|<1, with g being the loss filter gain.
The ripple filter coefficients can be adjusted in order to match those observed in recorded tones. The ripple parameters in Figure 8, for instance, are $R_{\mathrm{rate}}$=1/2 and r=−0.006. In the model, $R_{\mathrm{rate}}$ and r are randomly chosen at each keystroke in the ranges 1/3 to 1/2 and −0.006 to −0.001, respectively, according to observations.
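The parameter selection and the ripple filter itself can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the filter has the form y[n] = r·x[n] + x[n−R] and that R is obtained by rounding $R_{\mathrm{rate}}$·L to an integer. The function names are hypothetical.

```python
import random

def ripple_delay(r_rate, total_delay):
    """Length R of the ripple path, assumed here to be round(R_rate * L)."""
    return int(round(r_rate * total_delay))

def draw_ripple_params(total_delay, rng):
    """Draw R_rate in [1/3, 1/2] and r in [-0.006, -0.001] for one keystroke."""
    r_rate = rng.uniform(1.0 / 3.0, 1.0 / 2.0)
    r = rng.uniform(-0.006, -0.001)
    return ripple_delay(r_rate, total_delay), r

def is_stable(g, r):
    """Stability condition stated in the text: g + |r| < 1."""
    return g + abs(r) < 1.0

def ripple_filter(x, r, R):
    """Apply y[n] = r * x[n] + x[n - R] to a list of samples."""
    return [r * x[n] + (x[n - R] if n >= R else 0.0) for n in range(len(x))]
```

For example, `draw_ripple_params(200, random.Random(0))` returns one (R, r) pair for a hypothetical loop of 200 samples, and `is_stable(g, r)` can be checked before the pair is used in the loop.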
The design of the dispersion filter follows the algorithm described in a. The algorithm achieves the desired B coefficient in a frequency band specified by the user. The authors suggest that this be at least 10 times the fundamental frequency. The B coefficients, the bandwidth (BW), and the β parameters used for every key are linearly interpolated from the values in Table 1.
The Clavinet tones may contain beating partials as shown in Figure 7. An efficient and easily tunable method to emulate this is to cascade a so-called beating equalizer, proposed in  with the DWG loop.
The beating equalizer is based on the Regalia-Mitra tunable filters  but adds a modulating gain at the output stage K[n], where n is the time index.
In brief, such a device is a band-pass filter with varying gain at the resonating frequency. The gain can vary according to an arbitrary function of time, but for the emulation of Clavinet tones, it has been decided to use a | cos(2π f n)| law, which well approximates the behavior seen in Figure 7 in Section 2.3.4. The modulated gain is the following:
In order to modulate M partials, M beating equalizers are needed. It was shown, however, by informal listening tests that it is difficult to perceive the effect of more than three beating equalizers working at the same time.
The computational cost of this device is low, consisting of a biquad filter plus the overhead of five operations per sample (three additions and two multiplications, as can be seen in  and Figure 2).
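A beating equalizer of this kind can be sketched as below. The filter follows the standard Regalia-Mitra peak structure (an allpass section combined with the direct signal), with the peak gain replaced by a time-varying sequence. The modulation law in `beating_gain`, including its `depth` parameter, is a hypothetical choice built around the | cos | shape mentioned above and is not taken from the source. All names are illustrative.

```python
import math

def beating_equalizer(x, f0, bandwidth, fs, gain):
    """Regalia-Mitra style peak filter whose peak gain gain[n] varies over time.

    y[n] = 0.5 * (x[n] + a[n]) + 0.5 * gain[n] * (x[n] - a[n]),
    where a[n] is x[n] filtered by a second-order allpass centred on f0.
    """
    k1 = -math.cos(2.0 * math.pi * f0 / fs)
    t = math.tan(math.pi * bandwidth / fs)
    k2 = (1.0 - t) / (1.0 + t)
    c = k1 * (1.0 + k2)
    x1 = x2 = a1 = a2 = 0.0  # allpass state
    y = []
    for n, xn in enumerate(x):
        # Allpass A(z) = (k2 + c z^-1 + z^-2) / (1 + c z^-1 + k2 z^-2)
        an = k2 * xn + c * x1 + x2 - c * a1 - k2 * a2
        x2, x1 = x1, xn
        a2, a1 = a1, an
        y.append(0.5 * (xn + an) + 0.5 * gain[n] * (xn - an))
    return y

def beating_gain(num_samples, f_beat, fs, depth):
    """Hypothetical law: K[n] = 1 + depth * |cos(2 pi f_beat n / fs)|."""
    return [1.0 + depth * abs(math.cos(2.0 * math.pi * f_beat * n / fs))
            for n in range(num_samples)]
```

With a constant gain of 1 the structure passes the signal unchanged, and with a constant gain K a sinusoid at f0 is scaled by K, which is why modulating the gain produces beating on a single partial.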
3.2 Excitation model
The string model described so far can be fed at attack time with an excitation signal of some kind. In the proposed model, the excitation signal consists of a smooth pulse similar to those seen in low- to mid-range tones. The pulse is made by joining an attack ramp with its reverse. The ramp is obtained by fitting the following polynomial to some pulses extracted from recorded tones:
The polynomial coefficients were calculated from several least-squares fits to portions of signals extracted from the recordings. These signals have a smooth triangular shape and represent the pickup output from the tangent hitting the string. A polynomial of order P=6 has been obtained, with coefficients in descending order: −2.69E−8, 2.53E−6, −9.54E−5, 1.74E−3, −1.44E−2, 4.50E−2, −3.50E−2. This signal is scaled by a gain and stretched by interpolation according to the player dynamic, making it shorter or longer. To calculate the pulse length in samples N, the average key velocity v and the initial distance d between the tangent and stud are required; thus,
$$N = f_s \frac{d}{v},$$
where $f_s$ is the sampling frequency. The average key velocity normally varies linearly in the range 1 to 4 m/s and is mapped to integers from 1 to 127, as per the Musical Instrument Digital Interface (MIDI) standard. Figure 12 shows piano and forte excitation signals calculated with our method.
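The generation of the excitation pulse can be sketched as follows. The sketch assumes that the pulse length is the travel time d/v expressed in samples and rounded, that MIDI velocities 1 to 127 map linearly onto 1 to 4 m/s, and that N is split evenly between the ramp and its mirror image. The ramp is passed in as a list of samples because the sampling grid of the fitted polynomial is not specified here. All function names are hypothetical.

```python
def midi_to_key_velocity(midi_velocity, v_min=1.0, v_max=4.0):
    """Map a MIDI velocity (1..127) linearly to a key velocity in m/s."""
    return v_min + (v_max - v_min) * (midi_velocity - 1) / 126.0

def pulse_length(fs, distance, velocity):
    """Pulse length in samples, N = fs * d / v, rounded to an integer."""
    return max(1, int(round(fs * distance / velocity)))

def resample_linear(ramp, num_samples):
    """Stretch or shrink a ramp to num_samples points by linear interpolation."""
    if num_samples == 1:
        return [ramp[-1]]
    out = []
    for n in range(num_samples):
        pos = n * (len(ramp) - 1) / (num_samples - 1)
        i = min(int(pos), len(ramp) - 2)
        frac = pos - i
        out.append((1.0 - frac) * ramp[i] + frac * ramp[i + 1])
    return out

def excitation_pulse(ramp, num_samples, gain):
    """Join a stretched attack ramp with its time reverse and scale it."""
    rising = resample_linear(ramp, max(2, num_samples // 2))
    return [gain * s for s in rising + rising[::-1]]
```

As an illustration with a hypothetical tangent distance of 4 mm, `pulse_length(44100, 0.004, midi_to_key_velocity(64))` gives the pulse length for a mid-range keystroke; a faster key velocity yields a shorter pulse.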
The pulse signals seen in Clavinet tones have a smooth triangular shape and represent the pickup output from the tangent hitting the string. Most of the recorded tones exhibit a similar pulse at the beginning of the tone, hence making this a good approximation for the string excitation produced by the tangent in most cases. Because the signal extracted from the pickups is the time derivative of the string displacement at the pickup position, when using its approximation as an excitation, it must be ensured that the wave variables in the digital waveguide are also time-differentiated approximations of the displacement of the Clavinet string. This allows differentiation to be avoided when emulating the effect of pickups if these are linear devices. With nonlinear pickups (as is the case here), integration must be performed before the nonlinear stage.
3.3 Model for pickups
The proposed pickup model includes a comb effect dependent on the pickup position, the magnetic field distance nonlinearity, and the emulation of the pickup selector switches. The traveling waves reflected at the string termination are transduced by the pickups, thus creating a comb characteristic in frequency. This effect can be emulated by a comb filter with negative gain (ideally −1 for a stiff string) and a delay equal to the time needed for the wave to propagate from the pickup position to the string termination and back . As discussed in Section 2.3.5, string dispersion also affects the position of the comb notches. In , the amount of dispersion is shown to be equal to the string inharmonicity itself. A duplicate of the dispersion filter used in the string model could be added to the comb feedforward path to obtain this secondary effect. However, to achieve a trade-off between computational efficiency and sound quality, the duplicate filter has not been implemented as it would increase the computational cost by 25%.
The comb filter needs two parameters to be calculated: the delay in samples and the gain. The latter has been set to −1 for both pickups, as the string termination is assumed to only invert the incoming wave. The former can be calculated with a simple proportion after a direct measurement of the pickup’s distance from the string termination: the ratio of the pickup distance to the physical string length is multiplied by the total delay line length L.
The overall frequency response has not been modeled, as it is perceptually flat (as discussed in Section 2.3.5).
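The pickup comb effect can be sketched as a feedforward comb filter. The sketch assumes the delay is the total loop delay L scaled by the ratio of the pickup distance to the string length, then rounded to an integer. Function names and example values are hypothetical.

```python
def pickup_delay(total_delay, pickup_distance, string_length):
    """Comb delay in samples: L scaled by pickup distance / string length."""
    return int(round(total_delay * pickup_distance / string_length))

def pickup_comb(x, delay, gain=-1.0):
    """Feedforward comb y[n] = x[n] + gain * x[n - delay]."""
    return [x[n] + (gain * x[n - delay] if n >= delay else 0.0)
            for n in range(len(x))]
```

With the gain fixed at −1, the notches fall at multiples of fs/delay, so moving the pickup closer to the termination spreads the notches further apart in frequency.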
The pickup nonlinearity reported in Section 2.3.5 can be implemented as an exponential or an N th-order polynomial. The latter has a lower computational cost, and it can be computed on modern DSP architectures with N−1 consecutive multiply-accumulate operations and N products following Horner’s method . The polynomial coefficients used are reported in Table 2.
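Horner's method evaluates a polynomial with one multiply-accumulate per coefficient, as in the minimal sketch below. The coefficients in the usage note are placeholders, not the values of Table 2.

```python
def horner(coeffs, x):
    """Evaluate a polynomial with coefficients in descending order at x."""
    acc = 0.0
    for c in coeffs:
        acc = acc * x + c  # one multiply-accumulate per coefficient
    return acc
```

For instance, `horner([2.0, 3.0, 4.0], 2.0)` evaluates 2x² + 3x + 4 at x = 2 as (2·2 + 3)·2 + 4, without computing any power of x explicitly.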
Figure 13 compares the exponential fit to the simulated data and the polynomial fit. The exponential fit has a slightly lower root mean square error value, providing a better approximation to the pickup nonlinearity. The polynomial fit, however, scales better to embedded devices for its lower computational cost and higher precision.
Since the excitation is a velocity wave and the nonlinearity applies to a displacement wave, the signal must be integrated before the nonlinearity. For real-time scenarios, a leaky integrator such as the one proposed in  can be used. After the nonlinear block, differentiation must be applied to emulate that performed by the pickups . A simple first-order digital differentiator as in  is sufficient and suited for real-time operation.
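The integrate, shape, differentiate chain can be sketched as below. The leak coefficient of 0.999 and the function names are assumptions for illustration, the nonlinearity is passed in as a callable, and sample-rate scaling of the integrator and differentiator is omitted for brevity.

```python
def leaky_integrator(x, leak=0.999):
    """y[n] = x[n] + leak * y[n-1]; a leak slightly below 1 limits DC build-up."""
    y, prev = [], 0.0
    for xn in x:
        prev = xn + leak * prev
        y.append(prev)
    return y

def differentiator(x):
    """First-order difference y[n] = x[n] - x[n-1]."""
    return [x[n] - (x[n - 1] if n > 0 else 0.0) for n in range(len(x))]

def pickup_nonlinear_stage(velocity, nonlinearity, leak=0.999):
    """Integrate to displacement, apply the nonlinearity, differentiate back."""
    displacement = leaky_integrator(velocity, leak)
    shaped = [nonlinearity(d) for d in displacement]
    return differentiator(shaped)
```

With an identity nonlinearity and `leak=1.0`, the stage returns its input unchanged, which is a convenient sanity check that the integrator and differentiator cancel each other.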
3.4 Model for the amplifier
Analyses from Section 2.3.6 suggested that the amplifier and the tone switch frequency response can be modeled in the digital domain with simple infinite impulse response (IIR) digital filters, keeping the computational cost low. The tone stack consists of four first- or second-order filters which can be bypassed by a switch. Details about the filters are provided in Table 3. For emulation in the digital domain, the impedance Z(s) is calculated for each filter in the Laplace domain and then transformed by the bilinear transform into a digital transfer function H(z). The parallel impedances Z(s) in the analog domain can hence be emulated by cascading the H(z) filters in the digital domain.
As an example, Figure 14 compares one of the tone switch combinations and its digital filter implementation.
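The discretization step can be sketched for a first-order section as follows. The example covers only the first-order case and no frequency prewarping; the component values of Table 3 are not used, and the function names are hypothetical.

```python
def bilinear_first_order(b1, b0, a1, a0, fs):
    """Map H(s) = (b1 s + b0) / (a1 s + a0) to (B0 + B1 z^-1) / (1 + A1 z^-1)."""
    k = 2.0 * fs  # substitution s -> k (1 - z^-1) / (1 + z^-1)
    norm = a0 + a1 * k
    B0 = (b0 + b1 * k) / norm
    B1 = (b0 - b1 * k) / norm
    A1 = (a0 - a1 * k) / norm
    return (B0, B1), (1.0, A1)

def cascade(sections, x):
    """Run x through a list of first-order sections in series."""
    for (B0, B1), (_, A1) in sections:
        y, x1, y1 = [], 0.0, 0.0
        for xn in x:
            yn = B0 * xn + B1 * x1 - A1 * y1
            x1, y1 = xn, yn
            y.append(yn)
        x = y
    return x
```

As a usage example, a hypothetical RC low-pass H(s) = 1/(RCs + 1) with RC = 1e-4 s is obtained with `bilinear_first_order(0.0, 1.0, 1e-4, 1.0, 44100.0)`. Bypassing a tone switch then corresponds to leaving its section out of the list passed to `cascade`.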
Finally, the frequency response of the amplifier excluding the tone stack is emulated with digital shelf filters corresponding to the data provided in Section 2.3.6. A reliable estimate of the nonlinearity introduced by the transistors could not be obtained by simulation, as a faithful model of the specific transistors was not available in the computer software used for the tone switch simulations. The transistor nonlinearities  have therefore been measured on a real Clavinet by the use of a tone generator and a signal analyzer. The input signal was a 1-kHz sine wave with an amplitude equal to the maximum generated by the pickups during normal polyphonic playing (400 mV). At this level, the total harmonic distortion (THD) is 1%, rising to 3.6% for the highest peaks during fortissimo chord playing, which, however, occur only very rarely. Taking the 1% THD figure as the upper bound for normal playing, the nonlinear character of the amplifier has been neglected, since the generated harmonic content is likely to be masked by the Clavinet tones.
3.5 Tangent knock
A secondary feature of the Clavinet sound is the presence of a knock sound, due to the tangent hitting the stud and hence the soundboard. The presence of this knocking sound in the pickup recordings may seem curious, but it can easily be explained by the fact that the impact of the tangent with the soundboard stud involves the string which is placed between the two bodies and in contact with the soundboard and hence transmits part of the sound (including the modal resonances of the soundboard) through to the pickups.
This knocking sound is clearly audible in high tones, where its overlap with the tone harmonics is lower. In order to partially model this knock, a sample of this sound has been extracted from an E6 tone, where the fundamental frequency lies over 1,300 Hz. The knocking sound, which has most of its energy concentrated below 1,200 Hz, can be isolated by filtering out everything over the tone fundamental frequency.
In the proposed model, a triggered sample is used. The sample is the same for any key (the secondary importance of this element does not give a strong motivation for precise modeling). Additionally, a mild low-pass filter can be added with a slightly random cutoff frequency for each note triggering in order to reduce the sample repetitiveness.
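The randomized filtering of the knock sample can be sketched as below. The one-pole filter, the nominal cutoff of 1,200 Hz, and the ±10% spread are all assumptions chosen for illustration; the source does not specify the filter type or the cutoff range.

```python
import math
import random

def random_cutoff(rng, nominal_hz=1200.0, spread=0.1):
    """Pick a cutoff within +/- spread (as a fraction) of a nominal value."""
    return nominal_hz * (1.0 + rng.uniform(-spread, spread))

def one_pole_lowpass(x, cutoff_hz, fs):
    """Mild low-pass: y[n] = (1 - a) x[n] + a y[n-1], a = exp(-2 pi fc / fs)."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / fs)
    y, prev = [], 0.0
    for xn in x:
        prev = (1.0 - a) * xn + a * prev
        y.append(prev)
    return y

def trigger_knock(sample, fs, rng):
    """Return the stored knock sample filtered with a freshly drawn cutoff."""
    return one_pole_lowpass(sample, random_cutoff(rng), fs)
```

Calling `trigger_knock` once per note-on event gives each note a slightly different spectral tilt while reusing the same stored sample.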
3.6 Overview of the complete model and computational cost
The computational model described so far was first implemented in Matlab®. The target of this work has been the development of a low-complexity model that could fit a real-time computing platform; thus, porting the model did not require any particular change in structure for the subsequent real-time implementation. The model, depicted in Figure 15, therefore represents both the Matlab and the real-time implementations.
To summarize the work done to build this model, an overview of the basic blocks is given here. The DWG model consists of the delay line, which is split into two sections ($z^{-(L-R)}$ and $z^{-R}$) in order to add the ripple filter. The DWG loop includes the one-pole loss filter $H_{\mathrm{targ}}(z)$, which adds frequency-dependent damping, and the dispersion filter H(z), which adds the inharmonicity characteristic of metal strings. The fractional delay filter F(z) accounts for the fractional part of L, which cannot be reproduced by the delay line.
While the Clavinet pitch during sustain is very stable, and thus there is no need for changing the delay length, a secondary delay line, representing the nonspeaking part of the string, is needed to model the pitch drop at release. This delay line is connected to the DWG loop at release time to model the key release mechanism.
To excite the DWG loop, there is the excitation generator block, named Excitation, which makes use of an algorithm described in  to generate an excitation signal related to key velocity and data on the tangent-to-string distance. This is triggered just once at attack time.
Several blocks are cascaded with the DWG loop. The beating equalizer ($B_{\mathrm{EQ}}$), composed of a cascade of selective band-pass filters with modulated gain, emulates the beating of the partials and completes the string model. Then, the Pickup block emulates the effect of the pickups, while the Amplifier block emulates the amplifier frequency response, including the effect of the tone switches.
Finally, the soundboard knock sample is triggered at a ‘note on’ event to reproduce that feature of the Clavinet tone. This is similar to what has been done for the emulation of the clavichord , an instrument that shows some similarities with the Clavinet.
The theoretical computational cost of the complete model can be estimated for the worst case conditions and is reported in Table 4. The worst case conditions occur for the lowest tone (F1), which needs the longest delay line and the highest order for the dispersion filter. The latter depends on the estimate of the B coefficients made during the analysis phase and the parameters used to design the filter. With the current data, the maximum order of the dispersion filter is eight.
The memory consumption is mostly due to the delay lines, which, at a 44,100-Hz sampling frequency, require at most 923 samples (a longer delay line is not required as the dispersion filter takes into account a part of the loop delay), which, together with the taps required by comb filters, can amount to approximately 1,000 samples of memory per string.
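The delay budget for the lowest tone can be checked with a short calculation. The sketch assumes equal temperament with A4 = 440 Hz and F1 at MIDI note 29, and it ignores the small pitch shift due to inharmonicity; the function names are hypothetical.

```python
def note_frequency(midi_note, a4_hz=440.0):
    """Equal-tempered frequency of a MIDI note number."""
    return a4_hz * 2.0 ** ((midi_note - 69) / 12.0)

def loop_delay_samples(f0, fs):
    """Total loop delay, in samples, for a fundamental f0."""
    return fs / f0

fs = 44100.0
f1 = note_frequency(29)            # F1, assuming MIDI note 29 and A4 = 440 Hz
total = loop_delay_samples(f1, fs)
filters_share = total - 923        # delay left to the in-loop filters
print(round(f1, 2), round(total, 1), round(filters_share, 1))
```

Under these assumptions the full loop delay for F1 is about 1,010 samples, so with a 923-sample delay line the remaining delay would be supplied by the filters in the loop, chiefly the dispersion filter.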