Psychoacoustic characteristics of different brake creep groan classes and their subjective noise annoyance in vehicle and half-axle tests

Brake creep groan is a severely annoying noise and vibration phenomenon. Especially on the Asian market, customer feedback about creep groan is common, indicating creep groan’s impact on the quality impression of a car. Hence, treatment of these stick–slip-related creep groan phenomena is necessary. As numerous design conflicts exist for brake and axle, a complete mitigation of the phenomenon is often not possible. A reduction of creep groan’s annoyance by changing the noise’s level and characteristics is therefore typically pursued. One approach towards this goal is the use of psychoacoustics: This work deals with psychoacoustic characteristics of different creep groan classes. Low-frequency groan, high-frequency groan, and transition groan classes are compared regarding loudness, sharpness, roughness, fluctuation strength, and tonality. Standard statistical methods as well as machine learning approaches are applied to signals from vehicle tests and half-axle tests. Test results depict the different characteristics of each creep groan class. By mapping the results to the subjective rating of trained test drivers, the annoyance of different classes is compared. Low-frequency groan, dominated by longitudinal axle vibrations, is found to be least annoying. This low annoyance is best depicted by the psychoacoustic parameters loudness and roughness. The presented results allow an optimization of brake system design to reduce creep groan’s annoyance, leading to higher customer satisfaction and a more goal-oriented treatment of this NVH problem.


Motivation
Creep groan is a severe brake noise, vibration, and harshness (NVH) issue that leads to costly warranty claims and maintenance work [2]. With the current trend towards electrified drivetrains, especially in battery electric vehicles (BEVs), masking drivetrain noise is reduced and low-frequency brake noise such as creep groan becomes increasingly relevant. The motivation to avoid or reduce creep groan is therefore high.
Creep groan is excited by a stick-slip effect within the friction partners (disk and pads) of a brake. It can be avoided by active measures, such as friction-normalization, e.g., by piezo actuators, or by passive measures, e.g., the modification of the brake pads' friction behavior [11,22]. However, a full mitigation of creep groan is usually not pursued as it is considered either too expensive or stands in conflict with other requirements on the brake pads, such as friction stability and fading resistance. Hence, engineers have to target a compromise, which implies the need for comparison between different setups in terms of creep groan annoyance.
In industry, this is currently done according to the VDA recommendation 314 [1]. Trained test drivers rate the noise subjectively. Objective measures such as the A-weighted sound pressure level (SPL) in the vehicle's cabin are obtained as well. However, this is a rather simple measure that reflects only a very limited picture of the human perception of creep groan.
Further insights into the bifurcation behavior and classification of different creep groan classes were recently given by Prezelj et al. [13], Smith et al. [15], and Huemer-Kals et al. [10]. It was found that several different creep groan classes with different basic frequency or different frequency contents occur, depending on the test setup. Differences in perception were so far not studied, although suggested by Prezelj et al. [13].
Relations between subjective annoyance, psychoacoustic quantities, and creep groan class are therefore highly interesting for a sophisticated and exact objective rating of creep groan in industrial applications. This paper shall clarify interactions between these aspects of creep groan.

Creep groan phenomenology and classification
Brake creep groan is a stick-slip-related low-frequency brake NVH phenomenon. This means that intermittent states of stick or slip between the friction partners disk and pads occur [2,7]. The stick-slip occurs due to differences between static and dynamic friction coefficient or a negative gradient of the friction coefficient over sliding speed. Therefore, creep groan is considered a physical instability, as opposed to, e.g., a dynamic instability like the flutter-type brake squeal [12]. While creep groan has first-order frequencies of only approx.  Hz, this strongly non-linear behavior leads to the occurrence of super-harmonic content, also in the clearly audible range [7]. As the stick-slip can only occur at very low sliding speeds, e.g., when setting off from standstill, creep groan-related noise is not substantially masked by aerodynamic or engine noise. Perception is further characterized by the transfer of structural vibrations from each wheel towards the inside of the cabin, where large, soft panels are excited and finally transmit the vast majority of the perceived noise (in contrast to the airborne path) [6]. As all four brakes can groan at the same time, interference and phase effects occur as well.
Creep groan vibrations are dominated by two different basic movements: a forward-backward movement of the whole axle and a rotational movement of the caliper and wheel carrier around the wheel's axis [10,14]. Mainly depending on the operating parameters brake pressure p_B and vehicle speed v_veh, different combinations and interactions of these movements can occur. Vehicle tests by Prezelj et al. [13] have shown four main creep groan classes at a double-wishbone front axle with floating caliper brake, namely: low-frequency groan (LF), transition groan with 2 (TG2) or with 3 peaks (TG3) per basic repetition cycle, and high-frequency groan (HF). Time signatures of tangential caliper accelerations for each of these creep groan classes can be seen in Fig. 3c. Prezelj et al. [13] found the first-order frequency of LF and TG2/TG3 groan in a typical range of approx. 18-22 Hz. Depending on the system, first-order frequencies of HF front axle groan can be found in a wider range of higher values, e.g., at 45 Hz [19] or at 97 Hz [20].
Huemer-Kals et al. [10] analyzed different creep groan classes and their operational deflection shapes (ODS) on a half-axle test bench. HF groan on the half-axle test bench was found to be very similar to HF groan in vehicle tests. Low-frequency and transition groan classes, however, differed. Therefore, the half-axle classification uses a different nomenclature from that of the vehicle tests, namely LFA, LFB, and LFC groan. These three groan classes, with a basic frequency in the range of 21-23 Hz, were found to occur with a varying number of acceleration peaks per basic cycle as well. Hence, a number describing the peaks per basic cycle is appended to the class name (e.g., LFA1 vs. LFA2 groan). Table 1 summarizes the classes found in the half-axle tests of Huemer-Kals et al. [10] and lists the corresponding, comparable vehicle creep groan class.
Due to the non-linear nature of creep groan, multiple stable vibration modes can be present at the same operating point. When the vehicle speed v_veh is varied at constant brake pressure, the caliper acceleration RMS changes with a change in creep groan class [10]. This is also found in a simulation-based bifurcation study on a 3-DOF model by Smith et al. [15]. Most probably, these changes in amplitudes and frequency content affect the human perception of creep groan as well.

Subjective and objective rating
The German Verband der Automobilindustrie summarizes the acoustic evaluation of creep groan in vehicle tests (VDA recommendation 314, [1]). The proposed procedure is divided into minimum and optional requirements. Minimum requirements consist of creep groan tests on a level road (with gear set to "D") and creep groan tests on a defined slope of 10-16%. These two scenarios are tested both with cold and with warm (T_Disc = 50-100 °C) brake. During the tests, drivers shall rate subjectively between 1 ("annoying/long/loud") and 10 ("not recognizable"). Objective rating shall be given by the maximum and average sound pressure level in dB(A), measured in the middle of the vehicle slightly behind the gearshift.
Zhang et al. [21] proposed a method for the objective rating of creep groan based on several different quantities: The peak-to-peak value Q_1, the root-mean-square value Q_2, the second-order moment Q_3, and the fourth power vibration dose value of the pulse with largest amplitude Q_4 are calculated from the (logarithmic) tangential caliper accelerations within a defined time period T. Furthermore, cabin noise is evaluated in the form of the A-weighted sound pressure level SPL(A), the Zwicker loudness (as explained in the section on psychoacoustic features), the roughness, and the fluctuation strength. In their conclusions, all of these quantities except roughness and fluctuation strength were found to effectively describe creep groan noise. This was based on a linear regression analysis between each quantity and the subjective rating, which ranged from 4.5 to 8.5 on the above-mentioned scale. 29 sets of valid data were compared.
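To make the limitation of the A-weighted level more tangible, the following minimal sketch evaluates the analytic A-weighting curve as commonly given for IEC 61672-1. It is not part of the VDA procedure or of the cited works; the function name is hypothetical. It only illustrates how strongly components near the creep groan first-order frequency are attenuated before they enter a dB(A) value.

```python
import math

def a_weighting_db(f_hz: float) -> float:
    """A-weighting gain in dB at frequency f_hz (analytic curve, normalized at 1 kHz)."""
    f2 = f_hz ** 2
    ra = (12194.0 ** 2 * f2 ** 2) / (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    # The +2.00 dB offset normalizes the curve to 0 dB at 1 kHz
    return 20.0 * math.log10(ra) + 2.00

# Frequencies chosen for illustration: near an LF first order, a low harmonic, and 1 kHz
for f in (20.0, 100.0, 1000.0):
    print(f"{f:7.1f} Hz: {a_weighting_db(f):6.1f} dB")
```

At around 20 Hz the weighting is roughly 50 dB below the 1 kHz reference, so a dB(A) value is dominated by the super-harmonic content rather than by the basic stick-slip frequency.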

Psychoacoustic features
Psychoacoustic features are used to quantify certain components of the human sensation of sound. Physical effects of the ear, such as temporal masking or a certain frequency behavior, are taken into account. Psychoacoustic quantities are defined in international standards and can be computed with the Sound and Vibration Toolkit in NI LabVIEW, as described by Huemer-Kals et al. [9]. Relevant quantities are explained in the following.
Loudness measures the sound intensity as perceived by a normal-hearing listener. According to the Zwicker loudness algorithm, in accordance with ISO 532B, DIN 45631, and ISO/R 131, a stationary loudness value can be calculated [23]. This is done by separating the frequency contents into critical bands, which relate to certain areas of the inner ear's basilar membrane. Smoothing, weighting, and considering the transfer through outer parts of the ear finally lead to a loudness value given, e.g., on the linear sone scale.
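As a small illustration of what "linear" means for the sone scale, the sketch below converts between loudness in sone and loudness level in phon using the standard relation N = 2^((L_N - 40)/10), which holds for loudness levels of 40 phon and above. The function names are hypothetical and the conversion is not part of the Zwicker algorithm itself.

```python
import math

def phon_to_sone(level_phon: float) -> float:
    """Loudness in sone for a loudness level >= 40 phon."""
    if level_phon < 40.0:
        raise ValueError("relation is only valid for levels of 40 phon and above")
    return 2.0 ** ((level_phon - 40.0) / 10.0)

def sone_to_phon(loudness_sone: float) -> float:
    """Loudness level in phon for a loudness >= 1 sone."""
    if loudness_sone < 1.0:
        raise ValueError("relation is only valid for loudness of 1 sone and above")
    return 40.0 + 10.0 * math.log2(loudness_sone)

print(sone_to_phon(1.0))   # 40.0
print(sone_to_phon(2.0))   # 50.0
```

A doubling of the sone value corresponds to a sound perceived as twice as loud, which is equivalent to a 10 phon increase in loudness level.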
Sharpness quantifies the occurrence of high-frequency contents within a sound. Sharpness is measured in acum, with 1 acum defined as the sharpness of a 1 kHz narrowband sound at 60 dB. Within the present work, the sharpness according to Aures [4] is used, which considers influences of the total loudness, as well. Tóth [18] and Huemer-Kals et al. [9] explained a correlation between loudness and sharpness in creep groan signals.
Roughness and fluctuation strength describe effects coming from envelope-modulated sounds. Whereas the term fluctuation strength (in vacil) is used for modulated envelopes with a frequency < 20 Hz, the term roughness (in asper) describes envelopes > 20 Hz. Especially for frequency differences from 40 to 70 Hz, roughness is strongly experienced. With NI LabVIEW, roughness is calculated according to Aures [5], in contrast to approaches presented by Sottek [16], Sottek and Genuit [17], or Fastl and Zwicker [8].
Tonality quantifies how well narrow-band noises can be distinguished within a sound or noise. Hence, the frequency bandwidth and the level of the narrow-band noise in relation to the background noise define tonality. Again, several approaches are common, such as Prominence Ratio, Tone-to-Noise Ratio, or the approach according to Aures [4], which is used here. The unit used is the tonality unit (tu).
In addition to the recent work of Zhang et al. [21], where loudness, roughness, and fluctuation strength were analyzed for creep groan, Abdelhamid and Bray [3] investigated loudness and tonality for creep groan. Both publications found high correlations to creep groan annoyance mainly for loudness, although the measurements were limited to 29/30 rated creep groan events, respectively.
Within the master thesis of Tóth [18], machine learning approaches for objective rating of creep groan were shown, based on the same data as this paper. Here, statistical features (mean/maximum/median) of psychoacoustic parameters as well as the normalized groan duration of 1145 brakings within vehicle groan tests were used as input for a Support Vector Machine (SVM) regression task. Subjective ratings inside the vehicle's cabin (from 1 to 10) were used as the regression target. A mean absolute error (MAE) as low as 0.75 was reached when using all input features of the microphone signal with an RBF kernel, a C value of 31, and a gamma value of 0.3. Fivefold cross-validation (CV) was applied, and the CV mean MAE was 0.82, with a standard deviation of 0.15, indicating a rather robust regression result.
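The reported CV mean MAE and its standard deviation can be read as follows. The sketch below is a minimal illustration of fivefold cross-validation with the MAE as score, written with the Python standard library only. It does not reproduce the SVM of Tóth [18]: `fit_mean_baseline` is a hypothetical stand-in regressor, the data are made up, and whether the original standard deviation is a sample or population value is not stated (the sample version is assumed here).

```python
import random
import statistics

def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices and deal them into k disjoint folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validated_mae(x, y, fit, k=5, seed=0):
    """Return mean and sample standard deviation of the per-fold MAE."""
    scores = []
    for test_idx in k_fold_indices(len(y), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        predict = fit([x[i] for i in train_idx], [y[i] for i in train_idx])
        y_pred = [predict(x[i]) for i in test_idx]
        scores.append(mean_absolute_error([y[i] for i in test_idx], y_pred))
    return statistics.mean(scores), statistics.stdev(scores)

def fit_mean_baseline(x_train, y_train):
    """Hypothetical stand-in for the SVM regressor: predicts the training mean."""
    mean_rating = statistics.mean(y_train)
    return lambda features: mean_rating

# Made-up data: one feature per braking, ratings on the 1-10 scale
rng = random.Random(1)
ratings = [rng.randint(1, 10) for _ in range(50)]
features = [[rng.uniform(8.0, 18.0)] for _ in ratings]
cv_mean, cv_std = cross_validated_mae(features, ratings, fit_mean_baseline)
print(f"CV MAE: {cv_mean:.2f} +/- {cv_std:.2f}")
```

Any regressor exposing the same `fit`-returns-`predict` shape could be dropped in; a small spread of the fold scores relative to their mean is what the text refers to as a robust result.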

Scientific approach
This research paper tries to answer several questions regarding the interaction of subjective rating, psychoacoustic characteristics and creep groan class according to Fig. 1.
Precisely, these research questions are:
• Question 1: How can each creep groan class be characterized by psychoacoustic quantities?
• Question 2: How do psychoacoustic quantities relate to the subjective rating?
• Question 3: How is the creep groan class related to the subjective rating?
To find answers to these questions, two different types of data were generated and analyzed:
• Full vehicle test data, including subjective ratings (Question 1/2/3)
• Half-axle test data (only Question 1, as no subjective ratings were performed for the half-axle tests)
The impact of test system size on psychoacoustic characteristics can therefore be studied as well.

Vehicle tests
Vehicle tests were performed on a compact executive car. Details on the procedure can be found in [13]. The test car, with double-wishbone axle at the front and multi-link rear axle, had floating caliper brakes on all four wheels. Two different friction linings were tested on the front axle, one set of European (ECE) linings and one set of Non-Asbestos Organics (NAO) linings. The rear axle was equipped with NAO pads throughout all tests. After a bedding procedure for creating stable friction characteristics, creep groan was produced both on a flat and an inclined track, with engine torque present at standstill through the automatic transmission. Driving direction and acceleration characteristic (from or into standstill) were varied. Each combination of parameters was tested five times, with test drivers rating the cabin noise on a scale from 1 (annoying/long/loud) to 10 (not recognizable), similar to the VDA recommendation 314 [1]. All in all, this resulted in 1145 brake applications, 910 of them subjectively rated.
Accelerations were measured at all four caliper anchor brackets, as schematically shown in Fig. 2. In addition, cabin noise was measured by a microphone near the driver's head rest (MIC). As this microphone signal is naturally prone to unwanted noise from the cabin, such as engine noise, passing vehicles, or noise created by the test drivers, an equivalent, noise-reduced signal is advantageous for evaluation. Therefore, FIR filter transfer functions between each accelerometer and the measured cabin noise were obtained by Least-Mean-Square (LMS) optimization (Fig. 2). By applying these transfer functions, the accelerometer-based equivalent sound pressure signal (EQV) is obtained. This procedure was already published by Huemer-Kals et al. [9].
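The idea of the equivalent sound pressure signal can be sketched for a single accelerometer channel. The code below uses the classic adaptive LMS update to identify FIR taps; whether the authors used exactly this update, and how the four channels were combined, is not stated here, so this is an assumption for illustration. The signals, the tap values, the step size `mu`, and all function names are made up.

```python
import random

def fir_filter(x, w):
    """Causal FIR filter: y[n] = sum_k w[k] * x[n - k], zero initial state."""
    return [
        sum(w[k] * x[n - k] for k in range(len(w)) if n - k >= 0)
        for n in range(len(x))
    ]

def lms_identify(x, d, n_taps, mu):
    """Adapt FIR taps w so that the filtered input x approximates the target d."""
    w = [0.0] * n_taps
    for n in range(len(x)):
        # Most recent n_taps input samples, newest first, zero-padded at the start
        frame = [x[n - k] if n - k >= 0 else 0.0 for k in range(n_taps)]
        y = sum(wk * xk for wk, xk in zip(w, frame))
        e = d[n] - y                     # instantaneous error
        w = [wk + mu * e * xk for wk, xk in zip(w, frame)]
    return w

rng = random.Random(0)
accel = [rng.uniform(-1.0, 1.0) for _ in range(5000)]  # stand-in accelerometer signal
true_taps = [0.5, -0.3, 0.1]                           # made-up structure-borne path
mic = fir_filter(accel, true_taps)                     # noise-free stand-in for MIC

taps = lms_identify(accel, mic, n_taps=3, mu=0.1)
eqv = fir_filter(accel, taps)                          # equivalent sound pressure (EQV)
print([round(t, 3) for t in taps])
```

Because the EQV signal is computed from the accelerometer alone, cabin noise that is uncorrelated with the caliper vibration does not appear in it, which is the motivation given in the text.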
Data was acquired with a sample rate of f_S = 51.2 kHz. Envelope signals of each vertical caliper acceleration signal were calculated according to Prezelj et al. [13]. Such an envelope signal can be seen in Fig. 3a. As each stick-slip transition produces one local maximum in the envelope signal, peaks and therefore stick-slip transitions can be detected easily. Based on the local peak frequency's mean and standard deviation, the creep groan class was identified as given in Fig. 3b. After resampling, each 0.01 s window was assigned one of the following classes:
• No groan (NG, no peaks found within the 0.01 s window)
• Low-frequency groan (LF)
• Transition groan with 2 peaks (TG2)
• Transition groan with 3 peaks (TG3)
• High-frequency groan (HF)
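The two features named above, mean and standard deviation of the local peak frequency, can be sketched as follows. This is a simplified illustration only: the peak criterion, the threshold, and the toy envelope are assumptions, and the actual decision rule of Fig. 3b that maps these features to a class is not reproduced.

```python
import statistics

def find_peaks(envelope, threshold):
    """Indices of local maxima of the envelope that exceed a threshold."""
    return [
        i for i in range(1, len(envelope) - 1)
        if envelope[i] > threshold
        and envelope[i] > envelope[i - 1]
        and envelope[i] >= envelope[i + 1]
    ]

def local_peak_frequencies(peak_indices, fs_hz):
    """Frequency in Hz implied by each interval between successive peaks."""
    return [fs_hz / (b - a) for a, b in zip(peak_indices, peak_indices[1:])]

# Toy envelope at 1 kHz (not the 51.2 kHz of the tests): one peak every 50 samples
fs = 1000.0
envelope = [1.0 if i % 50 == 25 else 0.1 for i in range(500)]

peaks = find_peaks(envelope, threshold=0.5)   # hypothetical threshold
freqs = local_peak_frequencies(peaks, fs)
mean_f = statistics.mean(freqs)
std_f = statistics.pstdev(freqs)
print(mean_f, std_f)  # 20.0 0.0
```

A regular peak train as in this toy signal yields a low standard deviation, whereas groups of 2 or 3 peaks per basic cycle, as in transition groan, would be expected to spread the local peak frequencies.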
For each braking, psychoacoustic quantities according to Table 2 were calculated both for the cabin microphone signal (MIC) and the equivalent sound pressure signal (EQV). Afterwards, each psychoacoustic quantity was resampled from its initial output frequency to the 100 Hz sampling of the classification.
This finally leads to the data structure shown in Fig. 4. Three columns exist here: The first column in Fig. 4a contains data with one scalar value per braking. Please note again that only 910 of these brakings were subjectively rated. The second column in Fig. 4b contains measured signals with 51.2 kHz sampling rate. The third column in Fig. 4c contains the creep groan class, psychoacoustic quantities, and subjective ratings, resampled to 100 Hz.

Half-axle tests
Creep groan was reproduced on a half-axle test bench as seen in Fig. 5a. The left front wheel was driven by a drum and braked by the floating caliper brake. Test components (double-wishbone axle and the floating caliper brake system) were identical in design compared to the vehicle tests, although only ECE linings were used on the half-axle test bench. In Fig. 5b, the floating caliper brake system is shown, with the wheel removed for better visibility. Accelerations were measured on top of the caliper anchor bracket, similarly to the vehicle setup with a piezo-electric, triaxial accelerometer.
A bedding procedure ensured a stable frictional behavior between disk and ECE pads. Climate parameters were held at T_amb = 30 °C and an average humidity of 11.58 %rH during the tests. Different operating points of constant brake pressure 5 bar ≤ p_B ≤ 30 bar and constant vehicle (drum) speed 0.1 km/h ≤ v_veh ≤ 0.6 km/h were approached in the form of a full-factorial test matrix with steps Δp_B = 5 bar and Δv_veh = 0.1 km/h. As speeds were approached both increasing from and decreasing to 0 km/h, 72 operating points result (6 pressures × 6 speeds × 2 speed directions).
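The count of 72 operating points follows directly from the stated ranges and step sizes. The short sketch below enumerates the full-factorial matrix; the variable names and direction labels are illustrative only.

```python
from itertools import product

pressures_bar = [5 * i for i in range(1, 7)]            # 5, 10, ..., 30 bar
speeds_kmh = [round(0.1 * i, 1) for i in range(1, 7)]   # 0.1, 0.2, ..., 0.6 km/h
directions = ["increasing", "decreasing"]               # speed approached from / to 0 km/h

operating_points = list(product(directions, pressures_bar, speeds_kmh))
print(len(operating_points))  # 72
```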
Due to substantial background noise in the test bench cabin, measuring the creep groan noise by a microphone was not feasible. Instead, the transfer function between caliper anchor bracket accelerations and cabin sound pressure obtained from the vehicle tests was applied to calculate an equivalent sound pressure signal again. As only one wheel was tested on the half-axle test bench, only one accelerometer signal was input for the FIR filter transfer function. Before this, the accelerometer data measured at 10 kHz was upsampled to 51.2 kHz by linear interpolation. Then, one basic creep groan cycle was retrieved from each operating point and repeated for 10 s. After applying the FIR filter transfer function to these 10 s acceleration signals, the psychoacoustic quantities according to Table 2 were calculated. Analogously to the procedure on vehicle test data, psychoacoustic quantities were resampled to 100 Hz output frequency. Finally, median values of these psychoacoustic quantities were computed only from the central second, from 4.5 to 5.5 s.
It shall be noted that the half-axle test data used within this paper is identical to that of the authors' publication [10]. The evaluation of creep groan class is therefore identical to Table 1, and, due to the repetition of one creep groan cycle, the class stays constant over each one of the evaluated 72 operating points. Care was taken to apply identical color codes within both papers. Details regarding testing as well as additional wav-files and videos of the operational deflection shapes during the measured creep groan phenomena are available online.
Figure 6 shows a bar plot of the classification result of vehicle tests, given in test seconds. Each bar shows the overall occurrence of the respective groan class, cumulated from the 0.01 s intervals. As NAO pads were mounted on the rear axle in every scenario, front axle groan was found to be much more significant for the specific test vehicle.
Therefore, only time intervals with identical groan class at the front axle and no groan at the rear axle are considered, which are labelled as "relevant data" within Fig. 6. As one can see, these 11,804.0 s of relevant data consist mainly of "no groan" events (10,160.9 s). The rest of the relevant data is split among four creep groan classes, with a minimum of 87.4 s of transition groan with 3 peaks and a maximum of 1041.3 s of high-frequency groan.
Fig. 5 a Half-axle test bench setup on the combined suspension and brake test rig; b floating caliper brake and accelerometer setup. Adapted from Huemer-Kals et al. [10]
Table 1 shows the classification results for 7 of the 72 operating points within the performed half-axle matrix tests. The manual classification can further be found in the later presented Fig. 11.
Figure 7 shows a scatter plot of loudness and sharpness, based on a) the microphone signal and b) the equivalent sound pressure signal. Each scatter point represents one 10 ms interval. As one can see, for creep-groan-related loudness above approx. 20 sone, the sharpness increases almost linearly. The equivalent sound pressure signals show an even clearer picture here, resulting from the lower noise level.

Operational parameter: brake pressure in vehicle tests
A scatter plot of loudness over the current brake pressure is given in Fig. 8a for every 10 ms interval. On the second axis, the relative occurrence of brake pressure classes with a class width of Δp_B = 1 bar is shown. Two brake pressure zones can be identified, one around 4-6 bar and one around 14-16 bar brake pressure. This is related to the two test track inclinations: flat and inclined. Higher loudness values are reached near 14-16 bar brake pressure. Nevertheless, no linear connection between brake pressure and loudness can be seen directly. Figure 8b shows boxplots and median values x̃ of brake pressure for each creep groan class. LF groan has a significantly lower median brake pressure (6.4 bar) than the rest of the groan classes. Vehicle speed, the second main parameter for vibration power input, was not evaluated as the ultra-low speeds during vehicle testing were not measured.
Figure 9 shows the psychoacoustic characteristics of different creep groan classes in vehicle tests, based on box plots of each 10 ms time interval during creep groan action at the front axle. Here, the equivalent sound pressure signal was used. Analogous evaluations based on the cabin microphone were performed: These generally showed similar trends with slight deviations due to higher background noise. Therefore, only the equivalent sound pressure results are presented.

Vehicle test results
Loudness over creep groan class is analyzed in Fig. 9a. Whereas "no groan" events show the lowest loudness median of 10.19 sone, the highest values can be found for transition groan with 3 peaks (TG3) and HF groan at approx. 13.3 sone.
Sharpness in Fig. 9b shows a similar trend, although with very small differences between the group medians of 0.13 acum overall.
Roughness over creep groan class in Fig. 9c was found to be highest for the transition groan classes TG2 and TG3, with slightly smaller median roughness for LF groan. HF groan is depicted as the least rough groan class. "No groan" events showed a median value of almost 0 asper.
Similarly, fluctuation strength in Fig. 9d shows again almost 0 vacil for "no groan". Groan events were found to have an elevated fluctuation strength, with highest values for LF groan.
Tonality is given in Fig. 9e. HF groan events feature a high tonality median value, in contrast to all other classes.

Half-axle test bench results
1 s time intervals of the half-axle operating points according to Table 1 were analyzed. Psychoacoustic features of their equivalent sound pressure signals are given as boxplots in Fig. 10.
Loudness is shown in Fig. 10a. HF groan, both LFA classes (LFA1 and LFA2), as well as the last LFC3 example show slightly higher loudness than the rest. This trend is also found in the sharpness results of Fig. 10b and relates to different input power per operating point, depending on vehicle speed and brake pressure (see again Table 1).
Roughness in Fig. 10c and fluctuation strength in Fig. 10d show elevated values for the LFB2 class. The second LFC3 example shows elevated roughness with little fluctuation strength. Interestingly, high-frequency groan also has substantial roughness when compared to LFA1/2, LFB2, and the first LFC3 groan example.
Finally, tonality in Fig. 10e draws a picture in good accordance to the vehicle tests, with high tonality only for HF groan.
Results for the full creep groan matrix are shown in Fig. 11. Here, median values of the 10 s intervals are plotted for each psychoacoustic quantity, at each operating point, for each speed gradient (speeds increasing from vs. decreasing to 0 km/h). Loudness in Fig. 11a,b shows a clear increase with higher vehicle speeds. However, the brake pressure p_B, which is proportional to caliper accelerations, influences the loudness medians of the equivalent sound pressure signal only marginally. This behavior is also confirmed subjectively when listening to different sound sample WAV files of the tests. This stands in contrast to the relation between loudness and brake pressure in the vehicle, where at least a certain increase with higher pressure was found (Fig. 8). Furthermore, one can see an influence coming from creep groan class: While HF and LFB classes seem to follow the same loudness trend, LFC and especially LFA classes are comparably louder.
Sharpness in Fig. 11c,d shows similar trends as loudness due to the high correlation. However, the relative variation within sharpness values is smaller compared to the relative loudness variation.
Roughness in Fig. 11e,f generally decreases towards higher speeds and lower brake pressures. Interestingly, LFA1 and LFA2 classes are found to be of lower roughness than neighboring HF operating points (with few exceptions). A generally higher roughness of low-frequency or transition groan classes, as indicated by the vehicle test results, cannot be seen.
Fluctuation strength in Fig. 11g,h shows low values for HF groan. LFB2 groan has the highest fluctuation strength medians.
Tonality in Fig. 11i,j is increased only for HF groan, similar to the vehicle test data.

Question 2: psychoacoustic quantities vs. subjective rating
A comparison between subjective ratings inside the vehicle and the measured psychoacoustic quantities is given by box plots for front axle groan during full-vehicle tests in Fig. 12. Again, median values x̃ are marked. Only vehicle tests are analyzed as there is no subjective rating available for half-axle tests. Loudness over subjective inside rating is given in Fig. 12a. Generally, loudness increases with lower subjective rating (from right to left), although for ratings 7-8, this trend is reversed. Sharpness in Fig. 12b behaves analogously.
Roughness in Fig. 12c shows an initial increase from subjective ratings 10 down to 8. Afterwards, a first plateau at approx. 1.2-1.3 asper is reached, which is held down to ratings of 5. For even lower subjective ratings (3 or 4), roughness medians rise up to 1.5 asper.
Fluctuation strength in Fig. 12d shows a different trend: Coming from highest levels at perfect subjective rating (10), FS decreases down to ratings of 6 and then increases again towards lower ratings. This could relate to the occurrence of LF and TG classes, as shown in Fig. 14 of the next chapter.
Similarly, tonality in Fig. 12e shows rather low levels, with an increased "bump" at ratings 5-6 and an increase for very low subjective ratings of 3.
The impact of different psychoacoustic features towards the subjective rating was further studied based on the already mentioned machine learning approaches of Tóth [18]. According to Fig. 13, trained Support Vector Machine (SVM) machine learning models were used. Each machine learning model mapped four different psychoacoustic input features to the subjective inside rating; see, e.g., the results for a model with MIC input signal in Fig. 13a. After the training phase, each psychoacoustic feature was then varied from its minimum to its maximum input value, while the other psychoacoustic features were kept constant at their mean values. The varied feature's impact was then quantified by two parameters, see Fig. 13b: negative mean gradient and maximum absolute difference. The higher these parameters were, the higher the change of the resulting annoyance prediction and, therefore, the higher the impact of the investigated feature.
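The sweep described above can be sketched as follows. The source names the two impact measures but does not give their formulas here, so the definitions in the code are assumptions: the mean gradient is taken over a feature axis normalized to [0, 1] and sign-flipped, and the maximum absolute difference is taken as the spread of the predictions. `toy_rating_model` is a made-up stand-in for a trained SVM, with invented coefficients and feature ranges.

```python
def feature_impact(predict, f_min, f_max, f_mean, index, n_steps=50):
    """Sweep one feature from its minimum to its maximum, others held at their mean."""
    predictions = []
    for step in range(n_steps + 1):
        x = list(f_mean)
        x[index] = f_min[index] + (f_max[index] - f_min[index]) * step / n_steps
        predictions.append(predict(x))
    # Assumed definition: gradient with respect to the feature normalized to [0, 1]
    step_width = 1.0 / n_steps
    gradients = [(b - a) / step_width for a, b in zip(predictions, predictions[1:])]
    neg_mean_gradient = -sum(gradients) / len(gradients)
    # Assumed definition: largest difference between any two predictions of the sweep
    max_abs_difference = max(predictions) - min(predictions)
    return neg_mean_gradient, max_abs_difference

def toy_rating_model(x):
    """Hypothetical rating model: rating drops with loudness (sone) and roughness (asper)."""
    loudness_sone, roughness_asper = x
    return 10.0 - 0.3 * loudness_sone - 1.0 * roughness_asper

f_min, f_max, f_mean = [8.0, 0.0], [18.0, 2.0], [12.0, 1.0]   # made-up feature ranges
for name, idx in (("loudness", 0), ("roughness", 1)):
    grad, diff = feature_impact(toy_rating_model, f_min, f_max, f_mean, idx)
    print(f"{name}: negative mean gradient {grad:.2f}, max abs difference {diff:.2f}")
```

For a monotonic model both measures coincide; they differ when the predicted rating rises and falls along the sweep, which is why reporting both can be informative for a non-linear regressor.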
For the microphone signal, Fig. 13c shows results for two different models. High gradients/absolute differences can be found especially for the parameters loudness, tonality, and groan duration, while fluctuation strength and roughness seem to have lower influence in both models. Please note that sharpness was omitted due to the known collinearity with loudness.
For the equivalent sound pressure signal, Fig. 13d shows results for two different models. In this case, loudness, roughness, and normalized groan duration had the highest impacts, as the high absolute differences and negative mean gradients imply.
Figure 14 shows the cumulative groan duration per class over the subjective inside rating. For this purpose, each 10 ms time interval's creep groan class was compared with the subjective inside rating of the respective braking. HF groan dominates the rating scores from 4 to 6, while LF groan can be found predominantly at rating scores 5-9. Transition groan classes TG2 and TG3 occur mainly at scores 4-6 as well. Many data points were not assigned a subjective inside rating due to missing ratings ("no data" bars).

Question 3: creep groan class vs. subjective rating
To further compare the subjective inside ratings with the corresponding front axle creep groan class, normal distribution of the data was checked by a Q-Q plot of the residuals of each creep groan class' mean rating. Due to the strong deviations from a normal distribution, a Kruskal-Wallis test (instead of a one-way analysis of variance) is used in the following.
Figure 15 shows results of the Kruskal-Wallis test between subjective rating and creep groan class. In this case, per-braking data were analyzed: The dominant creep groan class, meaning the class that occurred the longest during each braking, was assigned to the whole braking action. If creep groan did not occur at all, "no groan" was assigned. Figure 15a shows box plots. TG3 groan leads to the lowest median rating of 4, whereas "no groan" events can be found with a median value of 9. Figure 15b shows a comparison of the resulting mean ranks and standard deviations of the different creep groan groups based on per-event data. Due to the high standard deviation and low amount of data for TG3 groan, no clear difference can be seen between TG3 and TG2 or between TG3 and HF groan. HF groan was still found to induce lower subjective ratings than TG2 data. At the same time, "no groan"- and LF-dominated brakings had similar, clearly better subjective ratings than the rest. This also fits the general trends of Fig. 14, except probably the small difference between LF and "no groan". Certainly, assigning the main creep groan class to the whole braking favors this outcome.
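The per-braking assignment described above can be sketched in a few lines. The class labels follow the text; the function name and the tie-breaking behavior (first class encountered wins) are assumptions, since the text does not say how equal durations were handled.

```python
from collections import Counter

def dominant_class(window_classes, no_groan="NG"):
    """Return the groan class covering the most 10 ms windows of one braking.

    Windows without groan are ignored; if no window contains groan,
    the braking is labelled as "no groan".
    """
    counts = Counter(c for c in window_classes if c != no_groan)
    if not counts:
        return no_groan
    # Assumed tie-breaking: most_common keeps first-encountered order for equal counts
    return counts.most_common(1)[0][0]

# Made-up braking: 1.0 s without groan, 0.3 s of LF, 0.2 s of HF
braking = ["NG"] * 100 + ["LF"] * 30 + ["HF"] * 20
print(dominant_class(braking))       # LF
print(dominant_class(["NG"] * 50))   # NG
```

The example also shows the caveat mentioned in the text: the 0.2 s of HF groan in this braking are attributed to an LF-labelled event.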
A similar analysis was performed on the time-interval data as given in Fig. 16. Here, the (per-braking) subjective rating was assigned to all time intervals of one braking. By this approach, all groups but TG2 and HF were found to have significantly (95% confidence) different mean ranks according to Fig. 16b. Again, TG3 showed the worst subjective rating and LF had the best subjective ratings.
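For readers unfamiliar with the Kruskal-Wallis test, the sketch below computes its H statistic (with the usual tie correction) and the mean rank per group using only the standard library. It is a minimal illustration with made-up ratings, not the evaluation used for Fig. 15 to Fig. 17; the p-value and the pairwise comparison of mean ranks are omitted.

```python
from collections import Counter, defaultdict

def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """Return (tie-corrected H, mean rank per group) for {class name: ratings}."""
    labels, values = [], []
    for name, ratings in groups.items():
        labels += [name] * len(ratings)
        values += list(ratings)
    n = len(values)
    rank_sums = defaultdict(float)
    for name, rank in zip(labels, average_ranks(values)):
        rank_sums[name] += rank
    h = 12.0 / (n * (n + 1)) * sum(
        rank_sums[g] ** 2 / len(groups[g]) for g in groups
    ) - 3.0 * (n + 1)
    correction = 1.0 - sum(t ** 3 - t for t in Counter(values).values()) / (n ** 3 - n)
    if correction == 0.0:
        raise ValueError("all values are identical; H is undefined")
    mean_ranks = {g: rank_sums[g] / len(groups[g]) for g in groups}
    return h / correction, mean_ranks

# Made-up subjective ratings per dominant class
ratings = {"LF": [8, 9, 7, 8], "TG2": [6, 5, 6], "HF": [5, 4, 5, 6], "TG3": [4, 3]}
h_stat, mean_ranks = kruskal_wallis_h(ratings)
print(round(h_stat, 2), mean_ranks)
```

Because the test operates on ranks rather than on the ratings themselves, it does not require the normal distribution that the Q-Q plots ruled out.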

Due to lower median brake pressures for LF groan, as shown in Fig. 8b, separate Kruskal-Wallis tests for brakings with a median brake pressure > 10 bar (during creep groan vibrations with the respective dominant groan class) were performed; see Fig. 17. Both the per braking and the per 10 ms window analyses show a clearly better rating of LF groan compared to other creep groan classes.

Question 1
Different creep groan classes are perceived differently, as they can easily be distinguished by ear. This is also implied by their different psychoacoustic behavior as summarized in Table 3. Deviations between vehicle tests and half-axle tests already arise from the different classes found. While a distinction by the number of acceleration peaks per basic cycle was sufficient for vehicle tests, half-axle tests showed several different classes with an identical number of acceleration peaks. Regarding psychoacoustics, the number of acceleration peaks was found to be less important than the actual groan class for LFA creep groan (e.g., LFA1 behaves more similarly to LFA2 than to LFB1). LFB1 and LFB2, however, differ more strongly.

Fig. 13 Feature impact analysis with SVM machine learning approach. Evaluation of negative mean gradient and maximum absolute difference of each normalized quantity for microphone and equivalent sound pressure signal. Adapted from Tóth [18]

Question 2
Distinctive connections between psychoacoustics and the subjective inside rating were found in vehicle tests. Loudness was found to generally increase with worsening rating, with the exception of one step between ratings 7 and 8. Sharpness, clearly correlated with loudness for creep groan signals, showed qualitatively identical behavior. Roughness also increased towards lower ratings, in the form of two levels. Fluctuation strength and tonality were increased at ratings where specific creep groan classes were found; e.g., HF groan led to higher tonality at ratings from 5 to 6. The presented results underline the high importance of loudness and roughness for the subjective impression of creep groan. This is also supported by a feature impact analysis of a machine learning regression model.

Fig. 16 Groan class vs. subjective inside rating: Kruskal-Wallis test per 10 ms window with 95% confidence

Question 3
Regarding the annoyance of different creep groan classes, Kruskal-Wallis tests were performed on both per-braking data and interpolated 100 Hz vehicle test data. Depending on the input data, more or less clear differences between the groups were found. While low-frequency (LF) groan was found to be the least annoying groan class, transition groan with two peaks (TG2), high-frequency (HF) groan, and transition groan with three peaks (TG3) were found to be increasingly annoying. At the 95% confidence level, HF groan was also worse than TG2 for both data inputs. For TG3 groan, the variance within the per-braking data was too high to rank its annoyance relative to HF and TG2 groan. When analyzed in 10 ms intervals, however, it led to the worst subjective rating.
These results imply a difference in annoyance between creep groan classes. Based on vehicle test data, low-frequency (LF) groan was found to be clearly the best-rated creep groan class, also having the lowest loudness and rather little roughness. However, this does not automatically imply a design target towards this creep groan class, as annoyance is also related to the operating point in terms of brake pressure p_B and vehicle speed v_veh: as indicated by the half-axle tests, LF groan occurs mainly at very low speeds and pressures, where the input power and therefore also the intensity of groan are rather low. Nevertheless, statistical tests restricted to creep groan events at p_B > 10 bar confirm the lower annoyance of LF groan.
Half-axle tests delivered additional clarification. Here, HF groan actually showed lower loudness than LFA1/LFA2 groan at higher speeds. LFA creep groan, however, has not been found in vehicle tests so far, and hence no direct link can be drawn here. The best-performing class in terms of the crucial parameters loudness and roughness was LFB1 creep groan, which can be associated with LF groan in vehicle tests.
Therefore, brake design should target low-frequency groan with only one stick-slip transition per basic cycle to reduce creep groan annoyance. This could, e.g., be achieved by tuning the stiffness/damping/mass parameters of elastomer bushings or other axle components. Applying a tuned mass damper could be possible as well. So far, no experimental tests exist for such approaches, which would have to consider numerous design conflicts with vehicle dynamics and safety. Nevertheless, this approach could be feasible when both high comfort/perception of quality and friction performance are required. Ultimately, the presented objectification of creep groan noise can be used for objective rating methods in industry. Compared to simple A-weighted sound pressure levels, psychoacoustic parameters depict the human sensation of creep groan more accurately. Collecting additional data with other vehicles and configurations could increase the robustness and value of the presented conclusions. Conducting further hearing tests with both trained and untrained listeners could shed more light on the influence of the different psychoacoustic quantities on subjective annoyance.