1 Introduction

Our experience of the world is multimodal and integrated across different sensory modalities, yet sensory and perceptual processing is thought to occur in distinct unisensory brain areas before later integration in multisensory areas. In addition, studies of perceptual phenomena across sensory modalities are largely preoccupied with the neural underpinnings and computational models of processing characteristics, and tend to ignore the phenomenology and qualitative aspects of perceptual experience (Albertazzi 2015a). The predominant focus on processing also takes for granted the nature of the information that drives our perceptual systems and the insights concerning the structure and primitives of perceptual experience (Cutting 1987; Albertazzi 2015b, 2018).

Here, we explore the aesthetic qualities related to the fractal-scaling characteristics in a variety of visual and tactile surfaces. Fractal-scaling is a mathematical term used to describe irregular geometrical structures whose shape appears self-similar regardless of the level of magnification at which they are viewed. Fractals are created by an infinite recursion of a simple process and their appearance embodies the structural complexity of the dynamic systems that create them: from the erosion of coasts to the growth of crystals and the formation of galaxies. Importantly, unlike many visual and tactile objects in conventional psychophysics, fractals are non-Euclidean and defined as shapes “made of parts similar to the whole in some way” (Mandelbrot 1977). Fractal-scaling properties reflect the relative structural density at coarse and fine spatial scales, thus capturing the relational structural qualities of the natural and synthetic patterns, images and surfaces with which they are associated.

Despite both the high ecological validity of fractal structures and the prominent role that scale-specific processing plays in various sensory modalities, the fractal-scaling framework is largely absent from both research and theory of perceptual processing and experience. Here, we apply this framework to directly compare the generalizability of aesthetic qualities of fractal surfaces and their similarity across the visual and tactile sensory domains. We do so in the complementary contexts of natural scene statistics and scale-specific sensory analysis to explore the potential of fractal geometry as a structural primitive for cross-modal aesthetic experience. Remarkably, the aesthetic evaluations of fractal-scaling properties are highly similar across the two sensory domains, suggesting a strong link between fractal-scaling statistics and the perception of dynamic, expressive and affective aspects of sensory stimulation in different modalities.

2 Nature, Natural Scene Statistics and Aesthetic Appeal

From ragged mountain ranges to calm, pristine lakes, natural scenes come in a tremendous variety of different forms and sizes (Fig. 1). Despite this diversity, the underlying spatial structure of natural scenes exhibits a remarkable degree of statistical consistency. In an image of a natural scene, one typically finds that nearby regions are more similar in their spatial properties, such as luminance intensity, chromaticity and orientation, than more distal regions.

Fig. 1

Examples of images of natural scenes. All scenes shown are vastly different in their superficial appearance, but they share a common underlying statistical structure

One commonly used method of representing the distance-dependent variations in the intensity of individual points in natural scenes is through the shape of their spatial frequency amplitude spectra. The three rows in Fig. 2a depict the original and spatial frequency filtered versions of the natural scenes from Fig. 1. When natural scenes are decomposed into different spatial frequency components, as illustrated in the middle (low spatial frequency filtered images) and bottom (high spatial frequency filtered images) rows, one finds that the relative amplitude of intensity variations is inversely related to the spatial frequency (f), as illustrated in Fig. 2b. This power law relationship, defined by the function 1/f^α, is characterised by the amplitude spectrum slope (α) which, on average, ranges from .8 to 1.5 (peaking at 1.2) for images of natural scenes (Field and Brady 1997). This particular property is thought to reflect the scale invariance of natural scenes, or the notion that approximately equivalent amounts of spatial structure can be found as we zoom in or out between coarse and fine spatial scales (i.e. low and high spatial frequencies, respectively).

Fig. 2

a The top row shows the original natural scene images from Fig. 1. The middle and bottom rows show each image filtered for low and high spatial frequencies, respectively. b Amplitude spectra of the original images

The ubiquity of amplitude spectra characteristics across natural images has fuelled a rising interest in how they are processed by the human visual system (Cutting and Garvin 1987; Knill et al. 1990; Tadmor and Tolhurst 1994; Billock 2000; Hansen and Hess 2006). Spehar et al. (2015) used a series of psychophysical methods to measure subjects’ discrimination and detection sensitivity to images with a range of spatial frequencies and amplitude spectrum slopes. Consistent with the previous studies (Cutting and Garvin 1987; Knill et al. 1990; Tadmor and Tolhurst 1994; Hansen and Hess 2006) they found the best visual discrimination sensitivity for ‘natural’ amplitude spectrum slopes compared to shallower and steeper amplitude spectrum slopes. This finding was further supported by fMRI analyses in which blood oxygen level dependent (BOLD) responses peaked for images with natural 1/f amplitude spectra (Isherwood et al. 2017). Our visual system’s tuning and adaptation to these fractal-scaling statistics offer a strong account for how diverse natural scenes are identified and processed so efficiently.

The natural 1/f amplitude spectrum has also been considered a potential factor that drives aesthetic appeal. This idea has been reinforced by findings that a wide range of artworks from different genres, periods and geographical origins share remarkable commonalities in their spatial image structure, mirroring that found in natural scenes (Redies 2015; Redies et al. 2007a, b). However, in order to eliminate the potentially confounding effects of semantics attached to artworks, a number of studies have also used synthetic filtered noise images with parametrically varying amplitude spectra characteristics. Examples of synthetic images with parametric variations in their amplitude spectrum slope (α) are shown in Fig. 3. Spehar et al. (2016) presented these synthetic images in a two-alternative forced-choice (2AFC) paired comparison task, in which subjects indicated which of two images of different α they preferred. Repeated variations of this procedure have found a robust average preference for images with intermediate amplitude spectrum slope values that are most characteristic of natural scenes (Spehar et al. 2003, 2015, 2016; Viengkham and Spehar 2018).

Fig. 3

Synthetic 1/f noise images with α falloffs at the values indicated in each inset

Explanations as to why there is such a robust visual preference for images within the natural statistics range are diverse and allude to notions such as an inherent universal preference for intermediate complexity (Berlyne 1970, 1971), processing fluency (Reber et al. 2004), and an evolved sensitivity to the low-level statistics of natural scenes (Spehar et al. 2015; Isherwood et al. 2017). Different theoretical accounts notwithstanding, it is also interesting to consider the somewhat neglected question of which aspects of variations in the amplitude spectrum could be driving the aesthetic preference for these images. Namely, the decomposition of an image into its spatial frequency components through Fourier analysis is unselective in that everything in the stimulus is transformed and then reconstructed. However, while it is often assumed that visual processing is generally optimised to use all available cues to understand the structure of the environment, it is highly unlikely that everything in a given spatial frequency range is equally informative. While the initial focus of our own work (Spehar et al. 2015, 2016) and that of others (Graham and Field 2007; Graham and Redies 2010) was the overall photometric characteristics of images, it is possible that some other scale-specific spatial variations could be playing a role as well.

3 Photometric Versus Geometric Properties of Natural Scenes

The scale invariance of natural scenes can also be captured by a geometric scaling parameter known as the fractal dimension (D), which focuses on the boundary edge between the paint-filled regions and empty regions in an image. One frequently used method to quantify fractal dimension, the box-counting technique (Fig. 4), performs the scaling examination by covering an image with a mesh of identical squares (‘boxes’) of varying side lengths (L). The technique simply counts the number of squares, N, that contain part of the boundary edge. This count is repeated for increasingly small squares within the mesh. Reducing the box size (i.e. using smaller values of L) is equivalent to examining the image at finer spatial scales, and N assesses the amount of space containing the pattern boundaries at these spatial scales. The scale invariance of the fractal pattern appears through the power law relationship N ~ (1/L)^D (where the exponent D is the fractal dimension) and is quantified by plotting log N as a function of log (1/L), with D given by the slope of the resulting line. While a detailed comparison of the two techniques can be found elsewhere (Bies et al. 2016; Fairbanks and Taylor 2011; Spehar and Taylor 2013), the amplitude spectrum slope α and the fractal dimension D are inversely related: a higher α value corresponds to a lower D value and vice versa.

Fig. 4

Demonstration of the box counting technique at three different values of L. As L decreases from left to right, the number of boxes (N) needed to measure the length of a boundary edge increases following a power law relationship defined by D
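The box-counting procedure described above can be illustrated with a minimal Python sketch. It is not the implementation used in the studies cited; the function names (`box_count`, `fractal_dimension`), the choice of box sizes and the two test patterns are assumptions made for illustration. The sketch counts occupied boxes at each side length L and estimates D as the least-squares slope of log N against log (1/L).

```python
import math

def box_count(grid, box_size):
    """Count boxes of side `box_size` containing at least one marked pixel.

    `grid` is a list of rows holding 1 (boundary pixel) or 0 (empty).
    """
    occupied = set()
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value:
                occupied.add((r // box_size, c // box_size))
    return len(occupied)

def fractal_dimension(grid, box_sizes):
    """Estimate D as the slope of log N versus log(1/L).

    Assumes the grid contains at least one marked pixel, so that N > 0.
    """
    xs = [math.log(1.0 / size) for size in box_sizes]
    ys = [math.log(box_count(grid, size)) for size in box_sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator

# Two illustrative (hypothetical) patterns on a 64 x 64 grid.
size = 64
box_sizes = [1, 2, 4, 8, 16, 32]
straight_line = [[1 if r == 32 else 0 for _ in range(size)] for r in range(size)]
filled_plane = [[1] * size for _ in range(size)]

d_line = fractal_dimension(straight_line, box_sizes)
d_plane = fractal_dimension(filled_plane, box_sizes)
print(round(d_line, 3), round(d_plane, 3))
```

A straight line and a completely filled plane mark the two limiting cases (D = 1 and D = 2); fractal contours such as those in the thresholded and edge images fall between them.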

It is important to emphasise that fractal dimension calculations are always performed by considering the degree of spatial variation along the edges of binarized, black-and-white regions in an image. Therefore, in order to apply a box-counting procedure to the synthetic images varying in their amplitude spectra characteristics, as illustrated in the first column in Fig. 5, these images are first thresholded with respect to their mean luminance. During the thresholding procedure, all pixels that have a value higher than the mean luminance are assigned as white and all pixels with values below the mean are assigned as black, resulting in the images shown in the second column in Fig. 5. Edge-only image variations are generated by extracting the edges from the thresholded black-and-white images, as illustrated in the third column in Fig. 5.

Fig. 5

Variations of 1/f synthetic noise images. From left to right columns, this figure shows the original grayscale image, the thresholded (or two-tone) image, the edge extracted image and the 3D terrain model. All images increase in their measured fractal dimension (D) from top to bottom
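The thresholding and edge-extraction steps can be sketched in a few lines of Python. The text does not specify the edge-extraction algorithm or how pixels exactly equal to the mean are treated, so both choices below are assumptions: pixels equal to the mean are assigned to black, and an edge pixel is defined as a white pixel with at least one black 4-connected neighbour. The function names and the toy image are hypothetical.

```python
def threshold_at_mean(image):
    """Binarize a grayscale image (list of rows) at its mean luminance.

    Pixels above the mean become 1 (white), all others 0 (black).
    Assigning pixels equal to the mean to black is an assumption.
    """
    pixels = [value for row in image for value in row]
    mean = sum(pixels) / len(pixels)
    return [[1 if value > mean else 0 for value in row] for row in image]

def extract_edges(binary):
    """Mark white pixels that touch a black pixel (4-connectivity).

    This boundary rule is an illustrative assumption, not the
    procedure used to create the published stimuli.
    """
    n_rows, n_cols = len(binary), len(binary[0])
    edges = [[0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            if not binary[r][c]:
                continue
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                inside = 0 <= rr < n_rows and 0 <= cc < n_cols
                if inside and binary[rr][cc] == 0:
                    edges[r][c] = 1
                    break
    return edges

# Toy 4 x 4 image: dark left half, bright right half.
toy_image = [[10, 10, 200, 200] for _ in range(4)]
toy_binary = threshold_at_mean(toy_image)
toy_edges = extract_edges(toy_binary)
```

In this toy case only the column of white pixels adjacent to the dark half is marked as an edge, giving a single vertical contour.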

Conventionally, the fractal-like variations in grayscale images, due to their surface-texture appearances, are referred to as ‘two-dimensional’ (2D) fractals, whereas the thresholded and edge patterns are referred to as ‘one-dimensional’ (1D) fractals because their fractal characteristics are determined by variations in fractal contours. In addition, the same type of fractal variations can be extended to additional dimensions as illustrated with the fractal terrains depicted in the last column in Fig. 5. Here the intensity values of pixels in the grayscale images are plotted as the terrain height, creating three-dimensional (3D) surfaces.

It is important to emphasise that the thresholding, edge extraction and height transformation procedures alter the measured photometric and amplitude spectrum slope values of the corresponding derived images. However, their geometrical, fractal-scaling properties remain essentially identical, making these image types and image transformation procedures well-suited for investigating the perception and aesthetics of fractal patterns (Isherwood et al. 2017; Taylor and Spehar 2016). Indeed, because the D value charts the ratio of coarse-to-fine geometrical structure in a pattern, it can be considered a powerful and generic measure of the visual complexity generated by repeating patterns. Previous studies have consistently found that the perceived complexity of natural images, paintings and synthetic patterns increases linearly with greater measured D values (Bies et al. 2016; Spehar et al. 2003; Viengkham and Spehar 2018).

4 Multisensory Fractal Aesthetics

While the role of fractal statistics in vision has generated considerable attention, their potential role in other sensory domains is a question of both theoretical and applied interest. For example, in the auditory domain, it has been found that melodies generated such that their notes follow a 1/f^α distribution with specifically defined α values close to the intermediate, ‘natural’, range are regarded as the most melodic and music-like compared to melodies with a random distribution (Beauvois 2007; Voss and Clarke 1978). Similarly, random white noise adjusted with a 1/f^α function enables the production of different ‘colours’ of noise. Pink noise is created when α is close to 1 and a greater proportion of power is distributed to lower frequencies, resulting in a sound similar to many natural phenomena, like crashing waves and rain (Gardner 1978). Although these results suggest a similar role of fractal-scaling characteristics across different senses, this question remains underexplored without direct comparisons between specific sensory domains.

In particular, the role of fractal-scaling characteristics has been under-explored in the tactile modality. This is, to some extent, paradoxical given that 3D surfaces rendered with fractal-scaling characteristics are known to produce remarkably accurate and convincing models of natural textures and forms (Pentland 1984). While some studies have used 3D-like models of textures and terrains, these were typically viewed on computer monitors and never engaged physical interaction (Padilla et al. 2008; Pentland 1984; Spehar et al. 2016). However, much like vision and sound, aesthetic experiences can also be induced tactilely, such as in the sensation of pleasant touch (Juricevic 2009). Tactile input requires an active and intimate exploration between the perceiver and stimuli, and can induce strong emotional experiences that subsequently shape the perceiver’s affect and aesthetic experience (Gallace and Spence 2011). Given the promise of fractal-based approaches in both visual and auditory modalities, extending these methods to our sense of touch is a logical advance in empirical aesthetics. This is further facilitated by the recent advances in, and accessibility of, 3D printing technology, which enables the generation of fractal surfaces with great accuracy and in a variety of materials.

5 Current Study

In the current study, we sought to investigate aesthetic preferences in visual and tactile sensory modalities using stimuli generated with equivalent fractal-scaling statistics. We aimed to extend the findings of previous investigations that have tended to consider preferences for different types of synthetic fractal images in isolation from each other and, almost exclusively, only in the visual modality. We used a parametric manipulation of the fractal-scaling characteristics in a variety of one-dimensional (edges only and two-tone images), two-dimensional (grayscale) and three-dimensional (3D printed surfaces) patterns to directly compare the generalizability of aesthetic qualities of fractal surfaces and their similarity across the visual and tactile sensory domains. By keeping the fractal-scaling characteristics identical across a range of distinct and visually dissimilar image types (grayscale, two-tone black and white, edges only, three-dimensional terrains), we explore the potential of fractal geometry as a structural primitive for the aesthetic experience in visual and tactile sensory modalities.

While the correspondence between vision and touch has been shown through the parallels between tactile and visual illusions (Geldard and Sherrick 1972), cross-modal associations (Spence 2011) and cross-modal integration (Woods and Newell 2004; Lunghi et al. 2010; Lunghi and Alais 2013), these studies do not directly address the cross-modal correspondence regarding the experienced perceptual qualities between the two modalities. A rare exception is the work by Albertazzi et al. (2016), who have demonstrated a cross-modal association between specific tactile perception attributes, such as warm-cold, smooth-rough, lightweight-heavy and soft-hard, and the visual experience of abstract paintings. Our aim is to further explore the aesthetic geometries of vision and touch with a choice of fractal scaling as a relational, spatial-scale bound, geometrical primitive.

6 Method

6.1 Design

The experiment used a within-subjects design with stimulus type (grayscale, thresholded, edges, tactile) and level of fractal dimension (D) as the independent variables. Preference for different levels of fractal dimension was measured separately within each stimulus type using a two-alternative forced-choice, paired-comparison design.

6.2 Participants

A total of 51 undergraduate psychology students (66.67% female, mean age = 19.86, SD = 5.89) participated in the study in exchange for course credit. Participants gave informed written consent and all experimental procedures were approved by the UNSW Human Research Ethics Advisory Panel (#2660).

6.3 Materials and Apparatus

6.3.1 Visual Stimuli

All variations of visual stimuli used in this experiment are shown in Fig. 6. Grayscale images were generated in MATLAB by creating a 512 × 512 grid of random pixels (with values between 0 and 255) selected from a Gaussian distribution. A fast Fourier transform was then performed to create a series of amplitude spectra at nine different levels of α falloff (α = .5, .75, 1.0, 1.25, 1.5, 1.75, 2.0, 2.25 and 2.50). An inverse Fourier transform applied each amplitude spectrum to the 512 × 512 Gaussian noise image, resulting in images possessing the specific desired α values. To obtain the thresholded variations, the same grayscale images were bisected at the mean luminance value: all pixels below the mean luminance value became one solid dark colour and everything above it became a solid lighter colour.

Fig. 6

Examples of the grayscale, thresholded and edge variations of the synthetic fractal images. Respective input amplitude spectrum slopes (α) and average measured fractal dimensions (D) are displayed in the bottom two rows

Edges only images were created from the thresholded images by an edge extraction procedure, resulting in solid light lines on a dark background for the edge image variations. For each amplitude spectrum slope listed above, the respective fractal dimension values of the images are D = 1.97, 1.95, 1.82, 1.65, 1.41, 1.20, 1.10, 1.05 and 1.01. Mean brightness and RMS contrast for all images were controlled at 126 and 0.15, respectively.
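The generation of the grayscale seed images can be approximated with the following NumPy sketch. The original stimuli were produced in MATLAB and the exact code is not given, so several details here are assumptions: the phase spectrum of Gaussian noise is kept while a 1/f^α amplitude spectrum is imposed, the zero-frequency (DC) term is removed before rescaling, and RMS contrast is defined as the standard deviation of pixel values divided by their mean. The function name `make_noise_image` and its parameters are hypothetical.

```python
import numpy as np

def make_noise_image(alpha, size=512, mean=126.0, rms_contrast=0.15, seed=0):
    """Synthesise a 1/f^alpha noise image from Gaussian white noise."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal((size, size))

    # Keep only the phase of the noise spectrum.
    spectrum = np.fft.fft2(noise)
    phase = np.exp(1j * np.angle(spectrum))

    # Radial spatial frequency of every FFT bin.
    freqs = np.fft.fftfreq(size)
    radius = np.sqrt(freqs[None, :] ** 2 + freqs[:, None] ** 2)
    radius[0, 0] = 1.0              # placeholder to avoid dividing by zero
    amplitude = 1.0 / radius ** alpha
    amplitude[0, 0] = 0.0           # drop the DC term; the mean is set below

    # Impose the target amplitude spectrum and return to the image domain.
    image = np.real(np.fft.ifft2(amplitude * phase))

    # Rescale to the target mean and RMS contrast (assumed to be std / mean).
    image = (image - image.mean()) / image.std()
    image = mean + image * rms_contrast * mean
    return np.clip(image, 0.0, 255.0)

# The nine slope values listed in the text, on small images for speed.
alphas = [0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0, 2.25, 2.5]
images = {alpha: make_noise_image(alpha, size=64) for alpha in alphas}
```

Because the amplitude at each frequency is set directly, the ratio of amplitudes at two frequencies depends only on α; for example, with α = 1 the amplitude at one cycle per image is twice that at two cycles per image.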

6.3.2 Tactile Stimuli

Models of the 3D tactile stimuli were generated in MATLAB following the three steps illustrated in Fig. 7. In the first step, similar to the visual stimuli, a series of 1/f noise images varying in amplitude spectra characteristics were created (Fig. 7, top row). Seven different amplitude spectrum slope values were used (α = 1.25, 1.50, 1.75, 2.00, 2.25, 2.50 and 2.75) and RMS contrast for all images was controlled at .15. In order to create 3D solids (Fig. 7, middle row), the tonal value of each pixel in the image was translated to a specific height in three-dimensional space (height increased as the luminance of the pixel increased towards 255). These models were 3D printed onto one face of a 10 × 10 × 1 cm hard, synthetic block (Fig. 7, bottom row). The equivalent estimated D values of these surfaces are 2.82, 2.65, 2.41, 2.20, 2.10, 2.05 and 2.01, respectively (based on the measured D of the initial grayscale image).

Fig. 7

Images of grayscale images (top row), computer-generated 3D solids (middle row) and the final 3D printed tactile surfaces (bottom row). αi is the amplitude spectrum slope used to create the corresponding original 1/f amplitude spectrum grayscale image, D2D is the fractal dimension of the grayscale image and D3D is the approximate dimension of the final 3D tactile surface

All visual stimuli were presented on a 22.5-inch VIEWPixx LCD screen. Tactile stimuli were presented on a tabletop behind an adjustable metal occluder.

6.4 Procedure

Following the collection of written consent, participants were seated and given a brief set of verbal instructions. The order in which visual or tactile stimuli were presented was pseudorandomised such that an equal number of participants received one or the other first. We used a two-alternative forced-choice (2AFC) method to measure preference across the different levels of fractal dimension within all stimulus types. Visual stimuli (grayscale, thresholded and edge images) were presented via custom MATLAB code and viewed on a monitor adjusted to the participant’s eye-level and situated approximately 50 cm away (subtending a visual angle of 15°). The room lighting was dimmed during the completion of the visual portion of the experiment. For each image type, participants were shown two images of different fractal dimension values side-by-side and instructed to indicate which they preferred. Choice was indicated via presses of the appropriate response keys on a keyboard and there was no set time limit. Given the 9 unique levels of fractal dimension, a total of 72 paired judgements were made for each variation of the visual stimuli. This resulted in 216 trials for the visual portion of the experiment.

Tactile stimuli were presented manually by the experimenter on a tabletop and behind an occluder. Throughout the session, participants were not able to see the stimuli but were instructed to reach behind the occluder and explore pairs of textures with their hands. The preferred texture out of the pair was pushed forward and recorded. Each pair was kept behind the occluder until a response was given, after which the stimuli were withdrawn by the experimenter and placed in a box hidden from participants’ view. This procedure was repeated until all 42 paired comparison trials (given the 7 levels of fractal dimension) were completed.

In sum, participants evaluated 258 paired trials over the course of the experimental session, which took approximately 30 min to complete.
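The trial counts reported above can be checked with a short calculation. The counts of 72 and 42 match the number of ordered pairs of distinct levels (9 × 8 and 7 × 6), which suggests that each pair was presented in both positions; this reading is an inference from the numbers rather than something stated explicitly in the procedure.

```python
from itertools import permutations

def n_ordered_pairs(n_levels):
    """Number of ordered pairs of distinct levels, i.e. n * (n - 1)."""
    return len(list(permutations(range(n_levels), 2)))

visual_per_type = n_ordered_pairs(9)      # one visual image type
visual_total = 3 * visual_per_type        # grayscale, thresholded, edges
tactile_total = n_ordered_pairs(7)        # tactile surfaces
session_total = visual_total + tactile_total

print(visual_per_type, visual_total, tactile_total, session_total)
```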

7 Results

7.1 Analysis of Population Preferences

7.1.1 Preferences for 1/f Noise Images

Preference was indexed by the proportion of times a stimulus was chosen when it was presented. Average preference as a function of fractal dimension for each variation of visual stimuli is shown in Fig. 8.

Fig. 8

Average preferences as a function of fractal dimension for the three variations of visual stimuli (grayscale, thresholded and edges only)
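As an illustration of the preference index defined above (the proportion of times a stimulus was chosen out of the times it was presented), the following Python sketch scores a small set of 2AFC trials. The trial format, the function name `preference_scores` and the example responses are hypothetical and are not taken from the experimental data.

```python
from collections import defaultdict

def preference_scores(trials):
    """Proportion of presentations on which each level was chosen.

    `trials` is an iterable of (option_a, option_b, chosen) tuples,
    where `chosen` must equal one of the two options.
    """
    shown = defaultdict(int)
    chosen_count = defaultdict(int)
    for option_a, option_b, chosen in trials:
        if chosen not in (option_a, option_b):
            raise ValueError("chosen stimulus was not part of the pair")
        shown[option_a] += 1
        shown[option_b] += 1
        chosen_count[chosen] += 1
    return {level: chosen_count[level] / shown[level] for level in shown}

# Hypothetical responses for three D levels, each pair shown in both orders.
example_trials = [
    (1.20, 1.97, 1.20),
    (1.97, 1.20, 1.20),
    (1.20, 1.01, 1.01),
    (1.01, 1.20, 1.20),
    (1.01, 1.97, 1.01),
    (1.97, 1.01, 1.01),
]
example_scores = preference_scores(example_trials)
```

Computed this way, each participant contributes one score per level of D within each stimulus type, and these scores are the values entered into the group-level analyses.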

A repeated-measures analysis of variance (ANOVA) revealed no main effect of image type on preference, but a significant main effect of fractal dimension, F(1.846, 8.710) = 4.719, p = .002, and a near-significant image type × fractal dimension interaction on preference, F(3.484, 174.208) = 2.398, p = .060. When averaged over image type, preference was greatest for images with intermediate fractal dimension values of 1.20 (M = .601, 95% CI [.566, .636]) and 1.41 (M = .601, 95% CI [.558, .644]). Preference scores for both 1.20 and 1.41 D images were significantly greater than those with D values of 1.01 (M = .459, 95% CI [.373, .545]), 1.05 (M = .488, 95% CI [.418, .558]), 1.82 (M = .356, 95% CI [.435, .551]), 1.95 (M = .418, 95% CI [.356, .481]) and 1.97 (M = .346, 95% CI [.274, .419]).

These results support previous findings showing a robust curvilinear preference function across fractal dimension, as well as the similarity of these preference outcomes across different image types (Spehar and Taylor 2013; Spehar et al. 2016).

7.1.2 Preferences for 1/f Tactile Surfaces

Average preference for tactile stimuli across each level of fractal dimension is shown in Fig. 9. A repeated-measures ANOVA revealed a significant main effect of fractal dimension on tactile preference, F(1.805, 8.507) = 44.783, p < .01. Average preference scores decreased linearly with increasing D, with preference at each higher value of D generally being significantly lower than at the previous one. Preference scores were highest for stimuli with D values of 2.01 (M = .783, 95% CI [.712, .853]) and 2.05 (M = .719, 95% CI [.653, .785]). Both of these stimuli were preferred significantly more than those at all other levels of D (p < .01); however, they did not differ significantly from each other (p = .116). Overall, participants preferred surfaces that were smoother, and this preference decreased linearly as surfaces became rougher.

Fig. 9

Average preferences as a function of fractal dimension for tactile stimuli

It is obvious that the average preference for fractal-scaling variations with real three-dimensional surfaces does not follow an inverted-U shape, as was observed with the visual patterns. However, based on the present data, there is not enough evidence to claim that the function relating fractal-scaling variations and preference is qualitatively different across the two sensory domains. Namely, while the experimental stimuli produced to investigate the preference for fractal-scaling variations across the two domains are qualitatively similar, it is impossible to be certain that they span exactly the same range of fractal-scaling variations in the two domains. One must note that while they shared similarities in the amplitude spectral characteristics of the seed grayscale stimuli, we did not compare the extent to which the stimuli across the two domains were similar in their perceived roughness or complexity. In other words, it is possible that the range of perceived variations in complexity or roughness differed between the visual and tactile domains. In particular, the surface with the lowest fractal dimension in the tactile domain was far from being completely smooth or flat, making the range of fractal-scaling variations more compressed in the tactile domain, spanning from very rough to only intermediate roughness.

7.2 Latent Dimensional Structure in Visual and Tactile Preferences

7.2.1 Visual Preference Factors

While the average preference functions between vision and touch differed, similar latent variables could underlie how fractal-scaling variations affect preferences in both stimulus modalities.

To examine this, we performed a separate principal component analysis (PCA) with Varimax rotation for each variation of the visual stimuli (Fig. 10). Across all three image types, two major factors with eigenvalues greater than 1 emerged. For grayscale images, the two factors accounted for a cumulative 90.99% of preference variance. Factor 1 characterised a curvilinear component with strong positive loadings on D values of 1.65 and 1.82 and strong negative loadings on values of 1.01 and 1.05. Factor 1 can be described as a component that captured the intermediate-simple dimensions of preference and accounted for 45.57% of preference variance. Factor 2 also characterised a curvilinear function, but with strong positive loadings at 1.10, 1.20 and 1.41, and strong negative loadings at 1.95 and 1.97. It can be described as a component that captures the intermediate-complex dimensions of preference and accounted for approximately 45.42% of total preference variance. The nature of the factors extracted from both the thresholded and edge image variations was remarkably similar to that of the grayscale factors. Two factors with eigenvalues greater than 1 were extracted, which together accounted for 87.35% and 86.57% of total preference variance for thresholded and edge-only images, respectively (Fig. 10). Furthermore, the extracted factors in both image variations can be described as characterising an intermediate-simple and an intermediate-complex preference component.

Fig. 10

Loadings for each level of fractal dimension as labelled by Factor 1 and 2 of a PCA for grayscale, thresholded, edge and tactile stimuli

7.2.2 Tactile Preference Factors

We performed a principal components analysis with Varimax rotation to examine the latent dimensions in tactile preferences as well. Two major factors with eigenvalues greater than 1 emerged for tactile preferences. Together, the two factors accounted for 86.85% of total preference variance. Factor 1 captured the rough-smooth components with strong positive loadings on D values 2.65 and 2.82, and strong negative loadings on D values of 2.01 and 2.05. Factor 1 corresponded to preference driven primarily by whether the surface was rough or smooth and accounted for 58.71% of preference variance. Factor 2 characterised a curvilinear function with strong positive loadings on D values of 2.20 and strong negative loadings on values of 2.65 and 2.82. This corresponded closer to an intermediate-rough dimension of preference patterns and accounted for 28.14% of preference variance.
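The analysis pipeline used in this section (PCA on the preference scores, retention of components with eigenvalues greater than 1, and Varimax rotation of the retained loadings) can be sketched with NumPy as follows. This is not the software used for the reported analyses; the function names, the convergence settings and the simulated two-factor data are all assumptions for illustration, and the Varimax routine is a generic SVD-based implementation.

```python
import numpy as np

def pca_loadings(scores):
    """Unrotated loadings of components with eigenvalue > 1.

    `scores` is a participants x levels array of preference scores.
    Returns (loadings, all eigenvalues sorted in descending order).
    """
    corr = np.corrcoef(scores, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0
    return eigvecs[:, keep] * np.sqrt(eigvals[keep]), eigvals

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal Varimax rotation of a loadings matrix."""
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        column_ss = np.diag((rotated ** 2).sum(axis=0))
        target = rotated ** 3 - rotated @ column_ss / n_vars
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break
        criterion = new_criterion
    return loadings @ rotation

# Simulated (hypothetical) data: 51 participants, 9 levels, 2 latent factors.
rng = np.random.default_rng(0)
n_participants, n_levels = 51, 9
factor_1 = rng.standard_normal(n_participants)
factor_2 = rng.standard_normal(n_participants)
pattern_1 = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0], dtype=float)
pattern_2 = 1.0 - pattern_1
sim_scores = (np.outer(factor_1, pattern_1) + np.outer(factor_2, pattern_2)
              + 0.1 * rng.standard_normal((n_participants, n_levels)))

sim_loadings, sim_eigenvalues = pca_loadings(sim_scores)
sim_rotated = varimax(sim_loadings)
# Proportion of total variance attributed to each rotated factor.
sim_explained = (sim_rotated ** 2).sum(axis=0) / n_levels
```

Because the rotation is orthogonal, it redistributes variance between the retained factors without changing how much of each level's variance they jointly explain, which is why the cumulative percentages reported above are unaffected by the rotation itself.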

7.3 Intra-Individual Stability of Preferences Across Stimuli

7.3.1 Correlational Analyses Between Variations of Visual Stimuli

To investigate the stability of individual preferences, we calculated the Pearson correlation coefficient between preference scores on the three sets of visual stimuli for each participant. The distribution of individual correlation coefficients between visual stimulus pairs is shown in Fig. 11. On average, individual preference was positively correlated across all stimulus types. Preferences between thresholded and edge variations showed the greatest intraindividual stability, Mr = .695, 95% CI [.605, .785]. Furthermore, correlation coefficients between thresholded and edge images showed that 39.21% of subjects had preference correlations of .8 or higher and 74.5% had correlations of .4 or higher. More modest positive correlations were also found between grayscale and edge images, Mr = .194, 95% CI [.010, .379], as well as between grayscale and thresholded images, Mr = .230, 95% CI [.041, .419].

Fig. 11

Left figure is a box-and-whiskers plot showing the distribution of correlations between the three variations of visual stimuli (grayscale [GS], thresholded [TH] and edges [ED]). + indicates the mean correlation coefficient of each distribution. Right figure shows the frequency distribution of individual pairwise correlations across all pairs of visual stimuli
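The intra-individual analysis described above amounts to computing one Pearson correlation per participant between that participant's preference profiles for two stimulus types, and then summarising the distribution of these coefficients. A minimal Python sketch follows; the function names and the two example participants are hypothetical and do not reproduce the reported values.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def individual_correlations(profiles_a, profiles_b):
    """One correlation per participant, plus their mean.

    Each argument is a list of per-participant preference profiles
    (one score per level of D), aligned by participant and by level.
    """
    rs = [pearson_r(a, b) for a, b in zip(profiles_a, profiles_b)]
    return rs, sum(rs) / len(rs)

# Two hypothetical participants with three levels of D each.
thresholded = [[0.2, 0.5, 0.8], [0.3, 0.5, 0.7]]
edges = [[0.1, 0.4, 0.7], [0.7, 0.5, 0.3]]
example_rs, example_mean_r = individual_correlations(thresholded, edges)
```

In this toy example the first participant's profiles agree perfectly and the second participant's are exactly reversed, so the mean correlation is zero even though neither individual is close to zero; this is why the full distribution of coefficients is reported alongside the mean.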

7.3.2 Correlational Analyses Between Visual and Tactile Preferences

As equal numbers of data points for each stimulus type are required for the correlational analysis, the data for the two lowest D levels (1.01 and 1.05) of the visual stimuli were omitted. The omission of these particular values was intentional. Based on the average preference data, the perceived correspondence between visual and tactile stimuli increased as D increased. In contrast, as D decreased in both domains, the variations in roughness at each level of D asymptote at a higher level in the tactile stimuli compared to the visual stimuli. That is, while the visual stimuli at the lowest level of D appear quite simple, the tactile surfaces still maintain a relatively bumpy and irregular appearance. By omitting the scores from the two lowest levels of D in the visual stimuli, the stimuli across the visual and tactile domains were more perceptually matched. Pearson correlation coefficients were then calculated between visual and tactile preference scores for each individual. The distribution of individual correlation coefficients is shown in Fig. 12. Overall, correlations between tactile and visual stimuli were positive but much weaker compared to correlations between each of the visual variations. Average correlations were the greatest between tactile and thresholded images, Mr = .334, 95% CI [.152, .516], and tactile and edge images, Mr = .267, 95% CI [.069, .465]. Average individual correlations between tactile and grayscale images were the lowest of the three, Mr = .135, 95% CI [−.073, .343].

Fig. 12

Left figure is a box-and-whiskers plot showing the distribution of correlations between each of the visual stimuli (grayscale [GS], thresholded [TH] and edges [ED]) and the tactile stimuli [T]. + indicates the mean correlation coefficient of each distribution. Right figure shows the frequency distribution of individual pairwise correlations

8 Discussion

The current study examined average and individual aesthetic preferences for variations in fractal-scaling characteristics across two modalities: vision and touch. In the visual modality, three variants of 1/f synthetic noise images were used: one-dimensional thresholded and edge images and two-dimensional grayscale images. The tactile modality was assessed using three-dimensional variations of the 2D grayscale images, which were converted into depth maps and printed as physical three-dimensional surfaces. Variations in the fractal structure for both stimulus domains were achieved by varying the input amplitude spectrum slope used to generate the seed grayscale synthetic noise images. Consequently, the fractal dimension of the seed 1/f noise patterns and their subsequent variations were similarly parametrised.

The primary aim of our study was to determine which fractal-scaling characteristics across the one-, two- and three-dimensional stimuli were considered the most aesthetically appealing and whether the underlying dimensional structure of preference patterns was consistent across the two examined sensory modalities. We found that, for the visual one- and two-dimensional stimuli, images with measured D values in the intermediate 1.20–1.41 range were preferred over those with both higher and lower D values. This curvilinear preference persisted across all three of the image variations and was consistent with previous findings (Spehar et al. 2003, 2016; Spehar and Taylor 2013).

On the other hand, average preferences for the three-dimensional tactile stimuli followed a linear pattern. Preference decreased linearly with increasing fractal dimension values, and smooth surfaces with 2.01D and 2.05D were the most preferred. This is consistent with findings in both the visual and tactile domains that smooth contours are preferred over sharp contours (Bar and Neta 2007) and that increases in tactile roughness typically lead to a decrease in perceived pleasantness (Etzi et al. 2014, 2018). It is proposed that smoother textures are regarded as less threatening, more comforting and generally more pleasant to the touch.

We also examined the consistency of fractal-scaling preferences across the three dimensions of stimuli, and the two sensory modalities of vision and touch. Previous research has found a great degree of individual consistency in preference between the one-dimensional image variants (thresholded and edge images), as well as between one-dimensional images and two-dimensional grayscale images (Spehar and Taylor 2013; Spehar et al. 2016). Here again, we found remarkable similarities in the latent dimensional structure extracted across all variations of the visual stimuli, as well as positive average correlations of intraindividual preferences.

Whereas both one- and two-dimensional stimuli were perceived in the visual domain, the inclusion of three-dimensional variations required the engagement of active tactile perception. Intraindividual preference correlations between tactile and visual stimuli were positive, but not as strong as correlations between some of the visual variations. This finding is expected as both the average shape and main factor extracted from tactile preferences primarily supported a linear, smooth-rough relationship between fractal-scaling and aesthetic preference.

However, this does not necessarily rule out the notion that the visual and tactile domains have a common base in how fractal-scaling statistics influence aesthetic perception. Indeed, it is possible that the range of variations in complexity and roughness differed between the visual and tactile domains; in other words, the stimuli in the two sensory domains were not cross-modally matched. Low D tactile stimuli still maintained a slightly rough and irregular surface, leading to a more compressed range of variations in the tactile domain compared to the visual. Furthermore, the results from our principal components analysis indicated that a small but significant proportion of 1/f surface preference variance was driven by an intermediate-polar component. While a majority of subject preferences are distinguished by differences between surfaces along the smooth-rough dimension, there were also subjects whose preference patterns are better accounted for by whether the surface is intermediate in roughness or at the polar ends of the smooth-rough spectrum. Assuming that surfaces within the intermediate D range are reflective of the most ‘natural’ textures, this could be suggestive of a natural-artificial factor for preference. However, the precise nature of the specific textural qualities that subjects draw upon to make their preference judgments remains unclear. Additionally, interindividual differences in aspects like tactile sensitivity and frequency of exposure to natural real-world surfaces may also play a part in determining preference.

Fractal-scaling statistics, particularly those of a geometric basis, have a clear and established role in the aesthetic perception of visual stimuli. For both one-dimensional fractal patterns and two-dimensional noise images, measures like fractal dimension reliably correspond to the perceived complexity and aesthetic judgements of these visual stimuli. Presently, the same conclusions cannot yet be drawn regarding the tactile perception of three-dimensional fractal surfaces. While preferences in the visual and tactile sensory domains correlated positively and shared similarities in the underlying dimensions extracted from their fractal-scaling and preference functions, the robustness of fractal-scaling statistics across modalities requires further testing and clarification. While the current study provides a novel, exploratory investigation into the commonalities between the sensory domains of vision and touch, it raises several pertinent questions regarding how the perception of fractal-scaling statistics may be equated across the two modalities and whether the stimuli we utilised were a sufficient representation of the entire possible range. It is conceivable that the D ranges used for the visual and tactile stimuli in the experiment were not ideally matched. For example, the lowest D values for the tactile surfaces may not be perceptually equivalent to the lowest D images, and likewise for the high D variations. If the D range were extended to include even smoother or even rougher textures, preferences across modalities might be more congruent across the sample and within individuals. Nevertheless, our results suggest that fractal scaling is a common aesthetic primitive in the visual and tactile sensory domains.