Introduction

The past decades have witnessed continued interest in the question of the effects of unconscious visual stimuli on human sensory perception and behavior (Erdelyi, 1974; Eriksen, 1960; Dixon, 1971; Holender, 1986; Kihlstrom, 1987; Merikle & Daneman, 1998; Logothetis, 1998; Breitmeyer, 2015). Attempts to answer that question have deployed a variety of strategies to manipulate visual awareness, including visual masking, attentional distraction, ambiguous figural information, motion-induced blindness, and visual crowding, to name a few (see reviews by Kim & Blake, 2005; Lin & He, 2009). Paramount among those strategies has been binocular rivalry (BR), a very popular technique for transiently abolishing visual awareness (for an overview of work on and ideas about BR, see reviews by Walker, 1978; Wolfe, 1986; Alais & Blake, 2005, 2015; Blake et al., 2014). During BR, a visual stimulus viewed by one eye is temporarily suppressed from awareness by simultaneous presentation of a dissimilar stimulus viewed by the other eye. Variants of BR that entail interocular suppression have also been introduced, including flash suppression (Wolfe, 1984), eye-swap BR (e.g., Logothetis et al., 1996), and generalized flash suppression (Wilke et al., 2003).

A new, highly potent form of interocular suppression, dubbed continuous flash suppression (CFS), burst on the scene 17 years ago (Fang & He, 2005; Tsuchiya & Koch, 2005), gaining pre-eminence as a tool for manipulating awareness (see Fig. 1A). In its original instantiation, CFS was induced by presenting to one eye a dynamic sequence of complex, colorful images (successive arrays of Mondrian-like geometric figures) updated at a rate of 10 Hz, pitted against a single, smaller, weaker target image presented to the corresponding retinal area of the other eye. The list of stimuli that successfully generate CFS has since expanded to include dynamic arrays comprising random sequences of binary visual noise (Fang & He, 2005), random sequences of natural scene images (Kim et al., 2017), pointillist natural images (Cha et al., 2019), overlapping letters (Eo et al., 2016), and temporally and spatially filtered noise images (Han et al., 2016; Han & Alais, 2018). Regardless of the type of image sequence used to induce CFS, this form of interocular competition has the potential to suppress the target stimulus from awareness for many seconds at a time. Indeed, one of the strengths of CFS is the prolonged duration of visual suppression it produces: up to several tens of seconds, compared to just several seconds with BR (Blake et al., 2019). Another benefit is the ability to dictate the initially dominant percept, which is always the dynamic sequence. This control over awareness at onset allows immediate assessment of the impact of the suppressed target viewed by the other eye (e.g., Rothkirch et al., 2012). In addition, CFS tends to uniformly suppress relatively large targets, whereas during BR large stimuli tend to be experienced in piecemeal fashion, comprising intermingled portions of each eye’s view (Blake et al., 1992).

Fig. 1

Summary of CFS publications. To assess impact, we compiled a database of abstracts, dissertations, review articles and empirical studies from the period 2004 to 2019. Publications were extracted from Google Scholar and were only included if CFS was a main topic or was used as a technique. A The number of CFS publications per year. There is a monotonic increase in the number of CFS publications per year, projected to reach approximately 100 CFS publications within the next 2–3 years. B The proportion of CFS publications sorted by disciplines (bottom panel), estimated from publication titles using MATLAB’s inbuilt clustering and word cloud algorithms (top panel). Although CFS was first developed within the basic vision sciences to manipulate visual awareness, it has been adopted in a variety of research domains (e.g., multisensory perception, affective neuroscience, eye-movement control, contrast gain-control, attention) and in diverse subject populations (young healthy adults, developmentally challenged people, individuals diagnosed with psychosis)

These benefits afforded by CFS for the study of awareness and visual suppression have spawned a flood of papers that have deployed the technique to study problems within diverse fields of basic and clinical science, as summarized in Fig. 1B. This growing popularity of CFS has also created something of a hodgepodge of results, including replication failures that lead to conflicting conclusions. This situation arises, in part, from the lack of systemized procedures for creating CFS which, in turn, introduces variability in CFS suppression strength and, hence, the incidence of target awareness. Although CFS is generally more effective than BR, its strength of suppression as gauged by the average duration of target invisibility varies substantially among individuals (e.g., Blake et al., 2019; Stein et al., 2011). It is therefore not unusual for studies to exclude participants or trials with insufficient suppression periods or partial suppression when viewing CFS, but such post-hoc data selection can adversely impact the generality of CFS results. A review of the CFS literature reveals exclusion rates ranging from 14 % (Korisky et al., 2019) to 50 % of tested participants (Sklar et al., 2012), which is a liability that compromises experimental efficiency and external validity of results. In addition, post hoc data selection may underestimate the true level of awareness in the remaining trials or participants (an effect referred to as regression bias), which in turn may lead to spurious claims of ‘vision without awareness’ in this subsample (Rothkirch et al., 2022; Shanks, 2017). It is worth noting that, besides suppression duration, there are alternative ways to define CFS effectiveness and target awareness. 
For example, CFS effectiveness can be gauged using forced-choice judgment tasks where performance requires registration of some aspect of the target stimulus such as a brief increase in its contrast (Tsuchiya et al., 2006) or its spatial location relative to a fixation mark (e.g., Sklar et al., 2012). In a similar vein, oculomotor reflexes have been used to gauge residual effectiveness of targets suppressed during CFS (e.g., Rothkirch et al., 2012). While these measures are more objective, they do not mitigate concerns arising from post hoc data selection.

The typical motivation for using CFS is to render a target invisible, so a preferable alternative to post hoc data selection would be to optimize procedures to generate effective suppression from the outset. Although a seemingly simple solution, it is not always clear what steps are sufficient to achieve that optimization, and brute-force strategies that have been used in earlier work introduce problems of their own. Part of the variability in CFS efficacy stems from its sensitivity to eye dominance, where stronger target suppression is typically obtained when the target is presented to the non-dominant eye (Yang et al., 2010). Studies keen on exploiting this characteristic would want to assess eye dominance prior to CFS testing, but there is currently no consensus or guideline about which eye dominance test to utilize (Footnote 1; see also Ding et al., 2018). Compounding this difficulty is the variability in CFS potency when attempting to suppress certain categories of target stimuli. For example, some studies (Moors et al., 2016; Yang et al., 2007) have found that suppression durations for images of faces tend to be relatively brief (e.g., around 2–3 s). Similarly, contrast thresholds for perceiving a moving target are more strongly impacted when the CFS masker itself incorporates motion and not just flicker as characteristic of a standard Mondrian sequence (Moors et al., 2014). In fact, some features of multidimensional targets, including color (Hong & Blake, 2009) and flicker (Zadbood et al., 2011), may resist CFS suppression even when form information defining the appearance of those targets is erased from awareness. This kind of differential susceptibility of stimulus qualities during CFS, dubbed ‘fractionation’ (Moors et al., 2017; Zadbood et al., 2011), complicates interpretation of results from CFS studies putatively assessing unconscious visual processing. 
As an aside, one strategy used to induce more potent CFS suppression is to reduce target contrast to ensure greater robustness of target invisibility. But without appropriate control conditions (Blake et al., 2006), this manipulation may overly degrade the target signal and exaggerate the apparent strength of suppression, to an extent that null effects are inevitable (cf. Sterzer et al., 2014). Finally, even when effective CFS is induced initially, its strength tends to wane over successive trials and sessions (Blake et al., 2019; Kim et al., 2017; Ludwig et al., 2013), limiting the amount of testing that can be conducted for each participant.

These challenges in optimizing suppression effectiveness complicate existing concerns about how one defines “unawareness” when using CFS (Moors, 2019; Moors et al., 2016; Moors & Hesselmann, 2018), and whether or not observed “unconscious” effects can be attributed to low-level or high-level stimulus properties (Moors et al., 2017). Work aimed at elucidating the underlying mechanisms of interocular suppression reports distributed processes along early and late stages of visual processing (Nguyen et al., 2003; Sterzer et al., 2014), including those underlying BR (Tong et al., 2006) and CFS (Han & Alais, 2018; Yang & Blake, 2012; Yuval-Greenberg & Heeger, 2013). These distributed processes may contribute to involvement of both general (e.g., high-level) and feature-selective (i.e., low-level) components of interocular suppression (Han & Alais, 2018; Stuit et al., 2009). Such distributed involvement, in turn, may suppress certain stimulus features or components to a lesser extent than others, potentially driving the conclusions of unconscious visual processing and stimulus preservation in CFS. In other words, these differences may affect the residual extent of unconscious visual processing (Sklar et al., 2012) and the type of stimulus properties that survive suppression, which may explain why some CFS findings, such as dorsal stream preservation (Almeida et al., 2008, 2010; Fang & He, 2005), are not supported by some behavioral (Rothkirch & Hesselmann, 2018) and neuroimaging studies (Hesselmann & Malach, 2011). Differential interference along hierarchically organized processing streams may also explain why certain stimuli or stimulus properties are difficult to suppress, e.g., fearful facial expressions (Hedger et al., 2015) and temporal content (Zadbood et al., 2011), and why high-level priming effects can occur in certain CFS regimes (Gelbard-Sagiv et al., 2016).

From the above considerations emerges a clear lesson: our understanding of the consequences of unconscious processing during CFS would benefit from a standardized procedure for the assessment and statistical analysis of awareness (Rothkirch & Hesselmann, 2017), and a more principled, standardized design of stimulus content and mode of presentation for effective CFS suppression. With regard to the latter, some key spatiotemporal parameters have been shown to affect CFS effectiveness (Han & Alais, 2018; Moors et al., 2014; Yang & Blake, 2012; see also review by Pournaghdali & Schwartz, 2020) and interocular suppression (Alais & Parker, 2012; Fahle, 1982; O’Shea et al., 1997; Stuit et al., 2009; Yang & Blake, 2012); we summarize them briefly here. In short, the processes engaged by the target need to be given due consideration when making decisions on the mask properties. Within the spatial domain, more effective suppression is obtained when the mask has similar spatial frequency (Yang & Blake, 2012), chromatic (Hong & Blake, 2009) and orientation (Han & Alais, 2018) content to the target. The presence of sharp edges in discretely updated CFS images potentiates masking effectiveness (Baker & Graf, 2009; Han et al., 2018; Han et al., 2021), and more structurally complex targets are better suppressed by masks with higher image entropy, e.g., using pattern elements with textured, pink noise fill (Han et al., 2021). Smaller mask sizes are recommended when narrow orientation bandwidths are used (Han et al., 2019) because, as in BR (Blake et al., 1992), the increased collinearity is likely to produce more perceptual mixtures with larger mask sizes. In the temporal domain, the mask should have a similar temporal frequency to temporally modulating targets (Han et al., 2018; Han & Alais, 2018), or a lower frequency when paired with static targets (Han et al., 2016).

Manipulating and matching target-mask spatiotemporal properties allows us to use a higher target contrast while still achieving robust suppression. However, fine control of these properties requires an involved knowledge of image processing, spectral analysis, and spectral filtering. Currently, there are available resources for generating Mondrian noise patterns (Hebart, n.d.) and for building CFS experiments (Nuutinen et al., 2018). Critically lacking, however, are the means for systematically creating CFS test and masking stimuli tailored to selectively engage neural mechanisms with specific spatiotemporal properties. To promote standardization of this crucial ingredient of CFS studies, we introduce in this paper CFS-crafter, an open-source app that allows novice users and experts to analyze and manipulate the spatiotemporal content of CFS stimuli with only a rudimentary knowledge of spectral analysis and spectral filtering. In disseminating this app, we envision that its use will facilitate more effective stimulus choices, better matching of target and mask in practice, and a means to quantify CFS stimuli and thus clarify the implications of suppression of those stimuli for unconscious visual processing.

Description of the CFS-crafter

The image processing toolbox in MATLAB provides a means to obtain fine control over the spatiotemporal attributes of any image sequence. However, usage of the toolbox requires detailed knowledge of image processing techniques such as filtering procedures and frequency domain operations, which may present an obstacle to novice users and experts interested in precisely controlling or standardizing CFS stimuli (for example, to match the spatial or temporal content of target and mask). The CFS-crafter app simplifies this challenge. Designed to be as streamlined as possible, the app is a point-and-click interface that incorporates spectral and non-spectral image processing strategies. Not to be confused with a tool for presenting CFS stimuli or running CFS experiments, it can be used to generate animated CFS masks, to customize their properties and to conduct analyses on targets (i.e., monocularly viewed test stimuli to be suppressed) and CFS masks (animations viewed by the other eye). Although effective usage of the app requires a rudimentary understanding of spectral analysis and spectral filtering, help files are provided in the app to support novice users and experts to achieve stimulus design and analysis that allow them to tailor stimuli for their specific experimental question.

The open-source app can be downloaded from GitHub (https://github.com/guandongwang/cfs_crafter). It is available as a MATLAB application (file named “cfs_MATLAB.mlappinstall”) for use with later versions of MATLAB (e.g., R2020a) or as a standalone application (file named “CFS-crafter Windows Installer.exe” for PC users and “CFS-crafter Mac Installer.app” for Mac users). The process of installation is straightforward for both versions of the CFS-crafter. For the MATLAB application, an installation prompt is provided within MATLAB when the installation file is loaded (Fig. 2A, left panel). The installed program is then loaded onto MATLAB’s Apps menu, providing easy access to the app’s functionalities. In the standalone version, users will be prompted to install MATLAB Runtime (Fig. 2A, left panel), a freely available library that allows users to operate compiled MATLAB applications without a license. When the installation process for MATLAB Runtime is complete, the standalone version will be in a CFS_Crafter folder under Applications on a Macintosh operating system (or Mac OS) and under the Program Files folder on a Windows operating system.

Fig. 2

Setting up the CFS-crafter. A Once the application file is loaded, the user can install the CFS-crafter onto MATLAB’s App tab. The standalone version prompts the installation of MATLAB runtime, which allows the CFS-crafter to be used as a regular program, without a MATLAB license. As a standalone program, the CFS-crafter appears in the Applications directory on a Mac OS. B After the users have entered the display screen information in the homepage, they can choose to create a mask sequence (Creation), construct a sequence from a pre-selected set of images (Conversion), customize their existing sequence (Modification) and/or analyze their CFS stimuli set for spatial, temporal or orientation content. Information about these modules is also provided in the help button

Figure 2B describes the homepage of the CFS-crafter. Here, the user is given the option to specify the properties of the testing display (e.g., screen resolution and refresh rate), the properties of the CFS mask (e.g., duration and size) and the viewing distance from observer to testing display. Although the CFS-crafter can extract and prefill some of the current display properties (e.g., resolution), the user should review these parameters before performing any tasks with the app, especially if the experiment is being conducted with a different display from that used with the CFS-crafter. The homepage allows the user to indicate the type of task required and, depending on the selection submitted, a different interface will be initiated for mask generation, stimulus conversion, customization, and stimulus analyses. To demonstrate the use of these functionalities, we now walk through three ways an experimenter can match or evaluate the spatiotemporal content of targets and CFS masks. As several technical terms will be used in this paper, a glossary explaining each of these terms will be provided in Appendix 1.
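The display properties and viewing distance entered on the homepage matter because together they fix the angular size of the mask and the number of pixels per degree of visual angle. The following Python fragment is a minimal sketch of that standard geometry, not the CFS-crafter's own code; the function names, the physical screen width argument and the example display (1920 pixels across 53 cm, viewed from 57 cm) are assumptions made for illustration.

```python
import math


def pixels_to_degrees(size_px, screen_width_px, screen_width_cm, viewing_distance_cm):
    """Visual angle (deg) subtended by a centred stimulus that is size_px wide."""
    size_cm = size_px * screen_width_cm / screen_width_px
    return math.degrees(2 * math.atan(size_cm / (2 * viewing_distance_cm)))


def pixels_per_degree(screen_width_px, screen_width_cm, viewing_distance_cm):
    """Number of pixels spanning one degree of visual angle at the screen centre."""
    one_degree_cm = 2 * viewing_distance_cm * math.tan(math.radians(0.5))
    return one_degree_cm * screen_width_px / screen_width_cm


# Hypothetical display: 1920 px across 53 cm, viewed from 57 cm.
mask_deg = pixels_to_degrees(200, 1920, 53.0, 57.0)
ppd = pixels_per_degree(1920, 53.0, 57.0)
print(f"200-px mask: {mask_deg:.2f} deg; {ppd:.1f} px per degree")
```

Because spatial frequency is expressed in cycles per degree, a sequence prepared for one display and viewing distance will have a different spatial frequency content when shown on another, which is why these values should be checked before any filtering is performed.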

The temporal content of the dynamic sequence

Suppose an experimenter needs to suppress a static, chromatic target from visual awareness. To match the processes engaged by the chromatic target, which are likely to be more parvocellular in nature (Derrington & Lennie, 1984; Shapley et al., 1981), the experimenter could use a mask with low temporal frequency content. A common and straightforward method of generating a low temporal frequency mask is to vary the number of pattern updates in each second (Tsuchiya & Koch, 2005; Zhan et al., 2019; Zhu et al., 2016). This captures the number of changes in the mask’s spatial pattern per second and is usually quoted in units of hertz, but it is not equivalent to temporal frequency. Pattern update rate is a coarse temporal metric that usually overstates the true temporal frequency value because it ignores the variation in luminance over time. To illustrate this point, Fig. 3A tracks the real-time luminance changes of a single pixel in a greyscale, dynamic Mondrian sequence with a pattern update rate of 10 Hz. Instead of modulating at a rate of 10 Hz, the pixel’s luminance varies randomly with each new pattern configuration, occasionally trending in the same direction for several frames (low frequency change) or reversing direction from frame to frame (high frequency), or even having a constant value between one pattern and the next. The temporal frequency spectrum is therefore complex and variable. Moreover, it varies with the number of grey levels used in the Mondrian pattern. Including more grey levels increases the proportion of smaller luminance changes, resulting in lower modulation rates (Fig. 3B). Decreasing the number of grey levels, on the other hand, increases the proportion of luminance reversals between frames and increases the effective modulation rate. 
A particular pattern update rate, when quoted in pattern changes per second, is therefore not necessarily comparable across studies and may explain the inconsistency in reported ‘optimal’ update rates for CFS, e.g., 3–12 Hz (Tsuchiya & Koch, 2005), 6 Hz (Zhu et al., 2016) or 4, 6, and 8 Hz (Zhan et al., 2019). These values also likely overestimate the optimal luminance modulation. Indeed, using CFS animations composed of pink noise images (patterns with spatial frequency spectra more closely approximating spectra comprising natural images), peak CFS strength occurs around 1 Hz (Han et al., 2016). Understanding the underlying luminance modulation rate of CFS animations is important for inferring likely neural mechanisms, because high and low modulation frequencies more strongly activate magnocellular and parvocellular neurons, respectively (Skottun & Skoyles, 2008).
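The distinction between pattern update rate and temporal frequency can be checked numerically. The Python fragment below is a rough sketch under stated assumptions: it takes one luminance sample per pattern for a single pixel, so the highest representable frequency is half the update rate, and it uses a plain discrete Fourier transform written out by hand. The function names, the random seed and the 10-s duration are hypothetical choices; the six grey levels follow those used for greyscale patterns in the CFS-crafter.

```python
import cmath
import random


def amplitude_spectrum(samples, sample_rate_hz):
    """Return (frequencies, amplitudes) from a plain DFT with the mean removed."""
    n = len(samples)
    mean = sum(samples) / n
    centred = [s - mean for s in samples]
    freqs, amps = [], []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(centred))
        freqs.append(k * sample_rate_hz / n)
        amps.append(abs(coeff) / n)
    return freqs, amps


def peak_frequency(freqs, amps):
    """Frequency of the largest spectral component."""
    return freqs[amps.index(max(amps))]


def spectral_centroid(freqs, amps):
    """Amplitude-weighted mean frequency."""
    return sum(f * a for f, a in zip(freqs, amps)) / sum(amps)


UPDATE_RATE_HZ = 10   # one new pattern every 100 ms
N_UPDATES = 100       # 10 s of animation, one sample per pattern

# Best case: the pixel reverses polarity on every single update.
alternating = [float(i % 2) for i in range(N_UPDATES)]

# Typical case: the pixel takes a random grey level on every update.
rng = random.Random(0)
grey_levels = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
mondrian_pixel = [rng.choice(grey_levels) for _ in range(N_UPDATES)]

f_alt, a_alt = amplitude_spectrum(alternating, UPDATE_RATE_HZ)
f_rnd, a_rnd = amplitude_spectrum(mondrian_pixel, UPDATE_RATE_HZ)
print("peak frequency, reversal on every update (Hz):", peak_frequency(f_alt, a_alt))
print("mean frequency, random grey levels (Hz):", round(spectral_centroid(f_rnd, a_rnd), 2))
```

Even the best-case pixel peaks at half the nominal update rate, and a randomly varying pixel spreads its energy over the whole range below that limit, which is the sense in which a quoted update rate overstates the underlying luminance modulation.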

Fig. 3

The difference between pattern update rate and temporal frequency. A Pixel luminance timeline. A pattern that updates every 100 ms would achieve a maximum temporal frequency of 5 Hz if, and only if, there were a luminance reversal every update. This rarely occurs, however, as consecutive luminance values will sometimes trend in the same direction, producing a complex, unpredictable step-function in luminance that creates a waveform modulating at a lower temporal frequency (right panel). The pixel outlined in red in successive animation Patterns 1 and 2 transitions from very dark to very light, as indicated by the first two bars in the histogram in the left-hand part of Panel A, but as shown by the luminance variations in that pixel over the next eight changes, the magnitude and polarity of luminance change is unpredictable within this 1-s period. B The temporal profile of pixel timelines varies with the number of grey levels used to generate the Mondrian sequence. Decreasing the number of grey levels increases the proportion of larger luminance changes between pattern updates (right panel). This broadens the temporal frequency spectrum and raises the likelihood of a maximum 5-Hz frequency modulation being achieved (although it remains a low probability). In contrast, increasing the number of grey levels increases the likelihood of slower, trending changes in luminance. This biases the temporal frequency spectrum to low frequencies and makes 5-Hz modulation exceedingly rare. C To accurately represent the color and temporal information of a mask sequence, all sequences processed by the CFS-crafter must be in a 4D matrix. The FFT is used in spectral analyses, where the spatial frequency of each image is represented by the x and y-axes, and the temporal frequency of each pixel timeline is represented by the z-axis.

One goal of the CFS-crafter is to allow users to quantify and manipulate spatial luminance modulations and, thereby, create effective, well-defined stimuli that, in turn, will yield data that allow more definitive conclusions about the mechanisms promoting CFS masking. To serve this purpose, the CFS-crafter uses spectral filtering and spectral analytic techniques. Based on Fourier’s theorem that any complex time series can be decomposed into a sum of sinusoids (Brigham & Morrow, 1967), the periodic luminance content of a pixel in a CFS animation can be extracted with the fast Fourier transform (FFT). This transforms the time series (i.e., the animation sequence) into a frequency representation, and averaging the FFTs of a selection of pixels provides a good estimate of the animation sequence’s temporal frequency content. This temporal frequency spectrum can be analyzed or manipulated with temporal filters (i.e., to boost or to attenuate certain frequencies) and then back-transformed with an inverse Fourier transform to a time series in which the animation has a known temporal frequency content. The CFS-crafter represents mask sequences as four-dimensional (4D) matrices in MATLAB’s .mat file format; these matrices represent the spatial (x, y), temporal (z) and color content (red, green, and blue layers) of the sequences. Spectral content is extracted from the sequence using a 3D FFT, where temporal information is represented by the z-axis and the spatial information of each image in the mask sequence is represented by the x and y-axes (Fig. 3C).
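The transform, filter and back-transform cycle described above can be illustrated on a single pixel timeline. The Python fragment below is a minimal sketch, not the CFS-crafter's implementation: it uses a hand-written discrete Fourier transform rather than an optimized FFT, applies an ideal low-pass filter, and works on one pixel rather than on a full 4D matrix. The function names, the 60-Hz frame rate and the two-component test signal are assumptions made for illustration.

```python
import cmath
import math


def dft(x):
    """Plain discrete Fourier transform of a real sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse transform; returns the real part of the reconstructed series."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def temporal_lowpass(timeline, frame_rate_hz, cutoff_hz):
    """Ideal low-pass filter applied to one pixel's luminance timeline."""
    n = len(timeline)
    spectrum = dft(timeline)
    for k in range(n):
        # Frequency of bin k; the upper half of the bins holds negative frequencies.
        freq = min(k, n - k) * frame_rate_hz / n
        if freq > cutoff_hz:
            spectrum[k] = 0
    return idft(spectrum)


FRAME_RATE_HZ = 60   # hypothetical display refresh rate
N_FRAMES = 120       # 2 s of animation

# Hypothetical pixel timeline: 2-Hz and 15-Hz components around mid-grey.
timeline = [0.5
            + 0.2 * math.sin(2 * math.pi * 2 * t / FRAME_RATE_HZ)
            + 0.2 * math.sin(2 * math.pi * 15 * t / FRAME_RATE_HZ)
            for t in range(N_FRAMES)]

filtered = temporal_lowpass(timeline, FRAME_RATE_HZ, cutoff_hz=5)
print("range before filtering:", round(min(timeline), 3), round(max(timeline), 3))
print("range after filtering: ", round(min(filtered), 3), round(max(filtered), 3))
```

With a 5-Hz cut-off the 15-Hz component is removed and only the mean luminance and the 2-Hz modulation remain. Applying the same operation to every pixel, or to the temporal axis of a multidimensional transform, yields a sequence whose temporal frequency content is known.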

There are two ways to create a temporally low-pass CFS animation with the CFS-crafter. In the first approach (illustrated in Fig. 4), one starts with a ready-made dynamic sequence that the user wishes to customize into a temporally low-pass sequence. This sequence can be provided as a 4D matrix (in the same format as the sequences generated by the CFS-crafter) in the Modification interface (Fig. 4A), accessed by selecting the Start button on the homepage. Here, the user specifies the location of the 4D matrix (choose mask directory) and selects the mask property to be manipulated. Modification options include spatial frequency and temporal frequency of the CFS sequence, as well as orientation and the phase of the spatial patterns (e.g., scrambling phase to remove edges and form). The user can also provide a set of pre-selected pattern images, but this will require constructing the 4D matrix with the CFS-crafter in the Conversion interface (Fig. 4B) before performing any modification procedures. In this interface, the user can specify the image input directory and desired mask parameters such as pattern update rate and RMS contrast (defined as the average standard deviation of pixel luminance across the whole sequence). Note that the CFS-crafter scales all pixel luminance values within a range of 0 to 1, and clipping will occur for sequences with larger RMS contrasts (e.g., > 30 %). The user is advised to evaluate the resulting images in the preview page (Fig. 4C), which will load when the conversion process is completed. The CFS-crafter also resizes the input images according to the desired width and height, allowing the user to submit images of any size. However, to avoid unwanted changes in image aspect ratio, we recommend reviewing and cropping input images where necessary. In the preview page, the user is allowed to review individual frames of the mask sequence and to obtain basic mask information. 
Satisfied with the converted mask sequence, the experimenter can save the sequence as a .mat file, close the conversion interface, and proceed to the Modification interface.
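The interaction between RMS contrast and the 0 to 1 luminance range can be made concrete with a short sketch. The Python fragment below is an illustration under stated assumptions rather than the CFS-crafter's procedure: it interprets RMS contrast as the standard deviation of each frame averaged over frames, rescales each frame around a mid-grey mean of 0.5, and then clips. The function names, the Gaussian noise frames and the two target contrasts are hypothetical.

```python
import random
import statistics


def set_rms_contrast(frame, target_rms, mean_luminance=0.5):
    """Rescale a frame (flat list of luminances) to a target RMS contrast and
    clip to [0, 1]. Returns (new_frame, fraction_of_pixels_clipped)."""
    mu = statistics.fmean(frame)
    sd = statistics.pstdev(frame)
    scaled = [mean_luminance + (p - mu) * target_rms / sd for p in frame]
    clipped = [min(1.0, max(0.0, p)) for p in scaled]
    n_clipped = sum(1 for s, c in zip(scaled, clipped) if s != c)
    return clipped, n_clipped / len(frame)


def sequence_rms(frames):
    """Per-frame standard deviation averaged over the sequence (assumed definition)."""
    return statistics.fmean([statistics.pstdev(f) for f in frames])


rng = random.Random(1)
# Hypothetical sequence: five Gaussian-noise frames of 64 x 64 pixels each.
noise = [[rng.gauss(0.0, 1.0) for _ in range(64 * 64)] for _ in range(5)]

low = [set_rms_contrast(f, 0.10) for f in noise]
high = [set_rms_contrast(f, 0.40) for f in noise]

for label, result in (("10 %", low), ("40 %", high)):
    frames = [frame for frame, _ in result]
    clipped = statistics.fmean([fraction for _, fraction in result])
    print(f"target {label}: achieved RMS {sequence_rms(frames):.3f}, "
          f"mean fraction clipped {clipped:.3f}")
```

At the higher target a substantial share of pixels saturates at 0 or 1, so the achieved contrast falls short of the requested value and the luminance distribution is distorted, which is why the preview page should be inspected after conversion.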

Fig. 4

Modifying existing image sequences with the CFS-crafter. A The Modification interface provides four types of spatiotemporal manipulations. In this example, the user is opting to apply a low-pass temporal filter to a pre-saved 4D matrix representing the mask sequence. Filter options include log-Gaussian, Butterworth, Gaussian, or ideal filter. B Users can generate a 4D matrix using the Conversion interface. To do so, add a pre-selected set of images, enter the desired parameters such as the mask update rate and initiate the conversion process with the Start button. C A preview page will be loaded after every major operation (e.g., completion of customization or conversion process). The page contains details about the generated sequence (e.g., angular size), and the user can screen the individual image frames in the sequence. Depending on the type of operation performed, the user can choose to save the sequence as a .mat file, image files or as an mp4 video clip. As the current example previews the results of converting image files into a CFS sequence, the option for exporting image files is disabled. D To demonstrate the effect of a Gaussian filter and an ideal filter, we compare a low-pass, Gaussian spatial filter (σ = 3) with a low-pass, ideal spatial filter. Although both filters have a cut-off frequency at 5 cycles per degree (cpd), the ideal filter has a sharper transition between filtered and unfiltered frequencies. This sharper transition comes with a cost: compared to the Gaussian filter, images processed by an ideal filter have a higher probability of ringing artefacts (left panel). Ideal filters will produce no artefacts with noise images as those images contain no structured phase information (i.e., they are phase incoherent). In the temporal domain, large changes in pixel luminance (e.g., black to white) may occur between patterns. When processed with an ideal filter, spurious temporal modulations in the form of ringing artefacts will occur.

The following steps will generate a temporally low-pass mask sequence using the Modification interface. Users first select the Temporal option, after which they can opt to extract a range of temporal frequencies (bandpass), high frequencies (high-pass), or in this example, low frequencies (low-pass). These operations can be performed using either an ideal filter that fully preserves the amplitudes of selected frequencies (passband) and excludes out-of-range content (van Drongelen, 2018), or smooth-edged filters such as the Gaussian or Butterworth filter (see examples in Fig. 4D). In practice, smooth-edged filters are recommended for all but noise images as they reduce the probability of ringing artefacts that tend to occur with ideal filters, although they do blur the demarcation between filtered and unfiltered frequencies, as the shape of these filters is gradually ramped on and off over a defined range of frequencies. The user is therefore advised to determine the choice of filter type according to the needs of the experiment. To help facilitate this decision, information is provided for each type of property manipulation (i.e., spatial, temporal) and these details are accessible through the help button in the Modification interface. After entering the desired parameter selections, the user can initiate the customization process by selecting the Start button. A preview page will load once the customization process is complete, and the user can review the individual images of the mask sequence by manually entering specific frame numbers into a dialog box, using the slider or by selecting the Play button. Basic mask information, such as the image’s angular size (expressed in units of degrees of visual angle subtended at the retina) and its duration, is also provided in the preview page. The user can choose to save the temporally low-pass filtered mask sequence as a .mat file or as an mp4 video clip (Fig. 4C). 
To represent the modified temporal information accurately, every frame in the sequence needs to be exported. Image formats such as .jpeg are therefore disabled for temporally filtered sequences to avoid performance issues, but they remain available for spatially manipulated sequences.
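The trade-off between ideal and smooth-edged filters can be demonstrated on a pixel timeline containing an abrupt black-to-white transition. The Python fragment below is a sketch under stated assumptions and does not reproduce the CFS-crafter's filters: the gain functions are textbook forms, the Gaussian width is arbitrarily set equal to the ideal cut-off, and the function names and the 128-frame timeline are hypothetical.

```python
import cmath
import math


def filter_timeline(x, gain):
    """Multiply the DFT of x by gain(distance of the bin from DC) and invert."""
    n = len(x)
    spectrum = [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
                for k in range(n)]
    shaped = [spectrum[k] * gain(min(k, n - k)) for k in range(n)]
    return [sum(shaped[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


CUTOFF_BIN = 8  # hypothetical cut-off, expressed as a frequency bin


def ideal_gain(k):
    """Pass everything up to the cut-off unchanged, remove everything above it."""
    return 1.0 if k <= CUTOFF_BIN else 0.0


def gaussian_gain(k):
    """Smooth roll-off; the width is assumed equal to the ideal cut-off."""
    return math.exp(-k ** 2 / (2 * CUTOFF_BIN ** 2))


# Pixel timeline that steps from black to white and back again.
N = 128
step = [1.0 if N // 4 <= t < 3 * N // 4 else 0.0 for t in range(N)]

ideal_out = filter_timeline(step, ideal_gain)
gauss_out = filter_timeline(step, gaussian_gain)

overshoot_ideal = max(ideal_out) - 1.0
overshoot_gauss = max(gauss_out) - 1.0
print("overshoot above white, ideal filter:   ", round(overshoot_ideal, 3))
print("overshoot above white, Gaussian filter:", round(overshoot_gauss, 3))
```

The ideal filter overshoots the original luminance range near each transition, which appears as ringing, whereas the Gaussian filter stays within the original range at the cost of a more gradual boundary between passed and attenuated frequencies.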

In the second approach, the user does not have ready-made sequences or pre-selected pattern images (see Fig. 5). Mask creation is a supported functionality in the CFS-crafter, and it can be accessed by selecting the Start button from the homepage. This selection will load the mask generation interface (Fig. 5A), where the user is provided with several mask options, including patterns composed of geometric shapes, spatial noise images (e.g., white or pink noise) and patterns composed of natural object elements. The creation of the former two types of pattern sequences is straightforward; select the desired mask type, pattern characteristics, mask parameters (e.g., mask update rate) and click the Start button. In general, patterned images are composed of randomly sized elements located randomly around the mask area (method from Hebart, n.d.). Each element is also set to one of five colors (i.e., black, red, green, blue, or yellow) or one of six different grey levels (i.e., 0, 20, 40, 60, 80 and 100% of the maximum luminance).
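As a rough illustration of how a greyscale pattern sequence of this kind can be assembled, the Python fragment below paints randomly sized rectangles at random locations, each set to one of the six grey levels listed above, and holds each pattern for a fixed number of display frames. It is a simplified sketch and not the method of Hebart (n.d.) or the CFS-crafter's code; the function names, element size limits, 64 × 64 frame size and 60-Hz display are assumptions.

```python
import random

GREY_LEVELS = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]  # 0 to 100 % of maximum luminance


def mondrian_frame(width, height, n_elements, rng, min_size=4, max_size=20):
    """Paint randomly sized, randomly placed rectangles onto a mid-grey frame."""
    frame = [[0.5] * width for _ in range(height)]
    for _ in range(n_elements):
        w = rng.randint(min_size, max_size)
        h = rng.randint(min_size, max_size)
        x0 = rng.randint(0, width - 1)
        y0 = rng.randint(0, height - 1)
        level = rng.choice(GREY_LEVELS)
        # Later rectangles overwrite earlier ones, giving the overlapping look.
        for y in range(y0, min(y0 + h, height)):
            row = frame[y]
            for x in range(x0, min(x0 + w, width)):
                row[x] = level
    return frame


def mondrian_sequence(n_patterns, frames_per_pattern, **frame_kwargs):
    """Hold each new pattern for frames_per_pattern display frames."""
    sequence = []
    for _ in range(n_patterns):
        pattern = mondrian_frame(**frame_kwargs)
        sequence.extend([pattern] * frames_per_pattern)
    return sequence


rng = random.Random(42)
# 10-Hz pattern updates on a hypothetical 60-Hz display: 6 frames per pattern.
seq = mondrian_sequence(10, 6, width=64, height=64, n_elements=150, rng=rng)
print(len(seq), "frames of", len(seq[0]), "x", len(seq[0][0]), "pixels")
```

Note that the update rate enters only through the number of frames for which each pattern is held; the temporal frequency content of the resulting sequence still depends on how luminance happens to change at each pixel.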

Fig. 5

Creating new masks with the CFS-crafter. A The Creation interface allows the synthesis of several types of masks according to desired mask parameters (e.g., mask duration). In this example, the user has opted to make a masking pattern comprising face elements. B A tracing interface will load when the user clicks on the Start Tracing button. Here, the user can trace up to five face images with an elliptical tracing method. Completed traces will appear as a row in this interface, and the user is free to delete any unwanted traces before submitting them with the Finish button. C The mask generation process is initiated when the user selects the Start button on the mask generation interface. Upon completion, a mask (leftmost panel) with varying face sizes, spatial locations, and contrast would be obtained. To illustrate the variety offered by the CFS-crafter, other forms of possible mask types are shown, and these include colored geometric patterns and spatially pink noise

Unlike geometric shapes, there is no simple mathematical equation that allows the generation of natural objects in the CFS-crafter. Therefore, an extra step of object extraction is required to generate patterns with object elements. Using MATLAB’s assisted tracing methods, face elements are extracted with an elliptical trace and other forms of objects are extracted using freehand tracing. A user wishing to create a dynamic pattern sequence composed of facial elements (akin to Han et al., 2021) would select Traced Item > Face > Start Tracing to load the tracing interface (Fig. 5B), which permits up to five image traces. To trace a face image (.jpg, .jpeg, .png or .tiff formats), the user must specify the input directory under Choose source image and select the Start Tracing button. The face image would be presented as an interactive object, and the user is now able to position an elliptical mask over the desired portion of the face image. Completed traces are submitted by double-clicking on the interactive object, and the user is allowed to delete and replace these extracted objects. Traces that have yet to be submitted can be deleted by Right Click > Delete Ellipse or by pressing the Esc button on the keyboard. When ready to generate the face element mask, the user selects the Finish button which closes the tracing interface. The user is now free to initiate mask generation with the Start button on the mask generation page. As before, a preview page would be loaded when the mask generation process is completed, and the user would need to save the mask sequence as a .mat file before proceeding with temporal manipulation in the Modification interface.

The spatial content of the dynamic sequence

The spatial qualities of a Mondrian sequence can be quantified with its spatial frequency content. Like temporal frequency, spatial patterns of pixel luminance across an image can be decomposed into sinusoidal components with an FFT. The spatial information of each image is represented by x and y dimensions in the 4D spectrum (Fig. 3C), which describe the distribution of phase structure of different spatial frequencies at different spatial orientations. As higher and lower spatial frequencies are typically linked to edges and solid areas, respectively, the spectral spatial content of mask images presents opportunities for fine-tuned spatial manipulations. This motivates the use of FFT techniques as the mainstay strategy for spatial manipulation in the CFS-crafter.

Following our example with the chromatic static target, one may also wish to enhance suppression effectiveness by matching the spatial qualities of the mask and target. Assuming a narrowband, high spatial frequency target, one can remove the low spatial frequency components of the Mondrian images to create a spatially high-pass mask. This is achieved by selecting the Spatial frequency component in the customization interface which, like the temporal frequency manipulation, will present different filter options. The user can now choose the type of filter (e.g., Gaussian, Butterworth) and the range of frequencies to preserve (e.g., high-pass or a band of higher spatial frequencies). Alternatively, the user may be interested in how mask pattern edges affect the suppression of a spatially broadband chromatic target. Here, the user would preserve the mask’s broadband spatial frequency content (to match the target’s spatially broadband content) and would manipulate the spatial coherence of the mask images using the Phase Scramble component of the customization interface. Under this selection, the user can specify a range of spatial frequencies for the randomization operation or opt to randomize the phase structure of all frequency components. If the user specifies a range of frequencies, the phase structure of frequencies outside the selected range is preserved, allowing the user to selectively manipulate the structural integrity of the image in different spatial frequency bands. In our current example, the user would choose a low-pass frequency range and specify the cut-off spatial frequency and the extent of phase scrambling, i.e., a scrambling index of 0 for no randomization and an index of 1 for maximum randomization (Fig. 6A, top panel). The resulting mask image (Fig. 6A, bottom left panel) would contain intact edges and a randomized spatial layout for the solid areas. Conversely, to preserve solid areas (Fig. 6A, bottom right panel), the user would choose a high-pass frequency range for the phase-scrambling process.
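
Band-limited phase scrambling can be sketched in a few lines. The Python code below is an illustration under stated assumptions rather than the CFS-crafter's implementation: it assumes a square greyscale image, expresses the frequency band in cycles per image rather than cpd, and assumes that the scrambling index simply scales a random phase offset taken from the spectrum of a white-noise image. How the application itself maps the index onto phase randomization is not specified here. All function names are hypothetical.

```python
import cmath
import math
import random

def dft_1d(x, inverse=False):
    """Naive 1D DFT (or inverse DFT); fine for small illustrative images."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
               for t in range(n)) for k in range(n)]
    return [v / n for v in out] if inverse else out

def dft_2d(img, inverse=False):
    """Separable 2D DFT: transform the rows, then the columns."""
    rows = [dft_1d(row, inverse) for row in img]
    cols = [dft_1d(list(col), inverse) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

def phase_scramble(img, low_cycles, high_cycles, index, seed=0):
    """Randomize phase for radial frequencies in [low_cycles, high_cycles).

    Amplitudes are untouched, and phases outside the band are preserved.
    index = 0 leaves the image unchanged; index = 1 adds the full random
    phase offset.
    """
    n = len(img)
    rng = random.Random(seed)
    noise = [[rng.random() for _ in range(n)] for _ in range(n)]
    img_f = dft_2d(img)
    # The phase of a real noise image is antisymmetric, which keeps the
    # scrambled spectrum consistent with a real-valued output image.
    noise_f = dft_2d(noise)
    for v in range(n):
        for u in range(n):
            fu = u if u <= n // 2 else u - n
            fv = v if v <= n // 2 else v - n
            if low_cycles <= math.hypot(fu, fv) < high_cycles:
                amplitude = abs(img_f[v][u])
                phase = (cmath.phase(img_f[v][u])
                         + index * cmath.phase(noise_f[v][u]))
                img_f[v][u] = cmath.rect(amplitude, phase)
    # Any residual imaginary part (rounding error) is discarded.
    return [[val.real for val in row] for row in dft_2d(img_f, inverse=True)]

# Example: a bright rectangle on a dark 8 x 8 background.
image = [[1.0 if 2 <= x < 6 and 2 <= y < 5 else 0.0 for x in range(8)]
         for y in range(8)]
unchanged = phase_scramble(image, 1, 3, index=0.0)
scrambled = phase_scramble(image, 1, 3, index=1.0)
```

Because the band starts above zero cycles per image, the mean luminance (the zero-frequency component) is left untouched, and only the spatial layout carried by the selected frequencies is randomized.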

Fig. 6

Controlling spatiotemporal content with the CFS-crafter. A The CFS-crafter allows users to phase scramble specific ranges of spatial frequencies. In this example, the user chooses to phase scramble frequencies below 3 cpd (low-pass scrambling). Example images of low-pass and high-pass scrambling are illustrated in the bottom panel. B A user keen on evaluating the modification results can analyze the stimuli with the Analysis interface, which allows the user to input mask sequences (4D matrices in .mat file format) and single image files (e.g., .png). Users can input target and mask images/sequences into the interface and compare the properties of these stimuli. Users are also free to edit their selections, choose the appropriate analysis and save the results as a .mat file. For example, a user interested in edge density selects the Canny method and leaves the threshold entry blank. In response, MATLAB would use the default threshold for the Canny method. C To allow easy visual comparison, results for each type of stimulus are color coded and represented by a letter symbol. Analyses are also automatically matched to the content of the stimulus. For example, despite selecting spatial and temporal frequency analyses in the Analysis interface (B), the CFS-crafter only outputs temporal frequency results for Stimulus B, a greyscale mask sequence. D Users can evaluate the performance of their chosen edge detection method by selecting Preview Edges in the Analysis interface. Here, they can review a sample frame of each mask sequence or individual image. The edge detection parameters and computed edge density values for each type of stimulus are provided in this review page

Our descriptions thus far have presented temporal and spatial manipulations separately, but it is possible to simultaneously customize multiple spatiotemporal properties with the CFS-crafter. This capability does come with some practical limitations built into the program. For example, the option for phase scrambling will be disabled if the user has chosen to conduct orientation filtering, as orientation depends on phase coherence over frequency. In contrast, a user who opts to filter the spatial frequency components of a Mondrian sequence without altering the phase structure would still be able to conduct orientation filtering. Similarly, a user interested in conducting both spatial and temporal filtering would be able to do so by selecting the Spatial frequency and Temporal components. As spatial and temporal content are independent parameters, choices in the spatial domain will not affect or disable options within the temporal domain. Users are advised to review the information provided by the help button before manipulating multiple parameters.

The spatiotemporal similarity between target and mask

We have shown how the CFS-crafter can be used to generate compatible spatiotemporal properties between the target and mask. In general, masks (i.e., the CFS animation viewed by one eye) and targets (i.e., the weaker stimulus viewed by the other eye) sharing similar spatiotemporal properties produce more effective CFS suppression, e.g., longer suppression periods, larger target contrast thresholds (Han & Alais, 2018; Moors et al., 2014; Yang & Blake, 2012). However, there is also evidence suggesting that the similarity between target and mask may have little effect on some stimulus types, e.g., movement (Ananyev et al., 2017). The reasons for this discrepancy are unclear, although it is possible that CFS suppression may involve both feature and non-feature selective processes. Depending on the type of stimulus and task requirements, either process may have a larger or smaller effect on CFS task performance. Therefore, it is useful to quantify the spatiotemporal properties of CFS mask and target stimuli, as this would not only indicate the amount of target/mask overlap and potential low-level explanations for CFS results, but also provide information on how CFS suppression works with different types of stimuli.

Returning to our chromatic target example, suppose the user has collected data with two types of chromatic masks and obtained different patterns of results. In addition to comparing participant characteristics and data collection procedures between the two datasets, the user could also evaluate the spatiotemporal properties of the two mask sequences and their respective compatibility with the target. The Analysis interface (Fig. 6B) can be launched by selecting the Analysis button on the homepage. Here, the user can add the directories of the mask and target stimuli, and the user is free to edit their choices or refresh the list by clicking on the Reset button. Note that while mask sequences are required to be in a 4D matrix, target images can be loaded as .jpg, .png or .tiff files. Three main analysis categories are provided in the interface, namely, Basic Information, Spatial/Temporal Frequency and Color. Properties grouped under Basic Information provide a summary description of the stimulus and are presented in table format in the analysis output (Fig. 6C). These statistics include RMS contrast, image entropy, and edge density. RMS contrast is the main image contrast measure in the CFS-crafter, as it does not depend on the spatial distribution of contrast in the image, unlike other indices of contrast such as Michelson contrast (Kukkonen et al., 1993). This independence is well suited for the more complex images typically used in CFS, such as Mondrian patterns or natural object images.
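
A short numerical example may help show why RMS contrast suits complex images better than Michelson contrast. The Python sketch below assumes one common definition of RMS contrast (the standard deviation of pixel luminance divided by the mean luminance); the exact normalization used by the CFS-crafter is not stated here, and the function names are hypothetical.

```python
import math

def rms_contrast(pixels):
    """Standard deviation of luminance divided by mean luminance."""
    mean = sum(pixels) / len(pixels)
    variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    return math.sqrt(variance) / mean

def michelson_contrast(pixels):
    """(Lmax - Lmin) / (Lmax + Lmin); depends only on the two extremes."""
    lowest, highest = min(pixels), max(pixels)
    return (highest - lowest) / (highest + lowest)

# A full-contrast square-wave pattern: half the pixels dark, half bright.
grating = [0.0, 1.0] * 50

# A nearly uniform grey image with a single dark and a single bright pixel.
sparse = [0.5] * 98 + [0.0, 1.0]

grating_rms = rms_contrast(grating)
sparse_rms = rms_contrast(sparse)
grating_michelson = michelson_contrast(grating)
sparse_michelson = michelson_contrast(sparse)
```

Both images have a Michelson contrast of 1 because they share the same extreme values, whereas the RMS contrast of the nearly uniform image is far lower than that of the grating. RMS contrast is also unchanged if the pixels are rearranged, since it is computed from the luminance distribution alone.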

The metrics of image entropy and edge density quantify image texture, and these measures have been linked to CFS suppression durations (Han et al., 2021). By definition, image entropy is the level of noise or randomness in an image’s pixel intensity distribution (Gonzalez et al., 2004). Entropy can distinguish images with uniform pattern elements from more textured images. For example, a white noise image will produce a higher entropy value than a square wave grating, as the latter has a more ordered arrangement of pixel intensity and only two grey levels (black and white). Edge density estimates the prevalence of edges in an image or image sequence. Although high spatial frequencies are often associated with edges in an image, edge density is a more direct measure that can capture the absence of edges in phase scrambled images. A major component of the edge density calculation is edge detection, which is typically achieved by considering the gradient changes in pixel intensity (Muthukrishnan & Radha, 2011). The CFS-crafter provides four different detection methods, of which the Canny edge detection method is the default. This is because the Canny method uses two thresholds to detect strong and weak edges, and as a result, it is less susceptible to noise (Muthukrishnan & Radha, 2011). The Approximate Canny method is similar to Canny edge detection, except that it has faster execution times and lower edge detection accuracy. To aid the choice of detection methods, users are advised to review the information buttons before selecting a particular edge detection method. In addition, they can choose to evaluate the results of edge detection with the Preview Edges option (Fig. 6B), which loads a preview page and allows them to save these results as a .mat file (Fig. 6D).
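
Both texture measures can be illustrated with a minimal Python sketch. Entropy is computed here as the Shannon entropy of the grey-level histogram. For edge density, a simple gradient-magnitude threshold stands in for the Canny detector, which is considerably more involved; this substitution, the threshold value, and the function names are assumptions made purely for illustration.

```python
import math
import random
from collections import Counter

def image_entropy(img):
    """Shannon entropy (in bits) of the grey-level histogram."""
    counts = Counter(p for row in img for p in row)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def edge_density(img, threshold):
    """Proportion of pixels whose forward-difference gradient magnitude
    exceeds the threshold (a simple stand-in for Canny edge detection)."""
    height, width = len(img), len(img[0])
    n_edges = 0
    for y in range(height):
        for x in range(width):
            dx = img[y][min(x + 1, width - 1)] - img[y][x]
            dy = img[min(y + 1, height - 1)][x] - img[y][x]
            if math.hypot(dx, dy) > threshold:
                n_edges += 1
    return n_edges / (height * width)

size = 16
# Vertical square-wave grating with 4-pixel bars (two grey levels only).
grating = [[255 if (x // 4) % 2 else 0 for x in range(size)]
           for _ in range(size)]
# White noise with 8-bit grey levels.
rng = random.Random(1)
noise = [[rng.randrange(256) for _ in range(size)] for _ in range(size)]
# A uniform grey field contains no edges and has zero entropy.
uniform = [[128] * size for _ in range(size)]

grating_entropy = image_entropy(grating)
noise_entropy = image_entropy(noise)
grating_edges = edge_density(grating, threshold=50)
uniform_edges = edge_density(uniform, threshold=50)
```

The grating has equal numbers of black and white pixels, so its entropy is exactly 1 bit, well below that of the noise image. Its edge density reflects the three luminance transitions between adjacent bars in each row.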

The Spatial/Temporal Frequency analysis extracts the spatial frequency and temporal frequency amplitude spectra from the input stimuli. Both measures are computed over the entire mask area, because this provides spatial coverage for all possible target sizes and locations (e.g., smaller targets that are presented in randomized locations or targets with the same size as the mask images). Unlike in the Modification interface, spatial frequency detail can be extracted either from 4D mask sequences generated by the CFS-crafter or from single image files. This allows the user to compare spatial frequency information between target and masks. (Note: all images are converted to greyscale before conducting any spatiotemporal analyses. This increases processing efficiency and does not drastically alter the spatiotemporal results.) As temporal frequency information can only be extracted from mask sequences, temporal analyses will not be conducted for single image files. As illustrated in Fig. 6C, each input stimulus is plotted as a separate line for each parameter, allowing easy comparison among the analyzed stimuli. Users who wish to save the graphical results can place their cursor over each graph and select the save icon (Fig. 6C). One can also save the spectrum data with the Save Results to File button, which produces a structure variable in the .mat file format. The spectrum data can be retrieved from this variable using MATLAB coding norms, e.g., analysis_results.frequency_results.spatial_frequency. Finally, basic color statistics can be computed using the Color option. Selecting this option will output the hue, saturation, and value (HSV) colormap as separate layers (i.e., third dimension) of a 3D matrix. Summary statistics for the stimulus’ HSV content are presented as separate tables (e.g., mean hue) and the distributions of HSV values are presented as separate plots (Fig. 6C). As with the other types of analyses, HSV results can also be saved to a .mat file for further use.
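
The kind of HSV summary described above can be sketched with Python's standard colorsys module. This is an illustration only, not the CFS-crafter's code: the function names are hypothetical, RGB values are assumed to lie between 0 and 1, and hue is averaged arithmetically even though it is a circular quantity, which is a simplification.

```python
import colorsys

def hsv_layers(rgb_frame):
    """Split an RGB frame into separate hue, saturation and value layers."""
    hue, saturation, value = [], [], []
    for row in rgb_frame:
        converted = [colorsys.rgb_to_hsv(r, g, b) for r, g, b in row]
        hue.append([h for h, _, _ in converted])
        saturation.append([s for _, s, _ in converted])
        value.append([v for _, _, v in converted])
    return hue, saturation, value

def layer_mean(layer):
    """Arithmetic mean over all pixels of one layer."""
    flat = [p for row in layer for p in row]
    return sum(flat) / len(flat)

# A 2 x 2 frame containing red, yellow, mid-grey and blue pixels.
frame = [
    [(1.0, 0.0, 0.0), (1.0, 1.0, 0.0)],
    [(0.5, 0.5, 0.5), (0.0, 0.0, 1.0)],
]
hue, saturation, value = hsv_layers(frame)
mean_hue = layer_mean(hue)
mean_saturation = layer_mean(saturation)
mean_value = layer_mean(value)
```

Here the grey pixel contributes zero saturation, so the mean saturation falls below 1 even though the other three pixels are fully saturated.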

Continuing with the discussion of similarity between mask and target, imagine a scenario where a given chromatic mask sequence contains more low-pass spatial frequency content than does a target that portrays semantic content (e.g., an angry face). The user now faces two possible explanations for the observed results: 1) the high-level, semantic information of the target, or 2) the low-level spatial frequency differences between the target and mask. To evaluate these possibilities, the user can customize their target stimuli to maximize target/mask spatial compatibility. This is performed in the Modification interface, where the target stimulus can be submitted as a single image in typical image file formats (e.g., .jpg, .jpeg, .png), or as a 4D matrix (in the same file and structure format as those created with the CFS-crafter) if the chromatic target is also temporally modulating. Like mask sequences, temporally modulating targets that are not in the appropriate format can be converted into a 4D matrix with the Convert interface. The same practical limitations in customization apply to target stimuli. For example, phase scrambling is not permitted with orientation filtering in the Modification interface and selecting temporal filtering on a single image file would prompt an error message.

Recommendations when using the CFS-crafter

Processing efficiency

The code supporting the capabilities of the CFS-crafter is programmed to be as streamlined as possible. Nevertheless, it is useful to discuss the performance limits of the application and how to use it more efficiently. Test runs on a macOS computer with an Intel Core i7 processor and an Iris Plus Graphics 655 graphics card required approximately 5 s to conduct a full analysis (i.e., all analysis options selected) on a sequence comprising 120 images (200 by 200 pixels each). A similar period is required to generate a face element mask sequence with the same dimensions. However, we expect processing times to increase with larger image sizes and longer durations, especially since 4D matrices are used to represent mask sequences. For example, designing a 3-min colored mask sequence for a 120-Hz screen would require a total of 21,600 RGB frames. This increase is less steep on more efficient systems, and users are advised to consider their system specifications when processing large image sequences with the CFS-crafter. If a more efficient system is not available, the user may consider conducting the image processing in chunks or looping a shorter mask sequence during the experiment.
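
The arithmetic behind this example, and the suggestion to process in chunks, can be sketched as follows. The Python code is illustrative only: the 200 by 200 pixel frame size is borrowed from the test run above, the bytes per value (1 for 8-bit integers, 8 for double precision) are assumptions about storage rather than a description of how the CFS-crafter holds its matrices, and the function names and chunk size are hypothetical.

```python
def frame_count(duration_s, refresh_hz):
    """Number of frames when the mask is updated on every screen refresh."""
    return round(duration_s * refresh_hz)

def sequence_bytes(n_frames, height, width, channels=3, bytes_per_value=8):
    """Memory needed to hold a height x width x channels x frames matrix."""
    return n_frames * height * width * channels * bytes_per_value

def chunk_ranges(n_frames, chunk_size):
    """Split the frame indices into consecutive (start, stop) ranges."""
    return [(start, min(start + chunk_size, n_frames))
            for start in range(0, n_frames, chunk_size)]

# A 3-min sequence on a 120-Hz screen: 180 s x 120 Hz = 21,600 frames.
n_frames = frame_count(duration_s=180, refresh_hz=120)

# Storage for 200 x 200 RGB frames under two assumed number formats.
bytes_uint8 = sequence_bytes(n_frames, 200, 200, bytes_per_value=1)
bytes_double = sequence_bytes(n_frames, 200, 200, bytes_per_value=8)

# Processing 1,200 frames (10 s) at a time gives 18 chunks.
chunks = chunk_ranges(n_frames, chunk_size=1200)
```

Under these assumptions the full sequence occupies about 2.6 GB as 8-bit integers and about 20.7 GB in double precision, which illustrates why chunked processing or a looped shorter sequence may be preferable on modest hardware.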

Conclusions and discussion

Continuous flash suppression (CFS) is a highly effective technique for erasing a normally visible target from visual awareness for several seconds at a time. Despite its purported effectiveness, the lack of standard procedures has presented challenges in data interpretation and effective CFS usage. To address these issues, we designed the CFS-crafter, an open-source MATLAB application that allows fine-control manipulation and analyses of CFS stimuli with no prior expertise in image processing. We hope that the increased accessibility to finer stimulus control and analyses will facilitate more effective usage of the technique among novice users and experts and promote more in-depth discussions about CFS mechanisms and findings. Lastly, the app and its source code are made available at GitHub, a publicly accessible, online platform (https://github.com/guandongwang/cfs_crafter), along with a discussion forum to encourage collaborative communication. GitHub users may even use the platform’s pull request feature to submit modifications or additional functionalities, which we will review and, if approved, incorporate into the CFS-crafter. We hope that the CFS-crafter will be useful to the research community, and that the open-source nature of the CFS-crafter will inspire additional resources for the effective use and study of CFS.