Multimedia ad exposure scale: measuring short-term impact of online ad exposure

The shift toward fragmented and ubiquitous use of multimedia poses several challenges to our understanding and assessment of multimedia exposure and its effects. This article focuses on multimedia advertising exposure and its impact on consumer behavior. It presents the development of Multimedia Ad Exposure Scale (MMAES) – an instrument designed to measure short-term effects of online multimedia ad exposure in terms of engagement, psychological reactance, awareness and attitude, and purchase intention. The main research challenge has been to identify core dimensions that can reliably measure such exposure, particularly in the context of ad-supported video streaming. The development of MMAES is presented through its conceptualization, operationalization, and an observational study conducted via crowdsourcing. The target group is young adults (ages 18-24, N = 360), digital natives who engage with ad-supported video streaming more than any other user group. Exploratory factor analysis revealed a well-defined four-factor structure of MMAES. The results of the validity and reliability measures show good content and construct validity as well as good overall reliability and very good internal consistency of MMAES. Overall, the results show that MMAES is a reliable instrument for measuring the short-term effects of multimedia ad exposure and its weak ground truth.


Introduction
Multimedia is the predominant form of media in the communications and information industry today. As individuals and groups, we are exposed to a variety of information on different media platforms and from different sources, regardless of time and place [4]. According to the most recent statistics, U.S. adults spend nearly half of their day with media, an estimated 11 hours per day, including traditional and digital media, with younger adults (ages 18-34) spending more than half of their daily media use with digital media [41,48]. Time spent on ad-supported content is also growing. Globally, ad-supported content accounts for 66.5% of the average daily media usage of 7.3 hours [41]. In the U.S., ad-supported content accounts for 86% of total media consumption, and consumers' direct exposure to advertising is estimated at over 90 minutes per day, or about 15% of total daily media consumption [41,48]. One multimedia platform where ad-supported content is growing particularly strongly is video streaming, where ad-supported streaming has overtaken video-on-demand streaming [49].
The widespread use of multimedia in daily life means that its use has far-reaching implications for individuals, businesses and industry, and is the most important currency in the world of advertising [4,15]. Despite its ubiquity, measuring multimedia exposure remains a challenge. There is a lack of systematic research evaluating the validity and reliability of various exposure measurements, and little consensus among researchers on how best to measure exposure [4,15,24,36].
An analysis of over 200 research studies on media exposure conducted by [15] found that there is still no universally accepted conceptualization and operationalization of media exposure and its effects. In addition, many of the established methods for measuring ad exposure, such as frequency measures (how many users saw a particular ad and how often) and self-reports based on consumer recall, were found to be prone to over-reporting (see Section 2). This problem is even more pressing when measuring online exposure, as traditional exposure measures, while still widely used, are also less relevant. Many aspects of consumer behavior and engagement with online multimedia can now be measured at the individual level, which has led to changes in advertising strategies [4,15]. One such example is the redefinition of reach as a key measure of advertising exposure, shifting from traditional broadcast to the general population via mass communication (through traditional TV, radio or print media) to targeted digital advertising that relies on social media platforms and consumer engagement [11,16,35].
The lack of systematic and reliable methods for measuring online multimedia ad exposure can also be attributed to the increasingly fragmented media landscape, constantly evolving digital platforms, changing consumer behavior, and novel multimedia usage patterns and contexts [4,15,24,35,36].
This article presents the development of Multimedia Ad Exposure Scale (MMAES). MMAES is a psychometric instrument designed to measure short-term effects of online multimedia ad exposure. It aims to overcome several shortcomings of traditional methods by providing an instrument that is appropriate for online multimedia, particularly ad-supported video streaming. The novelty of MMAES lies in the following: a) measuring the short-term effects of online multimedia ad exposure through consumer engagement and attitudes, b) developing dimensions and items that can be directly measured and are suitable for psychophysiological measurements with sensors, and c) a weak ground truth estimation of multimedia ad exposure. In addition, the focus is on digital natives, a growing group of younger consumers who are accustomed to online distractions and can divert their attention from advertising [35]. The latter underpins the need for MMAES as a short-term exposure scale.
MMAES was developed through the steps of conceptualization, operationalization, and observational study using crowdsourcing (N=360). Participants were digital natives (18-24 years old) who engage with multimedia and ad-supported video streaming more than any other user group [35]. In addition, expert feedback was used throughout the development of the instrument to specify selection criteria for materials and exposure metrics and to verify the validity and reliability of MMAES. In what follows, we first review related work on multimedia exposure, especially in the context of online advertising (Section 2). We then present the methodology used to develop MMAES (Section 3). Details of the conceptualization and operationalization of MMAES are presented along with the statistical methods used in the construction, validation and reliability testing of MMAES. This is followed by a presentation of materials and an observational study. Section 4 presents the findings. The exploratory factor analysis conducted on the data from the observational study and the results of validity and reliability testing of MMAES are presented. Section 4 concludes with a correlational analysis comparing selected components of MMAES with an established psychometric instrument. Section 5 discusses the results and limitations of the study, and Section 6 presents conclusions and opportunities for future work.

Related work
Measuring media exposure is critical to understanding media use and its effects [15]. In the literature, media exposure is defined as "the extent to which audience members have encountered specific messages or classes of messages/media content." [46, p. 168]. In the context of multimedia, this means that individuals are exposed to a "seamless integration of data, text, images of all kinds and sound within a single, digital information environment." [21, p. 4].
To this end, several studies have been conducted in the past to measure media exposure, particularly in the areas of advertising and communication effects [36,37], social media, news and politics [33,34], among many others. However, several researchers have pointed out that there is still little consensus on how best to measure exposure and a lack of systematic research addressing the validity and reliability of various exposure measures [15,24,36,37]. The limitation of many empirical studies is that they either do not provide a valid and reliable instrument or the instrument provided is too domain-specific and covers aspects of media exposure that are less relevant or cannot be generalized to other domains. In addition, much research still relies heavily on traditional approaches, in particular frequency measures, self-report surveys that are usually based on recall, and various types of 'linkage studies' that combine survey data and media content [15]. Some of these approaches have been shown to be prone to over-reporting and to have low validity and reliability [15,24].

Measuring online ad exposure
Traditionally, advertising exposure is most often measured by viewability (i.e., simply noting that consumers have seen the branded content) and the associated frequency measures [5]. With advances in technology and advertising in digital media, it is now possible and necessary to understand consumer exposure and engagement in brand-related activities beyond just frequency measurement [4,36,37]. For example, novel approaches to measuring online ad exposure can track consumer browsing behavior [14,34], their social media metrics and trends [16], assess consumer behavior in terms of attitude [1,43], engagement [18,28,37], and purchase intention [17,33,35], as well as measure brand familiarity [4], ad intrusiveness [32,43] and advertising effectiveness [35], among others.
At the same time, the shift to digital poses new challenges to the systematic conceptualization and operationalization of online multimedia ad exposure [4,36,37]. Existing empirical studies provide some level of conceptualization and operationalization of such exposure. For example, Dehghani et al. examined university students' consumption behaviors related to YouTube advertising, specifically advertising value, brand awareness, and purchase intention [17]. They developed a four-factor conceptual model based on 315 questionnaires, which included the following dimensions: entertainment, adaptation, informativeness, and irritation. Their results showed that "perceived entertainment and individualization of advertising are the strongest positive drivers of advertising value, while irritation is the negative driver." [17, p. 177]. [28] developed the 10-item CBE, a consumer-brand engagement scale consisting of three dimensions: cognitive processing, affection, and activation. CBE is tailored to brand engagement in different social media contexts. Similarly, [18] developed and validated a 22-item consumer brand engagement scale based on qualitative feedback from consumers and experts, tailored to the context of different online brand communities. In their qualitative analysis of brand engagement, [8] argue for a contextual approach to measuring brand engagement and provide a conceptualization of engagement as a context-specific experience that can vary from brand to brand and product to product.
The above studies provide limited insight into the effects of online ad exposure, as they mostly focus on marketing-specific metrics and understanding the interactions between social media, brand engagement, advertising effectiveness, and purchase intention (e.g., [8,17,18,28,33,35]). In addition, several studies [8,17,35] use traditional methods, both qualitative and quantitative (such as interviews or self-reports), that are not optimized for measuring online ad exposure. In cases where psychometric instruments have been developed specifically to measure online ad exposure, the scales developed are often too narrow in context or scope. This is the case with the two scales developed by [18,28], as both focus primarily on the interactions between brand engagement and social media.
The aim of the following sections is to provide a comprehensive overview of the measurement of online multimedia ad exposure and its effects, and to introduce some key aspects that will later be used in the development of MMAES.

Engagement
The ubiquity of multimedia content and advances in new technologies have changed consumer behavior. To reach consumers in an oversaturated digital media landscape, brands are using innovative tactics to create memorable impressions and positive attitudes to increase engagement, personalization, flow experiences, and social influence, among other things [52].
Engagement is understood by Higgins and Scholer as a psychological process, "a state of being involved, occupied, fully absorbed, or engrossed in something." [26, p. 102]. Some operationalize engagement further in terms of behavioral manifestations in consumer-brand relationships [28]. Still others view engagement in terms of brand messages that elicit passionate and effective responses from consumers, or as enabling co-creative experiences [39]. Consumer engagement is a multidimensional concept that explains cognitive, psychological, and behavioral components of consumer-brand interaction [4,18,25,28,34,38,40]. It is central to digital advertising and a main goal of marketing [4,8]. The main goal is "fostering engagement through meaningful and sustained interactions with consumers" [4, p. 428], because higher consumer engagement leads to more exposure, improves attention and attitude, and potentially has a positive impact on consumer purchase intentions [4,8,18]. Therefore, cognitive, affective and social dimensions of engagement are increasingly considered [4,8]. Engagement is key to the consumer-brand relationship and central to understanding advertising exposure and its effects on consumer behavior, in addition to brand awareness and recall [44,45], reactance [19,43], attitudes [37], and purchase intention [4,33].

Reactance
While promoting engagement can have positive effects in terms of improved attitudes, memorability, and increased purchase intent, there are also potential negative effects. A variety of marketing strategies use different types of persuasive pressure to increase engagement and persuade users to take a particular action or purchase a product [43]. According to various studies, increased exposure to ads that are too assertive, explicit, or intrusive can have negative effects and lead to psychological reactance [19,20,32,43].
Psychological reactance, introduced by Brehm [7], is a reaction to a perceived threat or loss of behavioral freedom and is defined as "a motivational state characterized by distress, anxiety, resistance, and the desire to restore that freedom" [50]. In the context of consumer behavior, reactance plays an important role. A high level of pressure may cause a consumer to perceive an advertisement as a threat to his or her personal freedom, leading to reactance and reactance-related effects, such as negative cognitions and attitudes, negative affect (anger), undesirable decline in brand or product acceptance, and lower compliance [19,20,32,43]. To this end, [43] developed a conceptual framework for advertising intrusiveness that examines both the causes and effects of consumer reactance. They identified temporal, visual, and flow disruptions as the three main causes of reactance. For each interruption, they identified pairs of main subcomponents: time-wasting and forced interruption (e.g., an advertisement that is forced on the consumer without their consent) as the main subcomponents of temporal interruption, intrusive and distracting advertisements as the main subcomponents of visual interruption, and interruptive advertisements (advertisements that interrupt a consumer's task) and irrelevant advertisements as the main subcomponents of flow interruption [43].
These interruptions lead to three types of responses to advertising exposure: emotional, behavioral, and cognitive [43]. The emotional response leads to frustration and anger, the behavioral (physical) response leads to avoidance behavior and interruption of activities (e.g., interrupting an ongoing task), and the cognitive response leads to lack of attention to the advertisement and consequently low explicit recall of the advertisement [43]. In general, reactance can lead to negative attitudes and lower recall, as well as have negative effects on consumers' purchase intentions with respect to the advertised brand [17,20,32,43].

Attitude, awareness, memory, and purchase intention
There are other important aspects to consumer engagement in online advertising. Attitude, relevance, credibility, and likability intertwine with recall-related aspects, such as memory, memorability, familiarity, and awareness of a brand [5,51].
As several studies show, consumers are more likely to remember advertising information for a brand they are familiar with [45,47]. Brand credibility and likability also affect attitudes toward an ad and, in conjunction with personal relevance, purchase intention [47]. Both personal relevance and brand awareness are strong predictors of attention, with ads that are relevant or from well-known brands receiving higher attention [31,47]. Perceived usefulness and enjoyment, as well as social influence, also predict engagement and attitude toward the brand [13].
Creating memorable advertising impressions is important for advertising effectiveness. Memorable advertising impressions can better influence consumer behavior and elicit attitudes with positive emotional and cognitive responses toward an ad or brand [32,45,53,54]. Emotional reactions include actions based on emotions that an ad or brand evokes in a consumer [2,43]. Cognitive reactions include conscious responses of the consumer based on reasoning [43], such as the perceived personal relevance of an ad or brand. In addition, positive emotional responses and creativity have a positive impact on advertising recall and higher recall accuracy and can positively influence consumers' purchase intention [45,47]. The memorability of creative ads is improved because creative ads are "better recognized and remembered than their less creative counterparts" [45, p. 313].
Interactions that consumers have with brands improve brand attitudes and brand awareness and also have positive effects on brand memory. For example, an experiment by [13] examined the effect of brand interactivity from online advertising on brand memory, purchase intention, and brand attitude among teenagers (11-14 years old, N = 576). Results show that brand interactivity increases both brand attitude and brand memory [13]. A previously mentioned study by [17] on advertising effectiveness and social media showed that entertainment, informativeness, and individualization are the strongest positive factors of YouTube advertising and are associated with purchase intention. In an experiment based on psychographic profiles of consumer digital traces (N = 936), [52] examined the effects of trait-based personalization in social media advertising and the conditions under which such personalization is effective. While their results show that personalized advertising leads to higher engagement, the effects of personality on consumer attitudes toward advertised products were limited [52]. A study conducted by [35] focused on two young consumer groups (Millennials and Generation Z age cohorts) to examine how effective different digital advertising strategies are for these digital natives. The results show that these growing groups of younger consumers are accustomed to online distractions and can divert their attention from ads, impacting memorability. Their results also show that short digital ads combined with social media have a positive impact on advertising effectiveness [35].
To our knowledge, and based on the review of the current state-of-the-art, none of the existing research provides a psychometric instrument suitable for measuring the short-term effects of online multimedia ad exposure. Also, none of the existing approaches systematically consider all of the above aspects of consumer behavior related to online multimedia ad exposure. Moreover, the ability of digital natives to effortlessly divert attention from online ads (as reported by [35]) underpins the need to measure short-term ad exposure. We therefore build on the empirical findings of the related work, developing MMAES as a psychometric instrument based on the following key aspects of online multimedia ad exposure: engagement, psychological reactance, attitude, awareness and memory, and purchase intention. The following sections present the research methods used in the development of MMAES.

Methods
The goal of the proposed MMAES instrument is to measure short-term exposure to online multimedia ads. The following sections describe the research methodology that was used to develop MMAES. Conceptualization and operationalization steps were taken to identify several components (dimensions) and items relevant to short-term exposure to multimedia ads. This initial set of multimedia ad exposure items will later be used in the observational study (Section 3.4) to collect participant responses and develop MMAES.

Conceptualization and operationalization
Multimedia ad exposure is a multidimensional construct consisting of several underlying dimensions. The starting points for the conceptualization of MMAES were reviews of existing studies and discussions with media and marketing experts (The Nielsen Company) about which dimensions are relevant for measuring online multimedia ad exposure. Several important aspects and their impact on consumer behavior were considered to specify the underlying construct of short-term online multimedia ad exposure: engagement, attitude, brand awareness and familiarity, memory and memorability, and purchase intention. These discussions led to the selection criteria for participants and materials used in the observational study (Section 3.3).
Operationalization of MMAES involved developing the items to measure the latent dimensions identified in the conceptualization phase using standard statistical procedures (Section 3.2). It drew heavily on the empirical findings from the literature, adapting several relevant components (dimensions) and items from existing validated instruments (described in Sections 2.2, 2.3 and 2.4), as well as developing custom items based on expert feedback. The main selection criterion was to find items with an appropriate conceptual structure that best represented their respective components and supported the overall goal of MMAES: to provide a psychometric, short-term measurement of multimedia ad exposure. The following sections provide further details on the relevant aspects of online multimedia ad exposure and the item sets used in operationalizing MMAES.

Engagement
A growing consensus has emerged around the concept of consumer engagement as a multidimensional concept that encompasses the cognitive, emotional, behavioral, and social components of consumer-brand interaction [4,8]. The items representing the engagement component were adapted from two relevant, validated and widely used scales that together cover all four dimensions (cognitive, emotional, behavioral and social) of consumer engagement. The behavioral and social dimensions of engagement were adapted from the User Engagement Scale Short Form (UES-SF) [39], while the cognitive and emotional dimensions were adapted from the engagement scale of [25]. In total, the engagement component is represented by 19 items. The set of items representing the behavioral and social dimensions includes 12 items adapted from the UES-SF [39]. An example item from the set: "I was absorbed in this advertisement". The items representing the cognitive and emotional dimensions of engagement consist of seven questions adapted from the engagement scale of [25]. An example item from this set: "Describe your mood during watching this Coca Cola commercial in terms of your excitement:".

Reactance
Reactance is represented by eight items adapted from the Reactance Scale for Human-Computer Interaction (RSHCI) [20]. An example item from this set: "This advertisement frustrated me".

Table 1 Components of the initial item set (component: number of items, source)
Engagement (behavioral, social): 12 [39]
Engagement (cognitive, emotional): 7 [25]
Reactance: 8 [20]
Attitude, purchase intention: 20 [6]
Awareness and memory: 8 [15,30]

Purchase Intention and Attitude
Purchase intention and attitude, along with various other aspects of advertising exposure, are represented by 20 items adapted from [6]. According to [6], these items cover the four steps of users' purchase decision process (problem recognition, information search, information evaluation and decision). An example of a purchase intention item: "Would you buy this product?".

Awareness and Memory
Awareness and memory are represented by eight items, seven of which are adapted from [15] and one from [30]. These items focus on brand awareness and recall in relation to unaided recall, aided recall, demonstrated recall, and recognition [15]. An example item for the recall aspect: "What was the last advertisement about?" [30].
Other questions related to demographics, user experience with technology, habits, and mood were also included, 10 in total. Examples from this set include: "What is your overall experience with technology?", "How many days a week do you use video streaming?", "How do you feel at the moment?". In addition, several questions were used as control questions to eliminate inattentive and unreliable participants (see Section 3.4.2 for details).
The initial set of items representing various aspects of online multimedia ad exposure included 55 items, as shown in Table 1. This version was later used in the observational study (Section 3.4) to obtain participants' responses.

Measurements
In operationalizing and developing MMAES, various statistical measurements were made. These are presented along with the final version of MMAES in Section 4. First, an exploratory factor analysis (EFA) was conducted on the data obtained from the participants' responses to the initial set of multimedia ad exposure items used in the observational study. The EFA was conducted to identify the factors underlying MMAES and to develop the final version of MMAES. After the EFA, the identified factors (dimensions) and items were rearranged according to their conceptual structure and in relation to the respective dimensions. The dimensions were named in a way that best represents their underlying conceptual structure. Then, the validity and reliability of MMAES and its psychometric properties were tested using several statistics. Content validity, convergent validity, and discriminant validity were assessed, along with several measures of the reliability of the instrument: the internal consistency (congeneric reliability of the MMAES components) and the composite reliability of MMAES. Finally, a correlational analysis was conducted to examine the correlation between selected MMAES components and items and those of an established instrument. Data processing and analysis, as well as validity and reliability measurements, were conducted in R using the psych library (functions cor and omega) [42] and Python.
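As an illustration of the reliability and convergent validity statistics named above, the following minimal Python sketch computes composite (congeneric) reliability and the average variance extracted (AVE) from the standardized loadings of a single factor. It is not the analysis code used in this study, which relied on the R psych library. The function names and the example loadings are hypothetical, and AVE is assumed here only as one common way to assess convergent validity.

```python
def composite_reliability(loadings):
    """Composite (congeneric) reliability of one factor.

    Assumes standardized loadings, so each item's residual
    variance is 1 - loading**2.
    """
    lam = [float(x) for x in loadings]
    explained = sum(lam) ** 2
    residual = sum(1.0 - x ** 2 for x in lam)
    return explained / (explained + residual)


def average_variance_extracted(loadings):
    """Mean squared standardized loading of one factor (AVE)."""
    lam = [float(x) for x in loadings]
    return sum(x ** 2 for x in lam) / len(lam)


# Hypothetical loadings for illustration only, not results of the study.
example_loadings = [0.7, 0.8, 0.9]
cr = composite_reliability(example_loadings)
ave = average_variance_extracted(example_loadings)
print(f"composite reliability = {cr:.3f}, AVE = {ave:.3f}")
```

In practice, the loadings would come from the fitted factor model rather than being entered by hand.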

Materials
The development of a valid and reliable psychometric instrument requires the specification of clear selection criteria to limit potential sources of uncontrolled variance, such as age and usage-related effects. Materials were selected in collaboration with three media and marketing experts from The Nielsen Company. The goal was to simulate the multimedia content of ad-supported video streaming, which is based on in-video video ads. All materials were in English and broadcast in the U.S.
The materials consisted of main multimedia content (videos with short movie clips) and video ads from the online streaming service YouTube. The materials were then categorized according to the following selection criteria: view index (low vs. high), engagement level (lower vs. higher), brand familiarity (known vs. unknown), and product novelty (daily use vs. novel product).
Videos (main multimedia content) were categorized by the view index and engagement level. The view index was based on the number of views for a selected YouTube video, with a threshold set at one million views. Videos above this threshold were categorized as having a high view index and videos below this threshold were categorized as having a low view index. The threshold of one million views is an approximate estimate of the average number of views for a given content category of videos (see also [23]) based on expert feedback and selection criteria. The engagement level indicates whether a video has higher or lower engagement. The engagement level for the videos was defined by the three media experts.
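The view-index rule described above can be written as a one-line threshold check. The sketch below is illustrative only: the function name and the example view counts are hypothetical, and treating a video with exactly one million views as low is an assumption, since the text does not specify how ties are handled.

```python
VIEW_THRESHOLD = 1_000_000  # one million views, as stated in the text


def view_index(views, threshold=VIEW_THRESHOLD):
    """Return 'high' for view counts above the threshold, 'low' otherwise.

    Assumption: a count exactly equal to the threshold is treated as 'low'.
    """
    return "high" if views > threshold else "low"


# Hypothetical view counts for illustration only.
for views in (40_000, 2_500_000):
    print(views, view_index(views))
```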
Ads were categorized by view index (low vs. high), brand familiarity (known vs. unknown brand), and product novelty (daily use vs. novel product). The specification of the view index was the same as for the videos. A local repair shop is an example of an unknown brand, while a famous perfume is an example of a well-known brand. An example of a daily use product is one used on a daily basis (e.g., a toothbrush), while a novel product can be revolutionary, innovative, exotic, or creative (e.g., an anti-theft device for cars).

Selected materials
The final set of materials consisted of several contrast combinations of videos and video ads, as shown in Table 2. The video ads were integrated into the main content videos to simulate the in-video advertising in a realistic setting. These materials were then used in the observational study (next section).
Contrast combinations were created based on view index vs. engagement level for the videos and view index vs. familiarity/novelty for the ads. Two groups of videos were created based on their respective view indices (low vs. high), with three videos with different engagement levels (lower vs. higher) assigned to each group. Two low engagement videos (Deaf and Rohan) and one high engagement video (Mixtape) were assigned to the group with low view index, and one low engagement video (72 kg) and two high engagement videos (I miss you and Modern Family) were assigned to the group with a high view index, as shown in Table 2.
A similar procedure was used for the ads. Ads were placed in the main videos based on randomly selected time intervals. Two groups of ads were selected based on their respective view indices (low vs. high). Note that brand awareness correlates with view index, as ads for well-known brands tend to have a higher view index than ads for unknown brands. For each set, one ad for a novel product and two ads for a well-known (daily use) product were randomly assigned. Dior Perfume, Little Baby's Ice Cream and Coca Cola are examples of well-known brands with high view index, while Willamette Collision Center, Ravelco Anti-Theft device and Waring Ice Cream Machine are relatively unknown brands with a low view index. Ravelco Anti-Theft device and Little Baby's Ice Cream were categorized as novel.
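The categorization of the selected materials can be summarized as a small data structure. The following sketch encodes only the attributes stated in the text. The field names and the helper function are hypothetical, and the pairing of specific videos with specific ads (Table 2) is deliberately not represented.

```python
from collections import Counter

# Videos with the view index and engagement level given in the text.
VIDEOS = [
    {"title": "Deaf", "view_index": "low", "engagement": "lower"},
    {"title": "Rohan", "view_index": "low", "engagement": "lower"},
    {"title": "Mixtape", "view_index": "low", "engagement": "higher"},
    {"title": "72 kg", "view_index": "high", "engagement": "lower"},
    {"title": "I miss you", "view_index": "high", "engagement": "higher"},
    {"title": "Modern Family", "view_index": "high", "engagement": "higher"},
]

# Ads with the view index, brand familiarity and product novelty given in the text.
ADS = [
    {"brand": "Dior Perfume", "view_index": "high", "familiarity": "known", "novelty": "daily use"},
    {"brand": "Little Baby's Ice Cream", "view_index": "high", "familiarity": "known", "novelty": "novel"},
    {"brand": "Coca Cola", "view_index": "high", "familiarity": "known", "novelty": "daily use"},
    {"brand": "Willamette Collision Center", "view_index": "low", "familiarity": "unknown", "novelty": "daily use"},
    {"brand": "Ravelco Anti-Theft device", "view_index": "low", "familiarity": "unknown", "novelty": "novel"},
    {"brand": "Waring Ice Cream Machine", "view_index": "low", "familiarity": "unknown", "novelty": "daily use"},
]


def count_by(items, first_key, second_key):
    """Count items for each pair of values of the two given attributes."""
    return Counter((item[first_key], item[second_key]) for item in items)


video_cells = count_by(VIDEOS, "view_index", "engagement")
ad_cells = count_by(ADS, "view_index", "novelty")
print(dict(video_cells))
print(dict(ad_cells))
```

Counting the cells in this way makes it easy to confirm that each view-index group contains one novel product ad and two daily use product ads.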

Observational study
The observational study was conducted to collect participants' responses to the initial set of 55 multimedia ad exposure items (see Table 1 and Section 3.1). To obtain a sufficient number of participants, an online study consisting of watching a video including an ad and a survey was conducted via the crowdsourcing platform Clickworker. Participants were rewarded with €4.50 for completing the study.

Participants
The target user group for the observational study was young adults (ages 18-24) who are native English speakers living in the U.S. This age group represents the largest segment of digital natives who engage with online multimedia, particularly ad-supported video streaming. The focus was on short-term exposure to in-video ads, which is common in ad-supported video streaming [10]. To control for technology-related effects of multimedia exposure (e.g., screen size, technology-related usage behavior), the participants were asked to complete the study on their personal computers rather than mobile devices. These constraints allowed us to limit potential sources of uncontrolled variance (such as age-, usage-, and technology-related effects) in the measurement of online multimedia ad exposure.

Procedure
First, informed consent was obtained, informing the participants of the purpose of the study and the precautions taken to protect their privacy and maintain the confidentiality of the data. The participants were then given a brief description of the purpose and duration of the study (less than 15 minutes). The participants were informed that no time restrictions were imposed on them, and they were instructed to set an appropriate volume on their computer to fully experience the multimedia content. They were also instructed to pay attention to the multimedia content, and presented with a general attitude to adopt during the survey: it is the end of a busy week and they are at home watching videos in a relaxed manner when they come across a short video including an ad. The participants were also informed that a control number would be presented at a certain point in the video and that they would have to remember this number as an answer to the control question that would be asked later. This control question was used to assess participants' attention to the content of the video.
The participants were informed about the procedure and the three phases of the study:
1. Pre-questionnaire. Participants were asked to answer five questions about demographics, technology experience, and mood assessment (Table 1).
2. Multimedia phase. Participants watched one of the video/ad combinations from Table 2. The combinations were randomly assigned to the participants, with only one combination assigned to each participant.
3. Main survey. In this final phase, participants' responses were recorded in relation to the multimedia combination assigned to them in the previous phase. The initial set of 55 ad exposure items was used (Table 1), with the participants' responses recorded on a Likert scale for each item. These responses were later used in the exploratory factor analysis (Section 4.1).
In the post-survey phase, the participants were asked about their habits, asked to assess their current mood, and to share their overall impression of the entire survey in writing by answering the question "Please comment on your overall impression!".
In addition, a series of control questions on content, consistency, and attention (gold standard data [30]) were asked to detect and eliminate inattentive or unreliable participants. Toward the end of the video, a control number was displayed in pink in the center of the screen for four seconds. The control question was "What number was shown in the video clip?". The content question was "What was the commercial about?". The consistency question asked for the participant's age to detect participants who had provided misleading information in their profiles (or had someone else conduct the study) and were not detected and removed by Clickworker by default.
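The screening logic described above can be sketched as follows. This is a minimal illustration only: the record fields, the keyword matching for the content question, and the example values are hypothetical and are not taken from the study materials.

```python
from dataclasses import dataclass


@dataclass
class Response:
    # Hypothetical record layout for one participant's control answers.
    participant_id: str
    age: int                      # consistency question
    control_number_answer: str    # "What number was shown in the video clip?"
    ad_topic_answer: str          # "What was the commercial about?"


def passes_screening(r, shown_number, topic_keywords, min_age=18, max_age=24):
    """Return True only if the age, attention, and content checks all pass."""
    in_age_range = min_age <= r.age <= max_age
    number_ok = r.control_number_answer.strip() == shown_number
    answer = r.ad_topic_answer.lower()
    topic_ok = any(keyword in answer for keyword in topic_keywords)
    return in_age_range and number_ok and topic_ok


# Illustrative data (not real study data).
responses = [
    Response("p1", 22, "47", "A commercial for running shoes"),
    Response("p2", 27, "47", "running shoes"),   # outside the age range
    Response("p3", 19, "12", "shoes"),           # wrong control number
    Response("p4", 20, "47", "no idea"),         # failed content question
]
kept = [r.participant_id for r in responses
        if passes_screening(r, "47", ["shoe"])]
print(kept)
```

In practice, free-text answers to the content question would likely need manual review rather than simple keyword matching.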

Results
A total of 360 participants (62.9% female) took part in the study, which lasted nine days. For each of the six multimedia combinations the responses from 60 participants were collected. A total of 98.6% of participants were within the specified age range. Eleven participants were excluded, of whom five (1.4%) were older than 24 years and six participants failed the control tests. In total, data from 349 participants were used for further analysis.
The following section presents the results of the exploratory factor analysis for the data obtained from the observational study. The final version of MMAES is then presented, including the structure of its components and the factor loadings of the items. Several statistical tests are conducted to evaluate the validity and reliability of MMAES. Content validity, convergent validity, and discriminant validity are presented, as well as several measures to assess the reliability, internal consistency (congeneric reliability), and composite reliability of MMAES. Finally, a correlational analysis is presented comparing selected dimensions of MMAES with related constructs from the established instrument User Engagement Scale [39].

Exploratory factor analysis
Exploratory factor analysis (EFA) was conducted to analyze the participants' responses (N = 349) to the initial set of 55 items representing various aspects of online multimedia ad exposure (Section 3.1). The EFA included 43 items out of a total of 65. The remaining 22 items contained textual feedback, binary, or categorical responses that were not relevant to the exposure measure and were removed from the analysis.
The Kaiser-Meyer-Olkin (KMO) test and Bartlett's test of sphericity were performed to measure the adequacy of the data for EFA. The KMO test yielded a value of 0.94, indicating very good sampling adequacy. Bartlett's test yielded p < 0.01, indicating that the data are suitable for dimensionality reduction such as EFA.
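The two adequacy checks can be illustrated with a small sketch. It uses the textbook definitions (overall KMO from squared correlations and squared partial correlations, and Bartlett's chi-square statistic from the determinant of the correlation matrix) and is not the code used in the study. The 3 x 3 correlation matrix is made up for illustration, and the p-value, which would require a chi-square distribution function, is not computed.

```python
import math


def invert_with_det(matrix):
    """Gauss-Jordan inversion with partial pivoting; returns (inverse, det)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        if pivot != col:
            aug[col], aug[pivot] = aug[pivot], aug[col]
            det = -det  # a row swap flips the sign of the determinant
        p = aug[col][col]
        det *= p
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug], det


def kmo(corr):
    """Overall KMO: sum r^2 / (sum r^2 + sum partial^2) over off-diagonals."""
    inv, _ = invert_with_det(corr)
    n = len(corr)
    r2 = p2 = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                partial = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                r2 += corr[i][j] ** 2
                p2 += partial ** 2
    return r2 / (r2 + p2)


def bartlett_chi2(corr, n_obs):
    """Bartlett's test of sphericity: (chi-square statistic, degrees of freedom)."""
    p = len(corr)
    _, det = invert_with_det(corr)
    chi2 = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(det)
    return chi2, p * (p - 1) // 2


# Hypothetical correlation matrix for three items.
R = [[1.0, 0.6, 0.5],
     [0.6, 1.0, 0.4],
     [0.5, 0.4, 1.0]]
print(kmo(R), bartlett_chi2(R, n_obs=349))
```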
Spearman correlation was used to measure correlations between responses (ordinal data). The relationship between item responses is likely to be monotonic and may or may not be linear. The Spearman correlation matrix was entered into the omega function (package psych in R). The default minimum residual (ordinary least squares, OLS) factoring method was used, denoted fm = "minres", along with the default rotation method "oblimin".
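To illustrate why a rank-based coefficient suits ordinal responses, the following sketch computes Spearman's correlation as the Pearson correlation of average ranks (ties share the mean of their rank positions). It is a generic illustration with made-up values, not the R code used in the study.

```python
import math


def average_ranks(values):
    """Assign 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# A monotonic but non-linear relationship still yields a perfect rank correlation.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
print(spearman(x, y))
```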
The EFA identified several relevant factors with their respective items. As shown in Fig. 1, the estimated dimensionality of the model ranged from four to six factors. However, the items for the fifth and sixth factors did not yield meaningful components. The selection criteria for the final number of factors were based on: a) the number of latent factors accounting for most of the variance, using Kaiser's rule (keeping factors with eigenvalues

MMAES
The final version of MMAES consists of four components and 31 items (see Table 4). The decision to use a four-factor structure was based on the selection criteria and the conceptual structure of the items representing each factor. The MMAES components encompass several relevant aspects of online multimedia ad exposure: Ad Engagement (AE), Reactance (RE), Awareness and Attitude (AA), and Purchase Intention (PI). The definitions for the components are presented in Table 3. The structure of MMAES with the correlations between the components and the factor loadings for the items is shown in Fig. 2 and in more detail in Table 4.
A total of 12 items were removed after the EFA due to low communality, low factor loadings (less than 0.35), or low conceptual relevance to the respective component. Of the ad engagement component, four items were eliminated in the final model due to low factor loadings or low conceptual relevance. Two of the seven items covering cognitive and emotional engagement (adapted from [25]) were eliminated due to low loadings and conceptual overlap with items from the UES-SF [39]. Two additional items covering behavioral and social dimensions of engagement (adapted from the UES-SF) were also removed. One of the removed items was adapted from the UES-SF Focused Attention (FA) subscale, "The time I spent for watching this advertisement just slipped away", while the other item was from the UES-SF Perceived Usability (PU) subscale, "This advertisement was taxing (mentally demanding)".
Several items taken from the UES-SF PU subscale showed high factor loadings in the MMAES Reactance (RE) component. These items represent a negative emotional experience in relation to an ad, which is consistent with the definition of the MMAES RE component. The remaining seven PU items were loaded into the MMAES AE component as expected.
The EFA results are consistent with the MMAES RE component and the items adapted from [20]. Only one item, "I don't even want this product as a present", was removed from the RE component due to low loading. The items of the MMAES Awareness and Attitude (AA) component were distributed based on their conceptual structure and factor loadings, as shown in Table 4.
In what follows, the results of validity and reliability testing of MMAES are presented. Because MMAES is a novel instrument that measures a complex construct with multiple latent factors, a comprehensive reliability assessment was required.

Validity of MMAES
Several methods were used to assess the validity of MMAES. Content validity was evaluated in collaboration with the media experts, in accordance with the guidelines proposed by [27]. Convergent validity and discriminant validity were also assessed (see Section 4.3.2).

Content validity
Content validity is a subjective assessment of how well MMAES items cover relevant aspects of ad exposure. Five multimedia experts were asked to evaluate the psychometric properties of MMAES by assigning items to the four components (factors) based on their professional judgment. The assessment was conducted online. The experts were informed about the aim and procedure of the assessment. They were provided with a list of MMAES components and their definitions (see Table 3) and a list of the items (listed in Table 4, but without the factor loadings and the assignments of the items to the components). The assessment consisted of four pages, one for each component, with all items listed on each page. The experts were asked to rate the extent to which each item represented each component on a five-point rating scale (1 = not at all, 5 = very well).
Q-correlations were calculated between the expert ratings and the existing MMAES component structure. First, the expert ratings of the items for each component were averaged. Items with a rating of 4 or more were classified as representative of that component. These scores were then q-correlated with the MMAES components. The results of the content validity measurement are presented in Table 5. All q-correlation coefficients show a strong positive correlation (above 0.7), demonstrating very good content validity of MMAES (as defined by [27]).
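The steps above (averaging the expert ratings, classifying items with a mean of 4 or more as representative, and correlating the result with the component structure) can be sketched for a single component. The article does not spell out the exact computation, so this sketch assumes that the q-correlation is a Pearson correlation between the binarized expert profile and a 0/1 vector marking the items assigned to the component by the EFA. All ratings and assignments below are hypothetical.

```python
import math


def mean_ratings(ratings_by_expert):
    """Average the ratings per item across experts (rows = experts)."""
    n_experts = len(ratings_by_expert)
    return [sum(col) / n_experts for col in zip(*ratings_by_expert)]


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


# Hypothetical ratings (1-5) of six items for one component by five experts.
expert_ratings = [
    [5, 4, 5, 4, 1, 2],
    [4, 5, 4, 4, 2, 1],
    [5, 5, 4, 4, 1, 3],
    [4, 4, 5, 4, 1, 2],
    [5, 4, 4, 5, 2, 1],
]
# Hypothetical EFA assignment: 1 if the item belongs to this component.
efa_assignment = [1, 1, 1, 0, 0, 0]

means = mean_ratings(expert_ratings)
# Items with a mean rating of 4 or more count as representative.
expert_profile = [1 if m >= 4 else 0 for m in means]
q = pearson(expert_profile, efa_assignment)
print(expert_profile, round(q, 3))
```

In this made-up example the experts judge one additional item as representative, so the agreement with the EFA structure is high but not perfect.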

Convergent and discriminant validity
Construct validity was assessed using convergent and discriminant validity, based on the measure of average variance extracted (AVE). Convergent validity tests the extent to which two measures of a construct are related [9]. In our case, convergent validity measures how close the loaded items are to their respective components. As shown in Table 6, the AVEs calculated for each component are all above 0.5, indicating sufficient convergent validity. Discriminant validity tests whether two measures that should be unrelated are in fact unrelated [22]. In our case, discriminant validity measures the relatedness between components (subscales). We compare the square roots of AVE with the maximum values of the absolute correlation of each component with the other components (see Table 7). All square roots of AVE are above the maximum absolute correlation for each component with respect to the other three MMAES components, indicating sufficient discriminant validity.
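Both checks can be sketched in a few lines. The sketch assumes that AVE is computed as the mean of the squared standardized loadings of a component, which is the usual definition, and compares the square root of AVE with the largest absolute inter-component correlation. The loadings and correlations are hypothetical and do not reproduce Tables 6 and 7.

```python
import math


def ave(loadings):
    """Average variance extracted: mean of squared standardized loadings."""
    return sum(l * l for l in loadings) / len(loadings)


def discriminant_check(loadings_by_component, correlations):
    """For each component, compare sqrt(AVE) with its largest |correlation|."""
    results = {}
    for name, loadings in loadings_by_component.items():
        sqrt_ave = math.sqrt(ave(loadings))
        max_corr = max(abs(r) for pair, r in correlations.items()
                       if name in pair)
        results[name] = (sqrt_ave, max_corr, sqrt_ave > max_corr)
    return results


# Hypothetical standardized loadings and inter-component correlations.
loadings = {
    "AE": [0.80, 0.75, 0.70],
    "RE": [0.72, 0.70, 0.74],
    "PI": [0.90, 0.85],
}
correlations = {
    ("AE", "RE"): -0.45,
    ("AE", "PI"): 0.60,
    ("RE", "PI"): -0.30,
}

for name, (sqrt_ave, max_corr, ok) in discriminant_check(
        loadings, correlations).items():
    print(name, round(ave(loadings[name]), 3), round(sqrt_ave, 3), max_corr, ok)
```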

Reliability of MMAES
Several measures were used to evaluate the reliability of MMAES: overall reliability, congeneric reliability, composite reliability, instrument reliability, and saturation estimation. According to [12], the incorrect use of reliability measures (such as Cronbach's α) is often due to the fact that the underlying assumptions of the measurement model (such as tau equivalence) have not been previously verified. Using the Chi-square difference test (p < 0.01), we excluded the more restricted measurement models (e.g., parallel, tau-equivalent) and chose a unidimensional, congeneric measurement model to assess the reliability of MMAES.

Overall reliability
The multidimensional bi-factor measurement model was used to measure the overall reliability of MMAES [12]. The estimated bi-factor reliability coefficient for MMAES is ρ_BF = 0.51, indicating that the model captures approximately 51% of the variance in the entire dataset.
For better comparability with other studies, internal consistency was further checked by using Cronbach's alpha as the lower bound of the reliability coefficient (despite the fact that the data are not unidimensional). The obtained Cronbach's alpha for all items of the EFA is α = 0.82, indicating good internal consistency. The Cronbach's alpha for the final set of items loaded into MMAES is α = 0.83, also indicating good internal consistency.
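For reference, Cronbach's alpha can be computed from raw item responses as in the following sketch, which uses the standard formula α = k/(k-1) · (1 - Σ item variances / variance of the total score). The response matrix is hypothetical and serves only to show the calculation.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows = participants, columns = items."""
    k = len(rows[0])
    item_vars = [variance(col) for col in zip(*rows)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical five-point ratings from six participants on four items.
ratings = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 3, 4],
    [1, 2, 1, 2],
    [4, 4, 5, 4],
]
print(round(cronbach_alpha(ratings), 3))
```

Because alpha assumes tau-equivalent items, it is best read as a lower bound here, in line with the text above.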

Congeneric reliability
The reliability of the MMAES components was assessed using the congeneric measurement model. The results are shown in Table 8. All coefficients are above 0.7, indicating good internal consistency. In addition, all components except RE are above 0.9, indicating excellent internal consistency.
As an alternative to congeneric reliability, McDonald's ω_t was also measured. The results are shown in Table 8 along with Cronbach's alpha as the lower bound of the reliability coefficient for the congeneric measurement model for each component.
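As an illustration of a congeneric reliability coefficient, the sketch below uses the common formula ρ_C = (Σλ)² / ((Σλ)² + Σ(1 - λ²)), which assumes standardized loadings λ from a unidimensional model with uncorrelated errors. The article does not state its exact estimator, so this is an assumption, and the loadings are hypothetical.

```python
def congeneric_reliability(loadings):
    """rho_C from standardized loadings, assuming uncorrelated item errors."""
    sum_sq = sum(loadings) ** 2                        # (sum of loadings)^2
    error_var = sum(1 - l * l for l in loadings)       # sum of unique variances
    return sum_sq / (sum_sq + error_var)


# Hypothetical standardized loadings for one component.
component_loadings = [0.80, 0.75, 0.70]
rho_c = congeneric_reliability(component_loadings)
print(round(rho_c, 3))

# Adding items with comparable loadings increases the coefficient.
print(round(congeneric_reliability(component_loadings * 2), 3))
```

Unlike Cronbach's alpha, this coefficient lets each item have its own loading, which is why it fits the congeneric model chosen above.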

Composite reliability
Composite reliability was estimated to evaluate the joint variance of the items for each MMAES component [22]. The results are presented in Table 8. All composite reliability coefficients are above 0.9.

Saturation
To gain insight into the saturation of the MMAES components, the omega hierarchical ω_h was estimated. The omega hierarchical measures the part of the variance for the loaded items of each component. Table 8 shows the results. The coefficient omega hierarchical (ω_h) shows good model saturation. Note that saturation was measured separately for each component and not as a one-dimensional general factor.

Correlational analysis: relation to other scales
A correlational analysis was conducted to compare selected MMAES components with selected related constructs from an already established and validated instrument. The goal was to investigate whether conceptually related constructs from two different, but related, instruments exhibit a sufficient degree of correlation.
To this end, the correlation between selected components of MMAES and the User Engagement Scale Short Form (UES-SF) [39] was examined. Three MMAES components were compared to the UES-SF: Ad Engagement (AE), Reactance (RE), and Awareness and Attitude (AA). Purchase Intention (PI) was not included in the comparison because it is less relevant in the context of user engagement. The three MMAES components were compared to the following components of the UES-SF: UES-FA (Focused Attention), UES-PU (Perceived Usability), UES-AA (Aesthetic Appeal), and UES-RW (Reward) [39]. The correlations are shown in Table 9. There is a strong positive correlation (above 0.7) between MMAES AE and the components UES-FA, UES-AA, and UES-RW. This was to be expected since several items representing the component MMAES AE were taken from the UES-SF (see Table 1). There is also a strong negative correlation between the MMAES RE component and UES-PU. This is consistent with previous research and with the definitions of both components, as reactance represents a negative reaction in terms of acceptance or compliance, while the perceived usability component measures ease of use.
Other correlations show small to moderate effects.

General observations
MMAES aims to fill a gap in measuring the short-term effects of online multimedia ad exposure. To our knowledge, and based on an extensive literature review, none of the existing approaches provide a psychometric instrument suitable for measuring such exposure. As outlined in Section 2.1, the existing empirical studies either consider a narrower subset of ad exposure and its effects, which is typically marketing-oriented, or the psychometric instrument provided is not applicable in the context of short-term multimedia ad exposure. Furthermore, the ability of digital natives to effortlessly divert their attention from online ads (as reported by [35]) underscores the need for MMAES as an instrument to measure short-term ad exposure. MMAES measures the short-term effects of online ad exposure, taking into account both positive and negative effects, as well as consumers' attitudes toward the brand and their purchase intentions (Section 4.2).
The results show that MMAES is valid, reliable, and internally consistent. The items load strongly on their respective components, with the reported AVE above 0.5 indicating sufficient convergent validity. The discriminant validity of the individual components (subscales) was also judged to be adequate, with the square roots of AVE above the maximum absolute correlation for each subscale relative to the other three subscales. Our targeted reliability assessment was based on the unidimensional congeneric measurement model of the components (subscales). The proposed model captures approximately 51% of the variance for the entire set of items included in the EFA, as estimated by the bi-factor reliability coefficient ρ_BF = 0.51, which is sufficient. The congeneric reliability coefficients for the individual components were estimated to be quite high, with three of the components above 0.9 and one (reactance) above 0.7. The omega hierarchical ω_h indicates very good reliability. All composite reliability coefficients are above 0.9, and the saturation of the subscales is also very good (above 0.75). Cronbach's alpha is reported as another measure of internal consistency to allow comparison with the other measures. Cronbach's alpha for the items loaded into MMAES is quite high (α = 0.83), indicating very good internal consistency.
In a correlational analysis, relevant components and items of MMAES were compared with those of an already established psychometric instrument to determine how well the proposed instrument captures certain aspects of ad exposure. MMAES was compared with the User Engagement Scale Short Form (UES-SF) [39]. The analysis showed a strong positive correlation (over 0.7) between the engagement component MMAES AE and the engagement subscales of UES-SF: focused attention (UES-FA), aesthetic appeal (UES-AA), and reward (UES-RW) (see Table 9).
The correlational analysis also showed a strong negative correlation between the component MMAES-RE and UES-PU. This is an important result that supports the role of MMAES-RE as a separate component for measuring psychological reactance, rather than just a set of negative aspects under the engagement component. It is also consistent with the results of previous research (Section 2.3). The reactance component represents a set of negative reactions related to acceptance or compliance, while the perceived usability component measures ease of use. The reactance component is also negatively correlated with the other three MMAES components (as shown in Fig. 2). On the other hand, and consistent with the empirical results (Section 2.4), positive engagement and attitude toward an advertised product or brand, as well as purchase intention, are positively correlated.

Applicability of MMAES
MMAES is a multidimensional psychometric instrument (described in Section 4.2). It was developed specifically for online collection of participant responses and is based on item-rating scale combinations that can be easily implemented using modern web technologies. It can be used as a whole or in part by selecting one or more components relevant to a particular use case. As a whole, MMAES takes less than 15 minutes to complete. For studies with multiple iterations, MMAES should be performed separately for each iteration or task. In such cases, and when only certain aspects of online ad exposure are the focus, selecting only relevant components is preferable to maintain participant attention and reduce fatigue.
Recommendations for administering and scoring MMAES can be found in the Appendix. It is important to note that the MMAES scoring is not an absolute standard. Rather, what constitutes a high or low level for a particular component (e.g., a high level of engagement) is context- and task-dependent. It should always be interpreted by the researchers conducting the study in terms of what is relevant in their case and in comparison to benchmarks in their field.
To summarize, the key strengths of MMAES are that it:
• is easily implemented with modern web technologies;
• is tailor-made for ad-supported video streaming;
• is fast and flexible: it can be used in whole or in part, depending on the context and task at hand;
• is easy to administer, evaluate, and interpret (simple scoring of the components, partially or as a whole);
• supports repeated measures design, as well as within-subject and between-task evaluation;
• enables quantitative profiling of a single ad or a group of ads;
• is suitable for psychophysiological measurements with sensors (especially the psychometric properties of the components MMAES AE and RE);
• provides a weak ground truth estimate of short-term ad exposure that can be compared (in part or as a whole) with relevant established scales.

Shortcomings
The generalizability of MMAES remains to be tested. The focus of MMAES on short-term multimedia ad exposure also has its drawbacks. It was developed with certain constraints and focuses on digital natives and ad-supported video streaming. It may not be as effective for other user groups, other types of multimedia ads, other areas of multimedia, or other advertising strategies. In addition, due to its design, MMAES is not best suited to measure the long-term effects of ad exposure, which include other relevant aspects such as user habits [3], long-term advertising effectiveness, and potential decay of the effects of advertising exposure (e.g., memory loss) [47], among others. Further analyses are needed to assess the robustness and generalizability of MMAES in new contexts.

Conclusions
The article presented the development of MMAES, a psychometric instrument to measure the short-term effects of online multimedia ad exposure through consumer engagement, psychological reactance, awareness and attitude, and purchase intention. MMAES aims to overcome several shortcomings of existing methods and scales. Existing approaches provide limited insight into the effects of online ad exposure, focusing predominantly on marketing-specific metrics and understanding the interactions between social media, brand engagement, advertising effectiveness, and purchase intention. In addition, several approaches still use traditional methods that are not optimized for measuring online ad exposure. In cases where psychometric instruments have been developed specifically for measuring online ad exposure, the scales developed are often too narrow in context or scope.
The presented research focused on identifying key components and items for MMAES and evaluating its psychometric properties by conducting an exploratory factor analysis and multiple validity and reliability measurements. The results show that MMAES is a valid and reliable psychometric instrument for measuring online multimedia ad exposure, especially for ad-supported video streaming. It targets a growing group of younger consumers who are accustomed to online distractions and may divert their attention from the ads.
The novelty of MMAES is in measuring the short-term effects of such an exposure. It also provides a ground truth estimate for developing alternative approaches to measuring online ad exposure, including analysis of social network data and psychophysiological sensor-based methods. In addition, it enables experimental quantitative profiling of online advertising in different contexts (such as YouTube and other ad-streaming services) and for different consumer groups.
Future research will focus on evaluating the robustness and generalizability of MMAES and further refining its psychometric structure. To this end, we plan to conduct several tests of MMAES in new contexts and settings (e.g., mobile) and with different consumer groups. Confirmatory factor analysis will be conducted along with other external validity measures to assess the generalizability of the instrument.

Appendix: MMAES administration and scoring
Administration
The following instructions are recommended for the administration of MMAES. Items should be rated using the five-point rating scale shown in Table 10. The MMAES items are listed in Table 11. It is recommended that component IDs not be provided when administering the instrument, but that only the items be visible to participants. The wording of the questions (especially the underlined words) can be adapted to the specific use case. For example, the instructions to participants can be worded as follows: "The following questions aim to rate your experience of watching the advertisement. For each question, please use the following scale and indicate which most closely applies to you."

Scoring
1. Scores should be calculated as the mean for each component and for each participant (that is, sum the item scores in each component and divide by the number of items).
2. A total MMAES score per participant should be calculated by summing the mean scores of the components.
3. In cases where participants completed the MMAES more than once in the same experiment, such as in a repeated measures design, calculate separate scores for each participant for each iteration/task. This allows for evaluation of exposure within participants and between tasks.
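The scoring steps above can be sketched as follows. The item IDs, the number of items per component, and the ratings are hypothetical placeholders (the actual items are listed in Table 11), and the sketch applies no reverse scoring because none is described here.

```python
from statistics import mean

# Hypothetical item-to-component mapping; real item IDs come from Table 11.
ITEMS_BY_COMPONENT = {
    "AE": ["AE1", "AE2"],
    "RE": ["RE1", "RE2"],
    "AA": ["AA1", "AA2"],
    "PI": ["PI1", "PI2"],
}


def score_mmaes(ratings, items_by_component=ITEMS_BY_COMPONENT):
    """Return (mean score per component, total score) for one administration."""
    component_scores = {
        component: mean(ratings[item] for item in items)
        for component, items in items_by_component.items()
    }
    total = sum(component_scores.values())  # sum of the component means
    return component_scores, total


# Repeated measures: one set of ratings per (participant, task) pair.
sessions = {
    ("p1", "task1"): {"AE1": 4, "AE2": 5, "RE1": 2, "RE2": 1,
                      "AA1": 3, "AA2": 4, "PI1": 5, "PI2": 4},
    ("p1", "task2"): {"AE1": 2, "AE2": 3, "RE1": 4, "RE2": 4,
                      "AA1": 3, "AA2": 3, "PI1": 2, "PI2": 2},
}
results = {key: score_mmaes(r) for key, r in sessions.items()}
for key, (scores, total) in results.items():
    print(key, scores, total)
```

When only some components are administered, passing a reduced mapping to `score_mmaes` scores just those components; the total is then not comparable with the full-instrument total.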