Using a virtual reality interview simulator to explore factors influencing people's behavior

A virtual reality interview simulator (VRIS) provides an effective and manageable approach for candidates who are prone to severe nervousness during interviews, yet the major anxiety-inducing elements remain unknown. During an interview, the anxiety levels, overall experience, and performance of interviewees might be affected by various circumstances. By analyzing electrodermal activity and questionnaires, we investigated the influence of five variables: (I) \textit{Realism}; (II) \textit{Question type}; (III) \textit{Interviewer attitude}; (IV) \textit{Timing}; and (V) \textit{Preparation}. To this end, an orthogonal design $L_8(4^1 \times 2^4)$ with eight experiments ($OA_8$ matrix) was implemented, in which 19 college students took part. Considering the anxiety, overall experience, and performance of the interviewees, the results indicate that \textit{Question type} plays a major role; \textit{Realism}, \textit{Preparation}, and \textit{Interviewer attitude} each have some degree of influence; and \textit{Timing} has little to no impact. Specifically, professional interview questions elicited a greater degree of anxiety than personal ones. This work contributes to our understanding of anxiety-stimulating factors during job interviews in virtual reality and provides cues for designing future VRIS.


INTRODUCTION
The global pandemic in recent years has aggravated a situation in which many companies are hiring fewer employees, which has led to a series of anxieties among college students [48]. One of these is interview anxiety disorder (IAD), in which anxiety manifests itself in the form of speech disturbances, socially inappropriate behaviors, and other nervous jitters [44]. Furthermore, past research has shown that as interview anxiety increases, people tend to develop more protective self-presentational tactics, making it difficult for interviewees to perform well and decreasing their competitiveness [19]. Virtual reality has become a common tool in the therapeutic field, for example for substance use [3], high-functioning autism [14,39], and eating disorders [12]. Virtual reality exposure therapy (VRET) is an important treatment for various anxiety disorders. By mimicking social circumstances for patients with social distress [16] and using computer-generated virtual scenes [38], VRET exposes patients to an environment with virtual social situations and interactions that target diverse social fears in a controllable way. For example, one study indicated the utility of VR in inducing stressful reactions in healthy subjects through a combination of stressors [40]. However, the factors that actually influence an interviewee's anxiety, overall experience, and interview performance in a VRET interview are still poorly understood, especially when multiple factors are assembled in one system; our research therefore serves as a pioneering investigation of this question.
Researchers have also conducted extensive research on systems for VR interview training and evaluation. For example, a virtual job interviewing practice system has been designed for high-anxiety populations such as people with Autism Spectrum Disorder and former convicts [28]; a VR-based job interview training platform has been developed for autistic individuals to practice interviewing skills in a less anxiety-inducing virtual context [2]; an agent-based VR training and multidimensional evaluation system has been created for introverted college students to cope with interview anxiety [36]; and a job training simulation environment with social cue recognition techniques has been presented for young people who are not in employment, education, or training [22].
Nevertheless, most studies today validate and evaluate an entire product without examining the multiple factors separately. Not only do we know very little about the significant elements that truly influence interview anxiety and the overall experience (e.g., cognitive load, discomfort [58,59]), but we also lack a solid understanding of how each factor affects the interviewee's external performance (e.g., verbal expressions, eye contact, body movements). For graduates to be competitive in the job market, and for employers to select prospective workers, candidates must demonstrate strong performance during job interviews and have practical anxiety management abilities. Therefore, a study of the factors contributing to people's interview anxiety is required to create successful training programs and develop tailored therapy approaches.
In terms of anxiety, previous studies have found that visual display [43], interview questions [29], the interviewer's attitude [42], timing [53], and preparation [20] can all have an impact on an interviewee's anxiety. In addition to feelings of anxiety, other studies have investigated the impact of virtual reality on the overall experience (e.g., discomfort, eyestrain, psychosocial stress). Some studies have found that the apparatus has an impact on the quality of experience, with HMDs causing higher eyestrain and visual discomfort than PCs [56]; others have found that the V-TSST (virtual environment versions of the TSST) is effective at inducing psychosocial stress, which can lead to poor physical and psychological health outcomes, though the magnitude of this response is smaller than with the traditional TSST [31]; and Kothgassner et al. have indicated that perceived social presence did not differ over time in the VR TSST conditions, as the main effect of time was not significant [41].
Meanwhile, interviewees' performance has also been shown to be influenced by factors within a virtual environment. For example, prior studies have identified the relationship between the number of completed virtual interviews and improved interviewing skills or performance as the mechanism for getting a job offer [55]. The results of Barreda-Ángeles et al. have shown that, compared to a neutral audience, a negative audience elicited increases in skin conductance level and heart rate variability, decreases in voice intensity, and a higher ratio of silent parts in the speech, as well as a more negative self-reported valence, higher anxiety, and lower social presence [6].
As such, we developed the VRIS, where the above factors were introduced and investigated within an orthogonal design to examine their significance separately.
In this paper, we ask whether the above five factors, (I) REALISM; (II) QUESTION TYPE; (III) INTERVIEWER ATTITUDE; (IV) TIMING; and (V) PREPARATION, significantly influence an interviewee's anxiety during a job interview. We hypothesized that all of these factors could have a pronounced effect on interview anxiety. Considering the above, the current study proposes five research questions (RQs) with the following hypotheses (H):
1. RQ1: How do different interview questions affect the interviewee?
H1: Professional interview questions can cause more anxiety, worse overall experience, and poorer performance than personal questions.
2. RQ2: How does preparation for an interview affect the interviewee?
H2: Being unprepared for an interview can make interviewees more nervous and uncomfortable than being well prepared for the content of the interview.
3. RQ3: How does the timing of the answers to interview questions affect the interviewee?
H3: Timed answers can be more nerve-wracking than untimed answers, leading to worse performance.
4. RQ4: How do different levels of realism affect the interviewee?
H4: A more realistic scenario would make interviewees more nervous and reduce their eye contact with the interviewer.
5. RQ5: How does the interviewer's attitude affect the interviewee?
H5: Compared to an interviewer with a positive attitude, an interviewer with a negative attitude will elicit increases in skin conductance response, decreases in eye contact, a higher cognitive load during the interview, as well as more unsatisfactory performance.
We then conducted an orthogonal experiment with eight different interview conditions, using a mixed-level $L_8(4^1 \times 2^4)$ orthogonal table that includes all five factors above, to examine the significance of each factor for interview anxiety, overall experience, and performance. In addition, we measured the interviewee's electrodermal activity (EDA) during the interview, given that increased EDA has been associated with anxiety. In all eight conditions, we asked the interviewee to fill out a self-rated anxiety questionnaire and the NASA-TLX once the interview was completed. The interviewer also rated the interviewee's performance during the interview. Using a mixed-effects model and the associated post-hoc analysis of the data collected from the questionnaires and electrodermal activity, we found that all five factors had different levels of influence on interviewees' anxiety, overall experience, and interview performance. Among them, TYPE OF INTERVIEW QUESTIONS had the most significant impact; in particular, the professional questions significantly increased the interviewee's anxiety, discomfort, electrodermal activity, and cognitive workload.
The proposed work aims to identify anxiety-stimulating factors during job interviews in virtual reality and provides detailed insights into designing future VRIS.

RELATED WORK
Relevant prior work includes studies of psychotherapy, online interview systems, and virtual reality interview training. This section summarizes those works separately.

Psychotherapy in Virtual Reality
Since the inception of virtual reality, several psychotherapy studies have been undertaken in virtual settings, and the idea of employing them to treat psychiatric illnesses has been investigated. As concern about social phobia gradually increased, North et al. first used virtual reality exposure therapy for social phobia: they developed a virtual auditorium that could be triggered in real time [50]. The audience and audio clips would respond to the speaker's voice in the auditorium, prompting the speaker to speak louder. Based on feedback from the questionnaire results, the experiment demonstrated that Virtual Reality Exposure Therapy (VRET) effectively mitigated anxiety symptoms during presentations. Although the feedback in the experiment was mainly auditory, it highlights the need for further research on the relationship between VRET and human psychology.
Further studies have implied that there are three preconditions for the treatment of anxiety disorders through VRET: immersion, anxiety, and presence [34]. Parsons et al. reported 21 case studies [51] confirming that VRET can effectively treat arachnophobia (Garcia et al. [21]), flying phobia (Banos and Botella [5]), phobia of public places (Botella et al. [8]), and acrophobia (Coelho et al. [13]).
Some researchers have emphasized social cognition interaction training for autistic youngsters. Didehbani et al. designed a virtual reality social cognitive training to improve the social skills of children with ASD [14]. The research findings, which tested emotion detection, social attribution, attention, and executive functioning, revealed that the virtual reality platform successfully ameliorated the social deficits typical of ASD.
In addition to virtual reality social cognitive training for children with ASD, Burke et al. designed Virtual Interactive Training Agents (ViTA) to develop social skills and reduce anxiety in young people with ASD and other developmental disorders [11]. According to their studies, participants who have received ViTA training are much better at recognizing their abilities, promoting themselves, self-advocating, and responding to situational queries. In contrast, our study relates virtual reality to job interviews, the most prevalent situation in business practice. In the era of the metaverse, we anticipate that job interviews will migrate to the virtual realm. At the same time, our research exposes user behaviors in the context of psychological factors and virtual job interviews.

Virtual Reality and Online Interview Systems
Due to the global pandemic and rising economic costs, firms are progressively incorporating online video interviews into their hiring and appointment procedures. In addition, the rapid development of artificial intelligence (AI) has driven applications of automatic scoring, in which AI interview systems score interviewees' hard and soft skills based on the content of their answers, facial movements, eye contact, and speaking tones in the video [30]. This increases the effectiveness of interviews and helps professionals assess their candidates more quickly.
Online interview formats have driven the development of online interview simulation training. For example, Aysina et al. created Job Interview Simulation Training (JIST) to improve psychological preparation for job interviews among the pre-retirement unemployed [4]. The experiment showed that having interviewees practice interviews repeatedly in a stress-free environment made them much more psychologically ready for the actual interview(s), which could help demonstrate the relationship between JIST and increased re-employment among pre-retirement job seekers in the future.
Other studies have found that highly interactive virtual reality role-play training based on behavioral learning principles is more effective than traditional role-play training in training other types of interpersonal skills. Smith et al. tested the feasibility and effectiveness of a role-play simulation, "virtual reality job interview training" (VR-JIT), for improving job-related interview content and performance-related interview skills in individuals with ASD [54]. Their findings demonstrate that VR-JIT can improve job interview skills in individuals with ASD. In the context of current online interview systems, our research investigates the viability of conducting job interview preparation in immersive settings. Additionally, we specifically focus on the effect of different factors on interviewees' anxiety levels.

Orthogonal experimental design
The orthogonal experimental design is an efficient method for studying the effects of multiple factors within one system simultaneously, in contrast to the conventional approach of studying each factor separately by varying one variable while fixing the rest. It selects representative combinations from a full-scale test according to the modern algebra of Galois theory [1,35]. The orthogonal table, based on orthogonality, ensures that the effects of all factors are obtained with a minimum number of trials.
We identified five independent variables that potentially affect an interviewee's anxiety level during a job interview in virtual reality: (I) LEVEL OF REALISM (4 levels); (II) TYPE OF INTERVIEW QUESTIONS (2 levels); (III) INTERVIEWER ATTITUDE (2 levels); (IV) TIMED OR UNTIMED ANSWERS (2 levels); and (V) WITH OR WITHOUT PREPARATION (2 levels). In order to determine the relative importance of these five variables and find out which factor most stimulates the interviewee's anxiety, we constructed an orthogonal fractional factorial design, arranging the tests with a mixed-level $L_8(4^1 \times 2^4)$ orthogonal table that covers all five variables in a total of eight sets of conditions, as shown in Table 1. For example, in the seventh experimental group, the participants used the Oculus Quest 2 HMD in a virtual job interview environment and were asked ten professional questions without preparation before the interview began. Each answer was timed for 30 seconds by an interviewer with a negative attitude (e.g., passive body movements and negative verbal feedback).
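To illustrate how eight runs can cover a 4 × 2 × 2 × 2 × 2 design, the following is a minimal sketch that builds a standard textbook $L_8(4^1 \times 2^4)$ array and checks its pairwise balance. It is an assumption-laden illustration, not a reproduction of Table 1: the row order, the level labels, and the function names `build_l8_4x2_4` and `is_orthogonal` are hypothetical.

```python
from collections import Counter
from itertools import combinations, product

# Factor names follow the text; level labels and row order are illustrative
# assumptions and are NOT taken from the paper's Table 1.
FACTORS = {
    "realism": ["PC", "VR1", "VR2", "REAL"],
    "question_type": ["professional", "personal"],
    "interviewer_attitude": ["positive", "negative"],
    "timing": ["timed", "untimed"],
    "preparation": ["prepared", "unprepared"],
}


def build_l8_4x2_4():
    """Return the 8 runs of a standard L8(4^1 x 2^4) array as level indices."""
    runs = []
    for a, b in product(range(4), range(2)):
        hi, lo = a >> 1, a & 1  # two binary columns merged into the 4-level column
        # Remaining 2-level columns are XOR combinations over GF(2).
        runs.append((a, b, hi ^ b, lo ^ b, hi ^ lo ^ b))
    return runs


def is_orthogonal(runs, n_levels):
    """Check that every pair of columns contains each level pair equally often."""
    for i, j in combinations(range(len(n_levels)), 2):
        counts = Counter((run[i], run[j]) for run in runs)
        n_pairs = n_levels[i] * n_levels[j]
        expected = len(runs) // n_pairs
        if len(counts) != n_pairs or any(c != expected for c in counts.values()):
            return False
    return True


runs = build_l8_4x2_4()
n_levels = [len(levels) for levels in FACTORS.values()]

# Translate level indices into readable condition descriptions.
conditions = [
    {name: levels[idx] for (name, levels), idx in zip(FACTORS.items(), run)}
    for run in runs
]

# Size of the full factorial design that the orthogonal table replaces.
full_factorial_size = 1
for n in n_levels:
    full_factorial_size *= n

for number, condition in enumerate(conditions, start=1):
    print(number, condition)
print("orthogonal:", is_orthogonal(runs, n_levels))
print("runs:", len(runs), "of", full_factorial_size)
```

Under these assumptions, the eight runs stand in for the 64 combinations of a full factorial design while keeping every pair of factors balanced, which is the property the orthogonal table relies on.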
The first independent variable, (I) LEVEL OF REALISM, corresponded to the visual display of the interviewer. We were aware that a video conference (e.g., through Skype or Tencent Meeting) would be considered quite "real", as an online meeting using video conferencing is generally a well-accepted form of remote interview. In our context, "realism" refers to the level of immersion. Previous studies have found that higher visual display levels provoke more anxiety and a stronger sense of presence [43]. Therefore, to create four different levels of realism, we set four conditions representing a continuous spectrum of immersiveness from the least to the most immersive: an interviewer presented by video conference on a laptop computer (PC), a low-poly, cartoon-like 3D avatar representing the human interviewer (VR1), a realistic interviewer with a high-fidelity 3D human avatar (VR2), and a face-to-face real human interviewer (REAL) (see Fig. 2). In the PC condition, interviewees conducted video interviews with live interviewers via video conference. In the VR1 and VR2 conditions, animated sequences of the interviewer's avatar were played automatically by the software in a VR headset, and a conversation with a live interviewer was conducted via voice conference. Each of the interviewers' avatars had a full-body presentation, but each of the 19 interviewees was represented in the virtual environment only by both hands. In the REAL condition, the interviewee had a face-to-face interview with a live interviewer (see Fig. 1).
The second independent variable, (II) TYPE OF INTERVIEW QUESTIONS, corresponded to two categories of questions: professional inquiries and personnel interview questions, since a previous study has also shown that different types of interview questions have an impact on an interviewee's performance [29]. The professional inquiries covered the professional knowledge the interviewee had learned in college, which involved computer networking, operating systems, programming and algorithms, linear algebra, databases, and principles of computer organization. Each question had a corresponding correct answer and examined the interviewee's memory, logical thinking ability, reaction speed, and mastery of professional knowledge. These professional questions were based on university final exams, internship interviews, job interviews, and interview questions for the graduate school review. Since there was no established standard for what kind of job the participants were applying for, the personnel questions used in the simulated interviews were general and typical job interview questions. The questions were divided into five categories: basic personal information, personality assessment, emotional control, organizing and planning skills, and creative questions. The pressure and difficulty of the questions increased across these five categories, and the questions were eventually divided into four sets of ten questions each, with a similar level of difficulty and no repetitions. The answers to these questions were open-ended and mainly based on the interviewee's review and summary of their experience and their evaluation of themselves. Moreover, we prepared different interview questions for each of the eight groups of interviews, so there were eight separate sets of interview questions in total; each category (i.e., professional and personnel) had four sets of interview questions of the same difficulty, with each set including ten interview questions. All the interview questions are in the additional materials.
The third independent variable, (III) INTERVIEWER ATTITUDE, corresponded to the interviewer's attitude (mainly body language and tone of voice) and response to the interviewee's performance during the interview process. A previous study found that participants' anxiety was affected more by the attitude of virtual avatars than by the avatars' level of realism [42]. We designed two types of interviewers, with positive and negative attitudes. Both interviewers would give interviewees real-time responses based on their performance. The positive interviewer would respond with positive feedback on the interviewee's answers (if the interviewee did well, the interviewer would respond "Excellent, exactly right."; if the interviewee did not perform well, the interviewer would reply "It's okay, there is no rush, please take your time to think about it.") and with positive animations (e.g., greeting, handshaking, listening with full attention, nodding, acknowledging; see Fig. 2). The negative interviewer would start the interview by emphasizing "I will only ask all the questions once and will not repeat them, so listen carefully." During the interview, the negative interviewer would give negative feedback on the interviewee's answers (if the interviewee was unable to answer or answered incorrectly, the interviewer would respond "You can't answer such a simple question?", "Totally wrong, it's all learned knowledge.", or "Organize your language more clearly, time is up.") and with negative animations (e.g., shaking head, yawning, pouting, rubbing shoulders, looking around impatiently, talking on the phone, or texting; see Fig. 2).

The fourth independent variable, (IV) TIMED OR UNTIMED ANSWERS, corresponded to whether each of the interviewee's answers was timed. There is also past literature on the effects of timed and untimed questions on student performance and anxiety [49,53]. In the case of a task without time limitation (i.e., no timing), each interview question could be answered for any length of time; in the case of a time-sensitive task (i.e., timing), the interviewer would time the interviewee for 30 seconds on each interview question and interrupt the interviewee's answer as soon as the time was up.
The fifth independent variable, (V) WITH OR WITHOUT PREPARATION, corresponded to whether the interviewee was given 5 minutes to prepare for that round of the interview before the job interview began. During the 5 minutes, the interviewee could review the ten interview questions for that round and could search for information or memorize the relevant materials distributed by the staff to structure their answers in advance.

Apparatus
Fig. 1 shows our experiment setup. The whole system had four different settings for the four levels of realism. In all conditions, participants were asked to wear an E4 wristband on their left wrist, and a smartphone on the side displayed the real-time physiological data acquisition without being seen by the participant. For the video-conferencing condition, interviewers conducted video conferences with the interviewee via Tencent Meeting on a laptop computer with the Windows 10 operating system, an NVIDIA GeForce RTX 3060 GPU, and a 16.1" monitor with a resolution of 1920 × 1080. The cartoon VR and realistic VR conditions used the same experimental equipment setup. Interviewers conducted a Tencent Meeting call with the interviewee on a laptop computer while the interviewee wore a Meta Oculus Quest 2, a standalone headset with an internal, Android-based operating system, a display of 1832 × 1920 pixels per eye at 90 Hz, and 6 GB of LPDDR4X RAM. Through a fiber-optic link cable, we connected the socket of the VR headset to the USB socket of the laptop computer, which allowed us to cast the scene rendered in the VR headset directly to the laptop computer through the SideQuest application. The picture on the laptop computer was then screen-shared with the interviewer through Tencent Meeting so that the interviewer could give verbal feedback in real time according to the animations of the avatar and the interviewee's performance. The interviewee could also hear the interviewer's spoken replies in the Tencent Meeting call on the laptop computer linked to the VR headset. For the onsite meeting condition, interviewees had a face-to-face interview with real human interviewers at the real site, as shown in Fig. 3.

Application
The Unity applications consisted of scenes and interviewers' avatars, and the Unity3D version we used for developing the application was 2020.3.25. In order to focus only on the realism of the avatar itself, we excluded the interference of different environments by making the virtual environment the same as the physical environment. We built a virtual interview scene based on a real interview scene by using the abundant 3D models in the Unity Asset Store. Fig. 3 shows the real interview site and the virtual interview scene. We designed the corresponding avatars based on two real females (see Fig. 2). In order to present avatars with different realism (cartoon and realistic), we used two different modeling approaches. The cartoon avatar was developed using Ready Player Me, a free web platform that automatically generates an avatar resembling a real person from an uploaded selfie. The realistic avatar was created using Avatar SDK, an advanced avatar creation toolkit that uses AI to create photorealistic and lifelike 3D avatars from selfie photos. To rig the skeletons and animate the avatars, we used the Mixamo auto-rigging tool: we uploaded the models to the Mixamo website and selected the animations needed by the corresponding avatars for download (e.g., handshaking for the positive interviewer and pouting for the negative interviewer). The interviewers' avatars were programmed to perform various autonomous animations using the corresponding animator controller. With the animator components added, the avatars would automatically play the pre-customized animation sequence with natural transitions. Using the Unity XR Interaction Toolkit, we built our application for the Android platform as "apk" files, which run on an Oculus Quest 2 headset.

Participants
The participants were recruited from our university campuses and took part with voluntary consent. They were undergraduate students facing internship, job, and graduate school review interviews in the next year or two, and were therefore likely to be the primary users of the VRIS. Nineteen university students (11 male, 8 female) participated in this experiment, aged between 19 and 21 (M = 19.9, SD = 0.64). Participants received ¥100 each for their participation. Each participant performed the interviews in all eight experiment conditions within four days, with two interviews per day (i.e., once in the morning and once in the evening) and a 12-hour interval between the two sessions. Each interview lasted approximately 5-10 minutes. The order of the experimental conditions was counterbalanced to reduce sequence effects.
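The text does not state which counterbalancing scheme was used. As one possible illustration only, the sketch below assumes a cyclic (Latin-square style) rotation of the eight condition numbers across the 19 participants and maps each order onto the four-day, twice-daily schedule described above. The function name `rotated_orders` and the slot labels are hypothetical.

```python
def rotated_orders(n_conditions, n_participants):
    """Cyclic rotation: participant p starts at condition (p mod n_conditions) + 1.

    This is an assumed scheme for illustration, not the authors' procedure.
    """
    return [
        [(start + k) % n_conditions + 1 for k in range(n_conditions)]
        for start in range(n_participants)
    ]


# Eight session slots: four days, one morning and one evening session per day.
SESSIONS = [(day, part) for day in range(1, 5) for part in ("morning", "evening")]

orders = rotated_orders(n_conditions=8, n_participants=19)

# schedule[participant] is a list of ((day, part), condition_number) pairs.
schedule = {
    participant: list(zip(SESSIONS, order))
    for participant, order in enumerate(orders, start=1)
}

for slot, condition in schedule[2]:
    print(slot, "-> condition", condition)
```

With 19 participants and eight conditions, such a rotation cannot be perfectly balanced (19 is not a multiple of 8), so some starting conditions occur three times and others twice.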

Procedure
The flow chart of the experimental protocol is shown in Fig. 4. Experiments were conducted in a controlled studio setting where the room temperature was set to 25°C-27°C with indoor air conditioning. The studio was also designed to resemble a pleasant, calm lounge where users may use the VRIS to mimic job interviews. We assumed that many real-life interviews would take place in a similar setting. Before all interviews began, participants were asked to fill out the Measure of Anxiety in Selection Interviews (MASI) and the General Self-Efficacy (GSE) scale in order to filter out exceptional cases of being too nervous about the interview (or even suffering from a related illness) or not nervous at all. In our interview experiment, participants were told to imagine that this was a real job interview scenario and that each interview question asked by the interviewer needed to be thought through and answered carefully. During the interview, a staff member sitting beside the interviewee collected physiological data from the bracelet. At the end of each interview, the interviewees were asked to fill out two questionnaires (a NASA Task Load Index (NASA TLX) and a self-assessment anxiety questionnaire) based on their experience during the interview they had just completed, and the interviewer rated the interviewees' performance on a score sheet for the round.

Data Analysis
In order to comprehensively assess interviewee anxiety, overall experience, and interview performance, we combined subjective questionnaires and objective physiological signals. The questionnaires were divided into interviewees' self-perceptions and interviewers' observations and evaluations, and we used electrodermal activity, in particular skin conductance response, as a quantitative approach to measure interviewees' anxiety.
Questionnaires: Interviewees were asked to fill in a paper version of the questionnaires to measure self-rated anxiety both before and after the interview. Before starting the experiment, each interviewee was asked to fill in personal information, the Measure of Anxiety in Selection Interviews (MASI), and the General Self-Efficacy (GSE) scale. Then, after each set of experiments, interviewees were asked to complete the NASA Task Load Index (TLX) and an anxiety self-assessment questionnaire.
The Measure of Anxiety in Selection Interviews (MASI) was used to measure interviewees' self-rated interview anxiety. MASI is a concise and practical measurement tool that comprehensively assesses multiple aspects of job interview anxiety [47]. The MASI includes measures of interview anxiety across five dimensions: communication anxiety, appearance anxiety, social anxiety, performance anxiety, and behavioral anxiety. Each dimension includes six questions, for a total of 30 questions. When more than 35% of an interviewee's answers to the MASI questions score above three on the five-point response scale (1 = strongly disagree, 5 = strongly agree), the interviewee is considered to display considerable anxiety in at least some aspects of the interview process [62]. Based on this criterion, i.e., at least 11 of the 30 questions with a score greater than 3, the eligible samples of MASI results were screened and aggregated, with nine interviewees experiencing substantial interview anxiety, representing 47% of the total number of interviewees.
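The screening rule above can be sketched as follows. This is a minimal illustration assuming that each interviewee's MASI responses are available as a list of 30 integers from 1 to 5; the function name `has_substantial_anxiety` and the constants are hypothetical. Integer arithmetic is used so that "more than 35%" of 30 items resolves to at least 11 items without floating-point rounding.

```python
MASI_ITEMS = 30        # five dimensions x six questions, as described in the text
HIGH_SCORE = 3         # answers strictly above 3 on the 1-5 scale count as high
THRESHOLD_PERCENT = 35 # "more than 35%" of the answers


def has_substantial_anxiety(responses):
    """Return True if more than 35% of the MASI answers score above 3."""
    if len(responses) != MASI_ITEMS:
        raise ValueError(f"expected {MASI_ITEMS} responses, got {len(responses)}")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("responses must be integers from 1 to 5")
    n_high = sum(1 for r in responses if r > HIGH_SCORE)
    # n_high / 30 > 35/100, written without division to avoid rounding issues.
    return n_high * 100 > THRESHOLD_PERCENT * len(responses)


# Hypothetical response patterns, not data from the study.
eleven_high = [4] * 11 + [3] * 19
ten_high = [4] * 10 + [3] * 20
print(has_substantial_anxiety(eleven_high), has_substantial_anxiety(ten_high))

# Share of the sample reported in the text: 9 of 19 interviewees.
share_percent = round(100 * 9 / 19)
print(share_percent)
```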
Self-efficacy is a central concept in Bandura's social cognitive theory, where "efficacy expectations are presumed to influence the level of performance by enhancing intensity and persistence of effort". Further examination revealed that self-efficacy significantly influences human behavior (e.g., stress reactions, self-regulation, coping, achievement striving, career pursuits) [60]. A 10-item version of the General Self-Efficacy (GSE) scale [37] was used to measure the interviewees' ability to deal with various stressors in their life and, in particular, to have control over their actions in the interview setting. The scale uses a 4-point Likert metric with the options "1 = Not at all true, 2 = Hardly true, 3 = Moderately true, 4 = Exactly true", giving a total score range of 10-40. The Chinese version of the GSE [61] has been validated by Zhang and Schwarzer to have good reliability and validity. People with different levels of self-efficacy feel, think, and act differently. At the level of feelings, self-efficacy is often associated with depression, anxiety, and helplessness. Generally, a score of 20 or below on the GSE scale indicates low self-efficacy. After analyzing the data, we calculated that the average score of the participants was 25.21; 18 participants scored between 20 and 30, i.e., they met the criteria of "high self-efficacy," while only one participant scored 17, indicating "low self-efficacy and sometimes low confidence."
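As a small illustration of the GSE scoring described above, the sketch below sums ten item responses on the 1-4 scale and applies the "20 or below" cutoff. The function names `gse_total` and `is_low_self_efficacy` are hypothetical, and the example responses are invented for illustration rather than taken from the study.

```python
GSE_ITEMS = 10
LOW_SELF_EFFICACY_CUTOFF = 20  # a total of 20 or below indicates low self-efficacy


def gse_total(responses):
    """Sum ten GSE responses, each an integer from 1 to 4 (total range 10-40)."""
    if len(responses) != GSE_ITEMS:
        raise ValueError(f"expected {GSE_ITEMS} responses, got {len(responses)}")
    if any(r not in (1, 2, 3, 4) for r in responses):
        raise ValueError("responses must be integers from 1 to 4")
    return sum(responses)


def is_low_self_efficacy(total):
    """Apply the cutoff stated in the text."""
    return total <= LOW_SELF_EFFICACY_CUTOFF


# Hypothetical response pattern that yields a total of 17.
example = [2, 2, 2, 2, 2, 2, 2, 1, 1, 1]
total = gse_total(example)
print(total, is_low_self_efficacy(total))
```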
The National Aeronautics and Space Administration Task Load Index (NASA TLX) [27] is a subjective workload rating scale. It uses six dimensions to assess workload: mental demand, physical demand, temporal demand, performance, effort, and frustration. Interviewees are asked to rate each dimension on a twenty-step bipolar scale with a score from 0 to 100 (0 = Very Low, 100 = Very High). NASA TLX has been shown to have the highest factor validity and was rated the best in its ability to represent workload among four subjective workload scales [32]. This scale measures the mental, performance, and psychological effects of the interview on the interviewee across different experimental groups.
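The text does not say how the six ratings were aggregated. As a hedged sketch, the code below assumes the unweighted "raw TLX" variant, i.e., the mean of the six subscale ratings, and assumes that a twenty-step 0-100 scale corresponds to increments of 5. The function name `raw_tlx` and the dictionary keys are hypothetical.

```python
TLX_DIMENSIONS = (
    "mental_demand",
    "physical_demand",
    "temporal_demand",
    "performance",
    "effort",
    "frustration",
)


def raw_tlx(ratings):
    """Unweighted mean of the six subscale ratings (assumed aggregation)."""
    missing = [d for d in TLX_DIMENSIONS if d not in ratings]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    for dimension in TLX_DIMENSIONS:
        value = ratings[dimension]
        # Twenty steps between 0 and 100 are assumed to mean increments of 5.
        if not 0 <= value <= 100 or value % 5 != 0:
            raise ValueError(f"{dimension} must be a multiple of 5 in [0, 100]")
    return sum(ratings[d] for d in TLX_DIMENSIONS) / len(TLX_DIMENSIONS)


# Hypothetical ratings for one interview, not data from the study.
example_ratings = dict(zip(TLX_DIMENSIONS, (0, 20, 40, 60, 80, 100)))
print(raw_tlx(example_ratings))
```

If the weighted version of the index was used instead, the pairwise-comparison weights would replace the equal weighting assumed here.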
We also designed a self-assessment questionnaire with three questions for interviewees to rate their levels of anxiety, subjective discomfort, and eye contact avoidance. Studies have found that less eye contact is strongly associated with the interviewee being more anxious, more uncomfortable, and less well behaved, which equates to more nervousness [33]. The anxiety and discomfort scales both ranged from 0 to 100, and interviewees chose one of four options to describe their eye contact: no eye contact, glance, gaze, or gaze with a smile. Each choice was then mapped to a scale of 0-100 for analysis.
Prior research has suggested that interviewers can detect interview anxiety with reasonable accuracy [18]. Therefore, interviewers evaluated each interview session with a performance rating scale of three dimensions, each scored from 0 to 100, to capture their perceptions of the interviewee's anxiety and behavior during the interview. First, the interviewee's nervousness was scored through non-verbal behaviors, such as shaking hands, stiff movements, or little eye contact. Second, the interviewee's performance was scored on the accuracy, logic, and time control of their answers. Third, the interviewee's communication skill was scored by the ratio of pauses, errors, stammering, slurring, and verbal chanting throughout the interview. The self-assessment questionnaire and performance rating scale can be found in the appendix.
Physiology Measures: The physiological measure related to feelings of anxiety was mainly electrodermal activity (EDA). We used the Empatica E4 wristband, which has two electrode sensors, to record these data. It captures conductance (the inverse of resistance) by passing a minimal amount of current between two electrodes in contact with the skin. When a person experiences emotional activation, increased cognitive load, or physical exertion, the brain sends signals to the skin, the pores begin to fill below the surface, and conductance increases in a measurable way [25]. One component of EDA is the phasic component, which refers to the faster-changing elements of the signal: the skin conductance response (SCR) [10]. Tonic components of the EDA signal, such as the skin conductance level (SCL), vary with individual differences and changes in the experimental setting, so analyzing them requires a recorded baseline. Our analysis, however, focused mainly on the event-related skin conductance response (ER-SCR), a kind of phasic data in which specific events (e.g., visual stimuli or stressful events) induce corresponding SCRs and in which individual differences and changes in time and environment play little role. Recording a baseline was therefore not mandatory for the ER-SCR analysis in our experiment. The ER-SCR was extracted from the EDA data as follows:
(1) Raw data collection: The data were downloaded from the E4 wristband website right after each interview. The Unix timestamps of the raw EDA data were then converted to the local time of the lab.
(2) ER-SCR extraction: We used the neurokit2 package, a Python toolbox for neurophysiological signal processing. Feeding the raw EDA signal to the functions that neurokit2 provides returns the number of SCR occurrences, the mean amplitude of the SCR peaks, and other SCR information for further analysis [46].
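Step (1) above can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions rather than the authors' script: it assumes the exported EDA file stores the session start as a Unix timestamp in UTC on its first row, the sampling rate in Hz on its second row, and one conductance sample per row after that. The function name, the fixed UTC offset used for the lab, and the sample values are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def eda_rows_to_local_time(rows, utc_offset_hours):
    """Attach a local timestamp to every EDA sample.

    rows is assumed to be laid out as
    [start_unix_utc, sampling_rate_hz, sample_0, sample_1, ...].
    """
    start_unix, rate_hz, samples = rows[0], rows[1], rows[2:]
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    start = datetime.fromtimestamp(start_unix, tz=local_tz)
    # Sample i was recorded i / rate_hz seconds after the session start.
    return [
        (start + timedelta(seconds=i / rate_hz), value)
        for i, value in enumerate(samples)
    ]


# Invented recording: starts at Unix time 1650000000 and is sampled at 4 Hz.
rows = [1650000000.0, 4.0, 0.31, 0.32, 0.35, 0.40, 0.38]
stamped = eda_rows_to_local_time(rows, utc_offset_hours=8)
```

With local timestamps attached, each SCR can be aligned with the logged time of an interview event before the signal is passed to a toolbox such as neurokit2 for peak extraction.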

RESULTS
All participants completed the orthogonal experiment, so each participant contributed repeated measures; we therefore opted for a mixed-effects model (also named "multilevel model" or "hierarchical model") to analyze the data [7]. A mixed-effects model includes fixed effects, which describe the systematic influence of the experimental factors on the response variable, and random effects, which absorb variation across sampled units such as participants. To account for the effect of individual differences on the interview experience, we set the participant as a random effect in this model. The other variables presented in Table 1 are fixed effects. The mixed-effects model can show whether one factor has a higher impact on the response variable than the other factors, whereas post-hoc analysis compares the effects of the different levels within one factor. We therefore carried out both the mixed-effects analysis and the post-hoc analysis. Data analysis was performed in R, and no significant interaction effects between any combination of independent variables were detected. The significance level was set to .05. We also applied corrections when performing the post-hoc analysis.
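One way to write such a model is shown below. The random-intercept form and all notation are assumptions made for illustration; the text specifies only that individual differences enter as a random effect and that the five factors are fixed effects.

$$
y_{ij} = \beta_0 + \sum_{k=1}^{K} \beta_k x_{ijk} + u_i + \varepsilon_{ij},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
$$

Here $y_{ij}$ is the response of participant $i$ ($i = 1, \dots, 19$) in condition $j$ ($j = 1, \dots, 8$), $x_{ijk}$ are indicator variables for the factor levels, $\beta_k$ are the fixed effects, $u_i$ is the participant-specific random intercept, and $\varepsilon_{ij}$ is the residual error. Under dummy coding, the four-level factor contributes three indicators and each of the four two-level factors contributes one, so $K = 7$.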

Overall Experience
To get a full picture of the interviewees' experience, we collected their Cognitive load, Discomfort, and Avoidance of eye contact through the NASA-TLX and the subjective self-assessment questionnaire.

Interview Performance
The interviewers' feedback on the interviewees' performance and ability includes Communication skill (e.g., the ratio of pauses, errors, stammering, slurring, and verbal chanting) and Overall performance (e.g., accuracy, logic, and time control). Better communication skill means a more fluent, accurate, and constructive oral response to the interview questions, while better performance indicates more correct, logical, and adequate answers within the time limits. The mixed-effect rANOVA results are given in Table 8.

DISCUSSION
Regarding the five research questions and the corresponding hypotheses about the effects of the different variables, we found that each factor played a specific role: Question type had the greatest impact; Interviewer attitude, Preparation, and Realism followed with approximately the same effect; and Timekeeping had the smallest effect. Our findings support some of the hypotheses and contradict others. Professional questions, being unprepared, timed answers, and negative interviewers did indeed cause more anxiety than their respective opposites. For Realism, however, we had predicted that a greater level of realism would result in greater anxiety, but this turned out not to be the case: Realistic VR, rather than the real-person interview, had the greatest anxiety-inducing effect.
Regarding the independent variables, we found that Question type has the strongest effect of all the independent variables. In particular, professional questions have an adverse effect on almost every dependent variable and dimension. Considering anxiety, professional questions lead to more self-perceived, SCR-embodied, and interviewer-rated anxiety; for overall experience, they cause more discomfort, more cognitive load, and less eye contact; and for interview performance, they decrease communication skills. The only exception is overall performance, on which Question type has no significant impact. There has been little research on the impact of question types on interview anxiety, and the impact of many question variables remains poorly understood, for example whether the question is open or closed [23], experience-based or situational [15], or demands 'lower-order' or 'higher-order' thinking [9]. Our research focused on professional and personal questions related to job interviews. Our data suggest that professional questions cause more cognitive load than personal questions. Susan Gee et al. [23] mentioned that answering a recall question requires more cognitive processing than answering a recognition question, which offered valuable insight into our study. The professional questions in our study required a memory search and thus can be regarded as recall questions. In contrast, the personal questions, which provided specific cues, were very similar to recognition questions. However, Susan Gee et al. used a sample of 157 children aged nine to thirteen, whereas our experiments targeted college students.
Next, Interviewer attitude, Preparation, and Realism all have considerable effects on the dependent variables. A negative interviewer causes more SCR-embodied anxiety, more interviewer-rated anxiety, less eye contact, and worse performance. Joung Huem Kwon found that anxiety level was affected more by the attitude of the virtual interviewer than by its level of realism [42], whereas our findings do not support the claim that the impact of attitude necessarily outweighs that of realism. Joung Huem Kwon's experiment focused only on virtual humans and did not include a real human interviewer. Moreover, the indicators of the interviewee's anxiety in that study were limited to physiological measurements (i.e., the percent rate of gaze fixation and eye blink). In a similar study, Patrick Gebhard [22] designed two types of virtual recruiters: an understanding one with friendly facial expressions and a warm tone, and a demanding one with unfriendly facial expressions and a cold tone. The participants felt that the demanding character induced a higher stress level than the understanding character and made them less comfortable, which is in line with our findings on the overall experience.
Similarly, no preparation before an interview leads to more self-perceived anxiety, more discomfort, a higher cognitive load regarding frustration, and worse performance. This is consistent with a previous study, which suggested that job-seekers perform better in job interviews when they are better prepared and have rehearsed answers to common interview questions, and that the experiential practice of mock interviews may enhance students' preparation for real-world job interviewing [26].
The influence of Realism is more complicated because this variable has four levels. The mixed-effects model indicates that Realism has a significant impact on self-perceived anxiety, discomfort, cognitive load, eye contact, and communication skills. Further post-hoc analysis showed that Realistic VR induces more self-perceived anxiety, more discomfort, a higher cognitive load regarding frustration, and less eye contact than PC and even Real Person. This contradicts our prior hypothesis that Real Person would cause more anxiety than Realistic VR, yet it is reasonable and in line with many previous findings that VR is effective in inducing stress [17,57,63]. In addition, no dependent variable differs significantly between Realistic VR and Cartoon VR, except that Realistic VR reduces eye contact compared to Cartoon VR. This aligns with Jean-Luc Lugrin's finding that graphical details or the level of realism of the avatar's visual display reveal no significant differences [45].
Lastly and unexpectedly, Timekeeping has the least impact, which appears only in interviewer-rated anxiety and performance. Specifically, keeping time increases interviewer-rated anxiety and cognitive load regarding physical and temporal demand, and it also leads to worse performance. Nevertheless, the ability to finish tasks under time urgency is crucial, and a previous study has validated a Virtual Training System for improving time-limited decision skills and learning performance [52]. Although our research focused on interview performance rather than learning performance, both studies indicate the potential of virtual reality as a training tool.
Regarding the dependent variables, the results indicate that Anxiety is influenced most by Question type, then by Interviewer attitude, and least by Timekeeping, Preparation, and Realism. Overall experience is influenced most by Question type, Preparation, and Realism, and then by Interviewer attitude and Timekeeping. Performance is affected by all five variables to almost the same degree, with Preparation having slightly more impact. We further investigated the association between dependent variables and found consistent associations between self-perceived anxiety, SCR-embodied anxiety, and interviewer-rated anxiety, especially for Question type and Realism, where Realistic VR tends to induce more anxiety than PC and Real Person. However, we found an inconsistency between the self-perceived performance collected in NASA-TLX and the interviewer-rated performance: interviewees tend to believe that their performance is influenced by Question type, Preparation, and Realism, while interviewers think that it is mainly affected by Interviewer attitude, Timekeeping, and Preparation, so both sides agree only on Preparation. The inconsistency might arise because the interviewer had a full-body avatar, whereas the interviewee was physically represented in virtual reality only by both hands, so the interviewers could judge only the interviewee's voice, without facial expressions or eye contact, when rating performance. Therefore, the interviewer's evaluation might need to be completed in virtual reality. For the interviewer to evaluate the interviewee's performance more comprehensively, the interviewees could also use more expressive avatars, such as a customized avatar with facial and motion capture that conveys their feelings, facial expressions, and body movements in real time. A previous study also showed that facial animation could increase the enfacement illusion and avatar self-identification [24].
Our findings have several implications for the optimization and development of VRIS: (1) professional questions and an interviewer with a negative attitude can markedly induce anxiety during an interview; (2) VR interviews can indeed be used effectively to produce an experience similar to a real interview, inducing the same or even more anxiety and discomfort in interviewees than real-person interviews; (3) low-fidelity avatars can provide the same user experience, anxiety level, and cognitive load as high-fidelity ones while placing lower demands on computational performance, latency, network load, and hardware; (4) preparation is still the critical element for good performance; and (5) during an interview, self-perceived anxiety and interviewer-rated anxiety are approximately the same, which means that the interviewer can detect the interviewee's tension level well.

LIMITATION
Quantifying anxiety is difficult: electrodermal activity responses may not accurately capture the transient nature of anxiety, and they are influenced by irrelevant factors such as food and drink intake. In future work, we therefore consider using eye movements, facial expressions, voice intonation, physical motions, or nerve center activity to quantify anxiety more comprehensively. In addition, long-term effects are not reflected in our short-term, sequential experiments; a long-term study of consecutive interviews may reveal additional implications for designing VRIS. Moreover, we can further investigate the relationships between the dependent variables to determine whether a higher anxiety level necessarily leads to a worse experience or interview performance.

CONCLUSION
We developed and evaluated a virtual reality interview simulator (VRIS) to investigate the possible causes of anxiety in job interviews. Using an orthogonal experimental design with eight job interview conditions and 19 college students, we assessed the significance of five possible anxiety-inducing factors. Our study sheds light on the fundamental factors that create anxiety and influence experience and performance during an interview. The results confirm the significance of specific variables and emphasize the need to consider question types in VRIS. We also established that VR interviews are effective, relative to real-person interviews, in inducing anxiety and shaping the overall experience; VRIS could therefore be a promising tool for interview training and practice.

Interview questions and questionnaires
Our supplementary materials include eight sets of interview questions (i.e., four sets of personal questions and four sets of professional questions), a "Self-assessment Questionnaire", and a "Performance Rating Scale" (see footnote 5).

Figure 2: Four levels of realism from the least to the most immersive (from left to right): video-conferencing (PC), cartoon VR (VR1), realistic VR (VR2), and onsite meeting (REAL). Two types of attitude (from top to bottom): positive and negative.

Figure 4: Flow chart of the experiment. (1) Repeated twice a day (once in the morning and once in the evening, 12 hours apart). Each participant completed the interviews in all 8 experimental conditions within 4 days. (2) One of the experimental setups is chosen for each interview.

Table 1: Orthogonal design with multiple factors and mixed levels.

Table 7: Post-hoc analysis of the factors affecting the interviewee's discomfort and eye contact using the t-test (MD = mean difference, df = 133).