A new performance-based measure of personality functioning impairment: development and preliminary evaluation of reliability and validity

Personality functioning impairment is at the center of many dimensional models of personality. Available measures of personality functioning impairment are limited to self-report, clinician-/informant-rated, and interview methods. Although researchers have begun investigating established performance-based instruments’ potential for assessing personality functioning impairment, administration and scoring of these instruments is complex and the latent variables they measure diverge from personality functioning impairment as described in the ICD-11 and the Alternative Model for Personality Disorders (AMPD) of the DSM. We address this absence by developing and psychometrically evaluating the Level of Personality Functioning Scale-Questionnaire-based Implicit Association Test (LPFS-qIAT). The LPFS-qIAT’s psychometric properties were evaluated across four studies, producing initial evidence supporting the new instrument’s reliability as well as its convergent, discriminant, and criterion-related validity. As the first performance-based measure of personality functioning impairment that aligns with the AMPD and, to a degree, the ICD-11, and that is easily administered, scored, and interpreted, the LPFS-qIAT shows potential to become a valuable tool in both research and clinical practice.
Supplementary Information: The online version contains supplementary material available at 10.1007/s44192-024-00059-4.

(SCID-AMPD Module 1) [36], the Clinical Assessment of the Level of Personality Functioning Scale (CALF) [37], and the PDS-ICD-11 Clinician-Rating Form [38]. Designed as (semi-)structured interviews, these instruments range in duration from ~30 min to upwards of 45-90 min. Multiple studies have demonstrated their criterion-related and predictive validity, while internal consistency is regularly good and interrater reliability has ranged from 0.59 to 0.96 across the various instruments and their subscales.
As shown by this brief summary, there are myriad self-report, interview, and clinician-/informant-rated measures of personality functioning impairment. There has yet to be a valid performance-based instrument specifically developed to measure personality functioning impairment as defined in the AMPD or ICD-11. Although researchers have demonstrated the potential for using the Rorschach [39,40] and Thematic Apperception Test [41,42] to assess personality functioning impairment, reliable and valid administration and scoring of these tests is time-intensive, requires extensive training, and is often incompatible with the type of remote administration required for online studies. Moreover, the constructs (currently) assessed via the Rorschach and Thematic Apperception Test conceptually diverge from personality functioning impairment as operationalized in the AMPD and ICD-11. There is a discernible absence of a performance-based measure of personality functioning impairment that is psychometrically robust, aligns with the AMPD and ICD-11, and can be administered and scored just as quickly, easily, and remotely as self-report tests. The lack of such an instrument is deleterious for both research and clinical practice because overreliance on a single method of measurement can lead to flawed empirical and clinical conclusions based on biased or incomplete representations of personality (see [43][44][45][46] for discussions). There is a clear need for a psychometrically sound performance-based measure of personality functioning impairment that aligns with the AMPD and ICD-11 and that can be easily, quickly, and remotely administered and scored.

Performance-based measurement methodology and the questionnaire-based IAT
Performance-based measures are a broad class of instruments wherein outcomes are derived from an individual's unrehearsed performance on a (semi-)structured task designed to tap behavior and patterns of responding that reflect, or are influenced by, the latent construct being measured. Information-processing measures are a subtype of the performance-based class. These tests require individuals to complete a reaction-time task by which latent constructs are assessed, with the understanding that attributes of the latent construct influence the individual's task performance in measurable ways via cognitive and other psychological processes largely occurring outside their conscious awareness. The Implicit Association Test (IAT) [46] is one such instrument. The IAT and its variants measure the strengths of implicit (relatively automatic) associations between concepts using a double-categorization reaction-time task (see [46] for further explanation). Intraindividual differences in performance speeds (latencies) between critical test blocks are the basis of IAT scores, with scores representing the strength of automatic associations between the measured concepts in the individual's mind. With regard to measuring personality functioning impairment, IAT scores would be interpreted as the degree to which the individual associates their concept of themselves with the latent variable of personality dysfunction.
Findings from studies of the IAT's psychometric properties are fairly variable. A meta-analysis [47] produced an average internal consistency of α = 0.80 for the original IAT method, with average alphas of IAT variants ranging from 0.60 to 0.79. Test-retest reliabilities for the IAT averaged r = 0.50 across 58 studies, a result Greenwald and Lai [47] partly attributed to reliability reductions caused by operational requirements of certain research situations, such as those seen with online data collection. Regarding construct validity, meta-analytic studies have found aggregate correlations between the IAT and self-reports measuring analogous constructs to range from r = 0.116 to r = 0.361 and aggregate correlations between IAT measures and external criteria to range from r = 0.097 to r = 0.274 [48][49][50][51][52]. These effect sizes are comparable to those commonly observed between well-validated performance-based measures (e.g., Rorschach) and scores obtained via self-report or informant-rating scales (correlations typically in the r = 0.20-0.30 range) [53][54][55][56]. Taken together, psychometric benchmarks for a new performance-based measure of personality functioning impairment based on the IAT methodology can be judiciously set at (a) internal reliability coefficients of approximately α = 0.80 or better, (b) test-retest reliability better than r = 0.50, and (c) correlation coefficients representing convergence with validated self-report and informant-rating tests within the r = 0.20-0.30 range or stronger, in addition to (d) evidence of discriminant validity.

Research
Discover Mental Health (2024) 4:6 | https://doi.org/10.1007/s44192-024-00059-4

Although a personality functioning impairment IAT might meet the psychometric benchmarks enumerated above, many of the IAT's psychometric shortcomings are worsened when it is used to measure self-concepts due to various limitations (e.g., problematic response stimuli, vague and not easily translatable interpretations of scores, and restrictions on the complexity of the constructs that can be measured) [57]. The Questionnaire-based IAT (qIAT) [58,59] was developed to address these specific barriers. The qIAT procedure is consistent with that of the original IAT but remedies the weaknesses listed above in several ways. Most notably, the qIAT uses items of validated self-report tests as stimuli and creates pairing combinations that parallel self-report procedures (e.g., true/false). Rather than single words or images, qIAT stimuli are statements resembling those found on self-report tests (e.g., "I often think very negatively about myself", "My emotions are usually well regulated and stable"). Respondents are instructed to classify these statements as quickly and accurately as possible into relevant target categories (e.g., "impaired" versus "not impaired", respectively) while concurrently classifying objective self-statements (e.g., "I am doing a psychology experiment", "I am playing football on the grass") into the categories of "true" or "false". Despite unmistakable similarities with self-report measures, scores derived from the qIAT reflect an individual's automatic association between themselves and the concept being measured, whereas self-report scores reflect the degree to which an individual thoughtfully attributes characteristics of a given trait, feeling, thought pattern, motive, behavior, level of functioning, or experience to themselves. That is, self-report scores quantify how the person views and (chooses to) present themselves, whereas qIAT scores would quantify how strongly the latent construct is connected to the individual's implicit self-conceptualization.
Like self-report tests, the qIAT is particularly well-suited for both research and clinical practice due to its brief administration (~5-10 min), simple administration and automatic scoring procedures, direct interpretability of scores (an important feature for clinical utility), and use of validated and standardized stimuli. It is for these reasons that the collection of studies reported herein aimed to develop and psychometrically evaluate a novel, performance-based measure of personality functioning impairment using the qIAT methodology.

Current study
The overarching goal of the reported series of studies was to develop the Level of Personality Functioning Scale-qIAT (LPFS-qIAT) and conduct an initial psychometric evaluation of this new instrument. Following development of the LPFS-qIAT (see Sect. 8 below for details), four studies were conducted to examine the LPFS-qIAT's internal and retest reliability and the convergent, discriminant, and criterion-related validity of this new instrument.

Participants
All results reported below were obtained using participants recruited through four separate online studies conducted over the course of approximately two years at a medium-sized public university in the Southern United States. Participation was restricted to students who were at least 18 years of age and English-speaking. All participants were required to complete the online studies on a device with a keyboard, which was necessary for administration of the IAT and qIAT measures. Participants were notified at the outset of their participation if they attempted a study on an incompatible device (e.g., touchscreen phone) and were asked to switch to a compatible device, at which point they were permitted to begin the study. No other restrictions were placed upon participation.
Similar procedures were used in each of the four studies: Following informed consent, participants were asked to complete a series of self-report and performance-based measures in a randomized order before completing an extended demographic questionnaire. For those studies including informant ratings and a second participation session (Samples 3 and 4), participants were asked to provide their own email address and the email addresses of two (Sample 3) or three (Sample 4) individuals who know them reasonably well. Informants were automatically emailed a study invitation immediately following a participant's provision of their email address. Participants were automatically emailed a Time 2 study invitation exactly one (Sample 3) or two (Sample 4) weeks following their survey completion at Time 1. Each study received approval from the Institutional Review Board at the authors' university and all participants received course credit for participating. The following describes the conventional data processing procedures that were followed for each study; descriptive statistics of each sample are reported in Table 1. Results of group comparisons between those who completed the full study and those who dropped out or were removed due to excessively fast responding on the IAT or qIAT can be found in the Supplementary Material of this paper.

Sample 1.
A total of 1043 participants completed the first study and endorsed taking part in the study seriously and honestly. Three participants did not complete the LPFS-qIAT and 234 participants were removed due to excessively fast responses (see "Data processing" section below), a drop rate within ranges commonly seen for anonymous online data collections [60]. Thus, the final sample included 806 participants.

Level of personality functioning scale-qIAT (LPFS-qIAT; introduced here)
The qIAT can be viewed as an amalgamation of self-report and IAT procedures that enables indirect measurement of a latent factor. Development of the LPFS-qIAT made use of the self-related logical categories presented by Friedman et al. [59] and items from the LPFS-BF 2.0 [25]. The qIAT requires opposing item sets; LPFS-BF 2.0 items reflect impaired personality functioning and a complementary set of items was created by reversing the meaning of the original LPFS-BF 2.0 items. The full list of LPFS-qIAT items is presented in Table 2. The LPFS-qIAT was built in R using the iatgen tool [60] with code modifications made to accommodate the qIAT method. The qIAT involves a procedure comparable to the IAT with minor differences. Specifically, participants complete multiple, brief double-categorization reaction-time tasks wherein a single statement (i.e., test item) is presented in the center of a computer screen along with category labels at the top right and top left corners of the screen. Participants are instructed to classify each statement using the presented category labels as quickly and as accurately as possible by pressing either the "E" or "I" key on their keyboard. The full procedure (see Table 3) consists of seven blocks, including two learning blocks (1, 2), three practice blocks (3, 5, 6), and two critical test blocks (4, 7). There are multiple strategies for handling incorrect responses (erroneous classifications), and the present project made use of two well-established methods for the purpose of generalizability: Sample 1 made use of a 600 ms error message with no response correction required and a forced error correction method was used with Samples 2, 3, and 4 [61]. A copy of the LPFS-qIAT for administration via Qualtrics and scoring code are available at https://osf.io/8bfka/?view_only=c1d5214c613547c1a07a0be548109d36. Details on data processing, scoring, and psychometric properties of the LPFS-qIAT are reported in the Results section below.

Table 2 (excerpt) LPFS-qIAT items
- I often understand my own thoughts and feelings
- I often make unrealistic demands of myself
- I often make realistic demands of myself
- I often have difficulty understanding the thoughts and feelings of others
- I often understand the thoughts and feelings of others
- I can't stand it when others have a different opinion than me
- I can appreciate others' opinions, even when we disagree
- I often don't understand the effect my behavior has on others
- I often understand the effect my behavior has on others
- My relationships and friendships usually don't last long
- My relationships and friendships usually last long
- I often feel uncomfortable when relationships become more intimate
- I often feel comfortable when relationships become more intimate
- I often have trouble cooperating with others
- I often cooperate with others well

Level of personality functioning scale-brief form 2.0 (LPFS-BF 2.0)
The LPFS-BF 2.0 [25] is a 12-item self- and informant-rating measure of self-functioning and interpersonal functioning, producing two corresponding scale scores and a total personality functioning impairment score. This measure uses a 4-point Likert scale, ranging from 1 (completely untrue) to 4 (completely true). Internal consistencies were reported to be satisfactory for the total scale, as well as the self and interpersonal functioning scales, and meaningful associations were reported between the LPFS-BF 2.0 and other measures of severity of PDs [25]. The LPFS-BF 2.0's total personality functioning impairment scale demonstrated acceptable internal consistency in the present studies for each sample, with reliability coefficients ranging from α = 0.88 and ω = 0.88 (Sample 1) to α = 0.90 and ω = 0.90 (Sample 4). Reliability coefficients of the informant-rating version were α = 0.88 and ω = 0.88 in both Sample 3 and Sample 4. It should be noted that the LPFS-BF 2.0 was replaced with the Level of Personality Functioning Scale-Self-Report (LPFS-SR) [26] for Samples 2 and 3. Study time constraints prevented administration of multiple self-report tests of personality functioning impairment and the authors opined that use of a criterion measure that does not share items with the LPFS-qIAT would make for a more robust scrutiny of the LPFS-qIAT's validity. With respect to Sample 3 specifically, this replacement also allowed for a direct comparison of test-retest reliability results to previous external retest reliability findings (see [33] for a study on the retest reliability of the LPFS-SR). Due to the shorter length of the LPFS-BF 2.0, it replaced the LPFS-SR in the final study (Sample 4) to permit administration of a large assortment of measures used to examine the criterion and discriminant validity of the LPFS-qIAT.

Level of personality functioning scale-self report (LPFS-SR)
The LPFS-SR [26] is an 80-item self-report questionnaire designed to assess the severity of personality dysfunction by capturing the aspects of the LPFS in the AMPD.This measure uses a 4-point scale, ranging from totally false to very true.
The LPFS-SR was reported to have good internal consistency and positive associations with other measures of similar constructs [26], as well as strong test-retest reliability over multiple weeks (rs = 0.81-0.91) [34]. The LPFS-SR total scale demonstrated acceptable internal consistency in each of the three studies including the LPFS-SR, with coefficients ranging from α = 0.94 and ω = 0.95 (Sample 3) to α = 0.95 and ω = 0.96 (Sample 2). Due to an already lengthy battery of tests being administered to Sample 4, the LPFS-SR was not included in the final study to avoid exhausting study participants.

DSM-5-TR self-rated level 1 cross-cutting symptom measure-adult (DSM-5-TR CC)
The DSM-5-TR CC [15] was developed to help clinicians assess all major areas of psychiatric functioning (e.g., mood, psychosis, cognition, personality, sleep) and identify additional areas for inquiry by revealing possible disorders, atypical presentations, subsyndromal conditions, and coexistent pathologies. The DSM-5-TR CC is endorsed by the American Psychiatric Association [15] as a necessary first step in identifying and addressing the heterogeneity of symptoms across

Personality inventory for DSM-5-brief form plus modified (PID5BF + M)
The PID5BF + M [62] is a 34-item self-report measure designed to assess the pathological trait domains described in Criterion B of the DSM-5's AMPD and the chapter on personality disorders and related traits in the ICD-11: Negative Affectivity, Detachment, Disinhibition, Antagonism/Dissociality, Anankastia, and Psychoticism. The PID5BF + M is a modified version of the PID5BF + [63], which was derived from the Personality Inventory for DSM-5 [64] via ant colony optimization and demonstrated acceptable model fit, good reliability, and criterion-related validity across multiple samples [63]. The PID5BF + M was created to better capture the ICD-11 domain of anankastia and its three facets. The PID5BF + M has been shown to be a psychometrically sound instrument for assessing the six combined DSM-5 and ICD-11 personality trait domains [62]. Internal consistency reliability for the six trait domain scales used in the present studies ranged from α = 0.76 and ω = 0.76 (Detachment, Sample 3) to α = 0.88 and ω = 0.88 (Anankastia, Samples 3 and 4).

World Health Organization's Quality of Life Brief Scale (WHOQOL-BREF)
The WHOQOL-BREF [66] is a 26-item self-report measure of four quality of life domains: Physical Health, Psychological, Social Relationships, and Environment. Derived from the WHOQOL-100, the WHOQOL-BREF was reported to correlate with WHOQOL-100 domain scores at approximately r = 0.90. Discriminant validity, content validity, and test-retest reliability were reported to be good for the WHOQOL-BREF [66]. The present study demonstrated acceptable internal consistency of domain scales, with reliability coefficients ranging from α = 0.67 and ω = 0.69 (Social Relationships) to α = 0.85 and ω = 0.85 (Psychological).

Inventory of depression and anxiety symptoms-second version (IDAS-II)
The IDAS-II [67] is a 99-item self-report questionnaire composed of 19 scales designed to assess a broad range of depression, anxiety, and bipolar symptoms. This measure uses a 5-point scale ranging from not at all to extremely. The IDAS-II scales have shown good convergent, discriminant, and criterion validity [67]. The present study demonstrated acceptable internal consistency with domain scale reliability coefficients of α = 0.80-0.91 and ω = 0.81-0.92.

Personality implicit association test-extraversion (extraversion IAT)
Using the methodology developed by Carpenter et al. [60], the extraversion IAT (see [68]) was administered via participants' personal computers through an online data-collection platform (Qualtrics). Stimuli from the categories of self (me, my, own, I, self) and others (they, your, them, you, others) and items from the categories of extraversion (sociable, talkative, active, impulsive, outgoing) and introversion (shy, withdrawn, passive, deliberate, reserved) were presented [68].
Testing consisted of seven blocks and took approximately five minutes to complete. Relative to IATs measuring other self-concept constructs (e.g., racial or political biases), personality trait IATs have routinely demonstrated better reliability and construct validity; extraversion and neuroticism IATs appear to be the most robust [68]. Data processing procedures used for the Extraversion IAT are reported in the Results section.

Social desirability-gamma short scale (KSE-G)
The KSE-G [69]  Erwünschtheit-Gamma) [70], the English-language adaptation has demonstrated reliability and validity coefficients comparable to those of the original German version, as well as metric measurement invariance [69]. Convergent and discriminant validity of the KSE-G supports the instrument's construct validity, and estimates of internal consistency (α) have ranged between 0.65 and 0.72 for PQ + and between 0.64 and 0.79 for NQ− [69]. The present study obtained approximately comparable levels of internal consistency for the PQ + (α = 0.58, ω = 0.58) and the NQ− (α = 0.65, ω = 0.65).

Indecisiveness Scale (IS)
The IS [71] is a 22-item self-report measure of general indecisiveness, which pertains to difficulties in making decisions regardless of whether they are of great or little significance. The IS is composed of 11 features, each of which includes one positively formulated and one negatively formulated item. Psychometric study of the IS has found support for a single-factor structure, acceptable levels of test-retest reliability, and predictive validity in terms of various decision-making situations [71]. The present study demonstrated acceptable internal consistency (α = 0.92, ω = 0.92).

Statistical analyses
Following development of the LPFS-qIAT, four studies were conducted to examine the LPFS-qIAT's internal and retest reliability and the convergent, discriminant, and criterion-related validity of this new instrument. The LPFS-qIAT and Extraversion IAT data for all samples were processed using a D-score data cleaning and scoring algorithm (see [60]). First, participants with more than 10 percent of responses faster than 300 ms (i.e., excessively fast responders) were removed. Then, in Sample 1, participant response errors were addressed by replacing participants' response errors with their block means of correct trials plus 600 ms (i.e., the D600 procedure). The built-in error penalty method, wherein participants are required to self-correct an error before proceeding, is a preferred method [61,72] and was used in Samples 2, 3, and 4. Different methods were used across studies for the purpose of generalizability. Finally, LPFS-qIAT and Extraversion IAT responses were D-scored, with positive scores indicating greater levels of personality functioning impairment or extraversion. Internal consistency reliability of the LPFS-qIAT was assessed using a variant of Cronbach's alpha [73], and Pearson correlations using a bootstrapping procedure with 5000 replicates were used to evaluate the instrument's test-retest reliability. Correlation coefficients and associated 95% confidence intervals based on 5000 bootstrap replicates were also calculated to evaluate the convergent, discriminant, and criterion-related validity of the LPFS-qIAT. In all cases, the statistical significance of an effect size was determined by examining the 95% confidence interval computed around the estimated effect size; an effect was deemed significant at p < 0.05 when a value of zero (null) did not fall between the lower and upper bounds of the interval.
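The confidence-interval decision rule described above can be illustrated with a short sketch. It assumes a percentile bootstrap that resamples participants with replacement; the text does not state which bootstrap interval was used, so that choice, the function name `bootstrap_pearson_ci`, and the fixed seed are illustrative assumptions rather than the authors' procedure.

```python
import numpy as np


def bootstrap_pearson_ci(x, y, n_boot=5000, level=0.95, seed=0):
    """Pearson r with a percentile bootstrap CI (assumed interval type).

    Returns (r, lower, upper, significant), where `significant` is True
    when zero lies outside the interval.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape:
        raise ValueError("x and y must be 1-D arrays of equal length")

    rng = np.random.default_rng(seed)
    n = x.size
    r_obs = np.corrcoef(x, y)[0, 1]

    # Resample participants with replacement, keeping each x-y pair intact.
    idx = rng.integers(0, n, size=(n_boot, n))
    xb = x[idx] - x[idx].mean(axis=1, keepdims=True)
    yb = y[idx] - y[idx].mean(axis=1, keepdims=True)
    r_boot = (xb * yb).sum(axis=1) / np.sqrt(
        (xb ** 2).sum(axis=1) * (yb ** 2).sum(axis=1)
    )

    tail = (1.0 - level) / 2.0
    lower, upper = np.percentile(r_boot, [100 * tail, 100 * (1 - tail)])
    significant = not (lower <= 0.0 <= upper)
    return r_obs, lower, upper, significant
```

Calling `bootstrap_pearson_ci(qiat_scores, self_report_scores)` on two equal-length score vectors (hypothetical variable names) would return the observed correlation, the interval bounds, and the zero-exclusion decision.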

Data processing
For all samples, the LPFS-qIAT and Extraversion IAT data were processed using a D-score data cleaning and scoring algorithm (see [60]). The percent of dropped trials (trials > 10 s) was low, ranging from 0.0043% (Time 2 Sample 4) to 0.0057% (Time 1 Sample 4). The percentage of participants removed due to excessively fast responding, which was expected to be elevated in our student convenience samples, ranged from 22.7% (Sample 1) to 29.8% (Sample 2) and was within the ranges commonly seen for anonymous online data collections [60]. As further detailed in the Supplementary Material, removed participants (i.e., excessively fast responders) were significantly younger and self-reported higher levels of psychopathology than retained participants in our two large samples (Samples 1 and 2). This pattern was not observed in Samples 3 or 4, nor were removed participants in Samples 3 and 4 found to differ from retained participants with respect to informant ratings. Participants in Samples 3 and 4 who completed Time 2 did not significantly differ from those who dropped out on any measure of personality functioning impairment, but drop-out participants did endorse significantly higher levels of disinhibition and psychoticism at Time 1.
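The cleaning and scoring steps named in this section (dropping trials longer than 10 s, flagging participants with more than 10 percent of responses under 300 ms, and the 600 ms error penalty) can be sketched as follows. This is a simplified illustration for a single pair of blocks, not the authors' scoring code: the function names are hypothetical, and dividing by the standard deviation of the penalty-adjusted latencies pooled over both blocks is an assumption, since published D-score variants differ in such details.

```python
import numpy as np

MAX_RT_MS = 10_000        # trials slower than 10 s are dropped
FAST_RT_MS = 300          # latency threshold for "excessively fast"
FAST_SHARE_LIMIT = 0.10   # more than 10% fast trials -> remove participant
ERROR_PENALTY_MS = 600    # D600-style penalty added to the block mean


def is_fast_responder(rt_ms):
    """True when more than 10% of a participant's latencies are under 300 ms."""
    rt = np.asarray(rt_ms, dtype=float)
    return bool(np.mean(rt < FAST_RT_MS) > FAST_SHARE_LIMIT)


def penalized_latencies(rt_ms, correct):
    """Drop >10 s trials; replace error latencies with correct-trial mean + 600 ms."""
    rt = np.asarray(rt_ms, dtype=float)
    ok = np.asarray(correct, dtype=bool)
    keep = rt <= MAX_RT_MS
    rt, ok = rt[keep], ok[keep]
    out = rt.copy()
    out[~ok] = rt[ok].mean() + ERROR_PENALTY_MS
    return out


def d_score(rt_a, correct_a, rt_b, correct_b):
    """(mean of block B - mean of block A) / SD of all adjusted trials.

    Assumption for illustration: block A pairs "impaired" with "true", so a
    positive score (faster in A) indicates a stronger self-impairment link.
    """
    a = penalized_latencies(rt_a, correct_a)
    b = penalized_latencies(rt_b, correct_b)
    pooled_sd = np.std(np.concatenate([a, b]), ddof=1)
    return (b.mean() - a.mean()) / pooled_sd
```

With forced error correction, as used in Samples 2, 3, and 4, the recorded latency already includes the time taken to correct the response, so the replacement step in `penalized_latencies` would not apply.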
Error rates on the LPFS-qIAT ranged from 11.4% (Sample 2) to 14.3% (Time 1 Sample 3) and were handled using two different methods across studies. For Sample 1, participants' response errors were replaced with their block means of correct trials plus 600 ms (i.e., the D600 procedure). The built-in error penalty method, wherein participants are required

Table 4 Correlation coefficients depicting the convergent validity of the LPFS-qIAT with other measures of personality functioning impairment and the LPFS-qIAT's criterion-related validity
Bold indicates statistical significance based on 95% confidence intervals from 5000 bootstrap replicates. Sample 1 personality trait scores were obtained using the PID5BF + [62], whereas trait scores for all other samples were obtained using the PID5BF + M [61].
a 412 informant ratings of 285 participants

Discussion
Available measures of personality functioning impairment as operationalized in the AMPD and ICD-11 exclusively use self-report, clinician-/informant-rated, or interview methods and there is a problematic absence of performance-based instruments capable of measuring personality functioning impairment. The LPFS-qIAT takes a step toward filling this gap as the first performance-based measure of personality functioning impairment that directly parallels the AMPD and, to a degree, the ICD-11. The LPFS-qIAT is brief, with completion taking 5-10 min, is easily administered in person or online, and can be automatically scored in a manner allowing simple interpretation. This initial psychometric evaluation of the LPFS-qIAT produced encouraging evidence across four separate studies.
The LPFS-qIAT's good internal reliability was repeatedly observed, demonstrating internal consistency for this specific instrument and allowing confidence in its online administration. The latter is a notable strength of the LPFS-qIAT compared to most other performance-based measures of personality. Although poor by self-report standards, one- and two-week test-retest reliability coefficients approached those often seen for IAT measures (average r = 0.50) [47], but they are at levels insufficient to justify interpreting a single individual's score. Convergent validity of the LPFS-qIAT was established using three different self-report tests of personality functioning impairment and one informant-rated measure. The correlation coefficients obtained across the four studies were comparable in magnitude to those normally observed between performance-based (e.g., Rorschach, TAT) and self-report or informant-rating measures of analogous constructs (typically in the r = 0.20-0.30 range) [53][54][55][56]. Evidence of the LPFS-qIAT's criterion-related validity was demonstrated by its positive correlation with maladaptive personality traits, particularly Negative Affectivity, and inverse relationships with measures of psychological health and quality of life. Finally, discriminant validity of the LPFS-qIAT was exhibited through small, non-significant correlations with self-report measures of depression, anxiety, indecisiveness, socially desirable responding, and age as well as a negligible and non-significant association with an IAT measure of extraversion. Taken together, the LPFS-qIAT can be cautiously viewed as an internally reliable and valid measure of personality functioning impairment based on this initial but rather extensive psychometric examination. The LPFS-qIAT's test-retest reliability emerged as a notable limitation that must be addressed in future studies.

Limitations, constraints on generalizability, and future directions
Although encouraging, the present studies are not without their limitations. First, each of the reported studies was conducted using diverse college student convenience samples, limiting the generalizability of findings. The present samples are more diverse in race and ethnicity than is common for college student samples, but the current findings might not generalize to a broader population. Future studies should build upon the somewhat promising findings reported herein through generalizability studies using more diverse samples, including both community and clinical populations. Similarly, differences between retained/continued and removed/dropout participants may also limit the generalizability of results and should be investigated in future studies. However, it is necessary to first address the limitations of the LPFS-qIAT's retest reliability. The poor retest reliability results were obtained via online studies using small convenience samples. Future research may wish to first investigate the LPFS-qIAT's test-retest reliability in larger, more reliable samples. Should reliability coefficients not meaningfully improve, researchers should explore tactics for increasing retest reliability to a level that would allow confidence. One prospective solution to this issue involves averaging results from a repeated administration of the LPFS-qIAT at a single timepoint, a strategy often successfully used in research and clinical practice for other measures (e.g., blood pressure) [74]. Encouragingly, researchers using this strategy recently produced an IAT test-retest reliability of r = 0.89 over a period of 2 years [75]. The current study also did not specifically evaluate the LPFS-qIAT's incremental validity over other methods, such as self-report measures, when predicting meaningful outcomes (e.g., actual problems in relationships or at work). Investigating this instrument's incremental validity in future studies is important. When doing so, researchers must be mindful of undesired method effects and not measure outcomes using the same method against which the LPFS-qIAT is being compared (e.g., outcomes measured by self-reports should not be used when the LPFS-qIAT's incremental validity over self-report measures of personality functioning is being tested). Lastly, the LPFS-qIAT possibly fails to account for all components of the ICD-11's conceptualization of personality functioning, which is represented by self- and interpersonal dysfunction in addition to emotional, cognitive, and behavioral manifestations of the personality dysfunction [13]. Future studies will need to investigate the degree to which the LPFS-qIAT can capture all components of personality functioning impairment enumerated in the ICD-11. That is, future studies should test the LPFS-qIAT's construct validity using ICD-11-compliant criterion measures. Should the LPFS-qIAT demonstrate poor construct validity in those studies, researchers might choose to apply the methodology described in this paper to a self-report measure that directly aligns with the ICD-11, such as the PDS-ICD-11 [31], to develop a new performance-based instrument that more closely matches the ICD-11's conceptualization of personality functioning. Despite these limitations and need for further studies, development (and further validation) of a novel, easily administered and automatically scored performance-based measure of personality functioning impairment has several implications.
Professional practice guidelines advocate for the use of multimethod assessment [45,76,77]. The availability of a performance-based measure of personality functioning impairment that is psychometrically promising, aligns with the AMPD (and partly with the ICD-11), and is easily administered, scored, and interpreted can prove valuable. Multimethod research designs minimize unwanted measurement effects, and examining the convergences and divergences of test scores derived from multiple methods for measuring analogous constructs (intertest score relationships) can facilitate a more comprehensive and precise measurement of study variables. When similar strategies are applied in clinical practice, clinicians can gain a greater understanding of the relationship between a client's explicit and implicit personality dysfunction. Hence, the LPFS-qIAT, as the first performance-based measure of personality functioning aligning with contemporary models, holds potential to become a useful tool in both research and clinical practice by facilitating multimethod assessment. However, researchers must first address the LPFS-qIAT's poor retest reliability, investigate the generalizability of the present findings, and evaluate this instrument's incremental validity over other methods when predicting meaningful outcomes before this statement can be made more definitively.
The LPFS-qIAT might also benefit research and clinical practice as a novel method for capturing changes in personality dysfunction. The LPFS-qIAT was derived from the LPFS-BF 2.0, and the latter has shown a high sensitivity to change in response to therapy [25]. Should future studies find that the LPFS-qIAT can also detect therapy-related changes in personality functioning, the LPFS-qIAT could be a useful tool for therapists and in therapy outcome research, perhaps especially when paired with the LPFS-BF 2.0. However, scores derived from performance-based measures such as the LPFS-qIAT are largely based on unrehearsed performance, which could interfere with using the instrument in this way due to the potential for practice effects [78]. The effect of repeated measurement using the LPFS-qIAT over the course of treatment, the potential measurement error due to practice effects, and the instrument's validity and utility for therapy outcome research could be important questions to explore once the other psychometric needs noted above have been met (e.g., improved retest reliability). Nevertheless, the findings of the reported studies encourage future research and continued efforts to improve and validate the LPFS-qIAT as a robust, performance-based measure of personality functioning.
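To illustrate why averaging repeated administrations at a single timepoint might raise retest reliability, the following minimal sketch applies the Spearman-Brown prophecy formula, which projects the reliability of the mean of k parallel administrations from the reliability of a single one. It is not an analysis performed in the reported studies. The function name `spearman_brown` and the single-administration value of 0.40 are illustrative assumptions, and the formula assumes the administrations are parallel (equally reliable, with uncorrelated errors), which practice effects may violate.

```python
def spearman_brown(r_single: float, k: int) -> float:
    """Projected reliability of the mean of k parallel administrations."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return k * r_single / (1 + (k - 1) * r_single)


# Hypothetical single-administration retest reliability.
# This is NOT a coefficient reported for the LPFS-qIAT.
r_single = 0.40

for k in (1, 2, 4):
    print(f"k = {k}: projected reliability = {spearman_brown(r_single, k):.2f}")
# k = 1: projected reliability = 0.40
# k = 2: projected reliability = 0.57
# k = 4: projected reliability = 0.73
```

Under these assumptions the projected gain shrinks with each added administration, so any such design would need to weigh the reliability benefit against participant burden and possible practice effects.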

Sample 2. A total of 708 students participated in the second study and endorsed taking part in the study seriously and honestly. The final sample included 497 participants after removing 211 participants due to excessively fast responding on the qIAT measure.
Sample 3. A total of 287 participants completed Time 1 of the third study and 156 (54%) completed Time 2 and endorsed taking part in the studies seriously and honestly. An additional 78 participants from Time 1 and 41 participants from Time 2 were removed due to excessively fast responding. This resulted in final samples of 209 participants for Time 1 and 115 participants for Time 2. Usable data for both Time 1 and Time 2 were provided by 104 participants.
Sample 4. A total of 320 participants completed Time 1 of the fourth study and 78 (24%) completed Time 2 and endorsed taking part in the studies seriously and honestly. Eighty-four participants from Time 1 and 22 participants from Time 2 were removed due to excessively fast responding. This resulted in final samples of 236 participants for Time 1 and 56 participants for Time 2. Forty-eight participants provided usable data for both Time 1 and Time 2.
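The participant flow reported for Samples 2 to 4 can be checked with simple arithmetic. The short sketch below recomputes each final sample size and the Time 2 completion rates from the counts given in the text; the variable names are illustrative and not part of the original materials.

```python
# (participants before exclusion, removed for fast responding, reported final n)
flow = {
    "Sample 2": (708, 211, 497),
    "Sample 3, Time 1": (287, 78, 209),
    "Sample 3, Time 2": (156, 41, 115),
    "Sample 4, Time 1": (320, 84, 236),
    "Sample 4, Time 2": (78, 22, 56),
}

for label, (started, removed, reported_final) in flow.items():
    final_n = started - removed
    status = "matches" if final_n == reported_final else "differs from"
    print(f"{label}: {started} - {removed} = {final_n} ({status} reported n)")

# Share of Time 1 completers who also completed Time 2, before exclusions
completion = {
    "Sample 3": round(100 * 156 / 287),
    "Sample 4": round(100 * 78 / 320),
}
print(completion)  # {'Sample 3': 54, 'Sample 4': 24}
```

All recomputed values agree with the reported figures, including the 54% and 24% Time 2 completion rates.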