Key Points for Decision Makers

As preference elicitation methods are increasingly used in health services research, including HIV prevention, quality standards and criteria are needed to ensure use of accurate terminologies and appropriate methods and analytic tools.

To grow as a field, we need to build on each other’s work, including learning from already existing literature, synthesizing data instead of generating data for data’s sake, utilizing and implementing findings, and re-using existing instruments as appropriate rather than creating new tools.

Trends in the use of particular preference elicitation methods, and the focus on particular outcomes, health topics (e.g., HIV prevention technologies), and populations/geographies, should be critically examined in light of their popularity and funding sources.

1 Introduction

In the first decades of the human immunodeficiency virus (HIV) epidemic, biomedical options for prevention were scant, with condoms the only option for prevention of sexual transmission [1]. Other prevention options included post-exposure prophylaxis (PEP) for occupational exposure, promotion of testing, nevirapine for prevention of mother-to-child transmission (PMTCT), syringe exchange, and medication-assisted treatment (MAT) to reduce the risks of injection drug use and thus HIV risk [1, 2]. This menu of choices has expanded recently to include voluntary medical male circumcision (VMMC); pre-exposure prophylaxis (PrEP) pills, now in two formulations; and treatment as prevention [3,4,5,6]. Blood-based rapid HIV tests and oral fluid-based rapid HIV self-tests for use at home seek to expand the number of people aware of their HIV status, and thus constitute a key first step on the HIV prevention and treatment cascades [7, 8].

Several HIV prevention technologies are in development or testing. These include vaccines, topical microbicides, and multiple formulations or modalities of PrEP, such as injectables, vaginal rings, implants, and diaphragms [9, 10]. A trial of bimonthly long-acting injectable cabotegravir was recently stopped early, owing to its proven effectiveness compared to oral daily PrEP pills [11]. The future of HIV prevention may look more like the contraceptive marketplace, with a menu of options that individuals can choose from according to their risks, needs, and preferences [12].

Quantitative preference elicitation methods measure and provide insight into client preferences. These methods may then be used to optimize product design while still in development (i.e., gel consistency; oral swab vs finger prick), or inform implementation strategies or service design (dispensing location, cost, healthcare provider characteristics) for products already or soon to be on the market. These methods have been increasingly applied to health issues, including HIV/acquired immunodeficiency syndrome prevention and treatment [13] and health system priorities [14]. With such patient-centered data, products and services can potentially be adapted to better suit target audiences. For example, the HIV prevention needs of a seasonal female sex worker in Nairobi and of a gay man in Atlanta are disparate, even if both opt for PrEP. Human immunodeficiency virus prevention options that better meet different audiences’ needs may potentially improve uptake, adherence, and persistence (long-term use), ultimately reducing HIV incidence.

Given this growing landscape of HIV prevention technologies and the increasing application of quantitative preference elicitation methods to HIV-related topics, this review summarizes the current literature and identifies its methodological norms. We also advance the critical appraisal of reporting on stated-preference experiments by administering both the PREFS checklist (purpose of study, respondent sampling, explanation of assessment methods, findings, significance testing) [15] and a customized Newcastle–Ottawa tool [16].

2 Methods

2.1 Inclusion and Exclusion Criteria

2.1.1 Populations

We included populations of any age or sex, or other stakeholders concerned with preventing HIV transmission. Preferences of people living with HIV were included if the target intervention was prevention, e.g., PMTCT, but otherwise excluded.

2.1.2 Interventions

We considered HIV prevention technologies either currently available or under development: PrEP in any form (tablet, ring, injectable); topical microbicides; cervical barriers and intravaginal devices; clean injecting equipment; MAT; voluntary testing and counseling; HIV self-testing; VMMC; PMTCT; PEP; and external (male) and internal (female) condoms. We excluded prevention technologies used for purposes other than prevention of HIV (condoms only as contraception), unless preferences were elicited for the product in combination with its HIV prevention use. We excluded preference elicitation concerning strictly behavioral interventions without the aid of a biomedical, biotechnological, or pharmaceutical intervention.

While HIV testing is not itself a prevention intervention, we included it in this review for two reasons. First, there are technological advances in HIV testing, particularly newer rapid tests and blood- and saliva-based self-test kits, which make them a popular subject of preference elicitation methods. Second, HIV testing is considered an essential part of the HIV prevention and treatment cascades, as an initial step in primary and secondary prevention [7, 8]. While HIV tests in and of themselves do not biomedically prevent HIV, as VMMC and PrEP do, testing is nevertheless an important part of the HIV prevention technology landscape.

2.1.3 Types of Studies

We included papers that used quantitative preference elicitation in their study design. We excluded papers that describe preferences through solely qualitative methods. The definition of quantitative preference elicitation methods (i.e., stated-preferences methods) followed Soekhai et al. [17], and included discrete-choice experiments (DCEs); conjoint analysis (CA); best–worst scaling (BWS) cases 1, 2, and 3; willingness to pay (WTP); willingness to accept, try, or participate (WTA); contingent valuation (CV); Q-methodology; control preferences scale; self-explicated methods; analytic hierarchy processes; standard gamble; time trade-off; person trade-off; test trade-off; starting known efficacy; constant sum scaling; repertory grid method; threshold technique; swing weighting; visual analog scale; point allocation; and outcome prioritization methods. Studies that used only Likert scales, rating, or ranking were excluded. We classified study types by how they were named by the studies’ authors, with some exceptions. First, if the study did not use any of the above terms (e.g., called it simply a “choice experiment”), the reviewers (SWB, JFPB, NC) discussed the description of the elicitation method and determined which it fit most closely. Second, the authors of a small number of studies labeled experiments CA, but the description of the methods indicated they used DCE methods. We reclassified these as DCEs [18].

2.1.4 Outcomes

The co-primary outcomes were (1) the prevention technology type investigated (e.g., PrEP); and (2) the preference elicitation method applied (e.g., DCE).

2.2 Search Strategy

With the assistance of an informaticist, we developed the search strategy and conducted searches in November 2017, with updates in March 2019 and May 2020. We developed search terms for PubMed, then adapted as relevant for Web of Science and CINAHL databases (see Appendix). We used a combination of controlled vocabulary and keywords for (1) human immunodeficiency virus, (2) individual prevention technologies, and (3) preference elicitation methods, with no restrictions on language, publication date, or geography. The PROSPERO registration is #CRD42018087027. We hand searched references from selected studies and review articles to identify any additional relevant studies.

2.3 Screening and Review

Data screening and review were conducted in Covidence (Veritas Health Innovation, Melbourne, VIC, Australia). We conducted dual screening of titles and abstracts (SWB, NLC); one or two affirmative votes resulted in a full-text review. Two readers (SWB, NLC) independently reviewed full texts to confirm inclusion. We excluded records that were only abstracts or for which the full text was otherwise unavailable. The two reviewers discussed disagreements to reach a consensus, consulting a third reviewer (JFPB) if necessary.

2.4 Data Abstraction

SWB abstracted the following information, with unclear information reviewed by another author (NLC or JFPB): author, year of publication, title, location, population, sample size, first author affiliation, funding source(s), quantitative preference elicitation method, HIV prevention technology, number of attributes, strategies for attribute identification, data collection tool, statistical analysis approaches, and summary of main findings.

2.5 Quality Assessment

NLC rated study quality using two quality assessment tools. (1) The PREFS checklist, an existing and widely used tool [19,20,21] to assess reporting on factors that may influence the validity of patient preference studies: purpose of study, respondent sampling, explanation of assessment methods, findings, and significance testing, with one point each (score 0–5) [22]. (2) The Newcastle–Ottawa Scale (NOS) for cross-sectional studies [16], here adapted to reflect quality considerations for patient preference studies (see Appendix). The NOS contains seven items (representativeness, sample size, non-respondents, exposure assessment, comparability, outcome assessment, and statistical tests). The NOS employs a rating system that uses both letter grades (A, B, C) and stars (*), where more stars indicate higher quality. For this review, we dropped the “outcome assessment” category because outcomes are always collected via self-report in patient preference studies; this is a feature of preference work rather than a flaw. Our adapted version replaced “exposure assessment” with “validity of preference elicitation” (validated measurement tool used, or a clearly explained and rigorous instrument development process). We also replaced “comparability” with “heterogeneity” (whether preference differences had been explored across sub-groups or segments of respondents). The highest possible score was eight stars (following the original NOS, “validity” and “heterogeneity” could earn up to two stars each).
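
The two scoring schemes described above can be expressed compactly in code. The following Python snippet is a minimal sketch rather than the scoring procedure actually used in the review: the function names and example ratings are hypothetical, and the one-star cap on the items other than “validity” and “heterogeneity” is an assumption chosen to be consistent with the stated eight-star maximum.

```python
# Illustrative sketch only. Item names follow the text; function names and
# example ratings are hypothetical.

PREFS_ITEMS = ("purpose", "respondents", "explanation", "findings", "significance")

# Maximum stars per item in the adapted NOS. "validity" and "heterogeneity"
# can earn up to two stars (as stated in the text); the one-star cap on the
# remaining items is an ASSUMPTION consistent with the eight-star maximum.
NOS_MAX_STARS = {
    "representativeness": 1,
    "sample_size": 1,
    "non_respondents": 1,
    "validity": 2,
    "heterogeneity": 2,
    "statistical_tests": 1,
}


def prefs_score(ratings):
    """Return the PREFS score (0-5): one point per checklist item met."""
    return sum(1 for item in PREFS_ITEMS if ratings.get(item, False))


def nos_score(stars):
    """Return the adapted NOS score, capping each item at its maximum stars."""
    total = 0
    for item, cap in NOS_MAX_STARS.items():
        awarded = stars.get(item, 0)
        if awarded < 0:
            raise ValueError(f"negative star count for item {item!r}")
        total += min(awarded, cap)
    return total


# Hypothetical study: meets four PREFS items and earns six NOS stars.
example_prefs = {"purpose": True, "respondents": True, "explanation": True,
                 "findings": True, "significance": False}
example_nos = {"representativeness": 1, "sample_size": 0, "non_respondents": 1,
               "validity": 2, "heterogeneity": 1, "statistical_tests": 1}

example_prefs_total = prefs_score(example_prefs)
example_nos_total = nos_score(example_nos)
```

Keeping the per-item caps in a single dictionary makes the eight-star ceiling explicit and easy to audit against the adapted tool in the Appendix.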

We also conducted an exploratory analysis to describe variations in study quality across publication date and by method of preference elicitation. We created binary PREFS and NOS scores based on a median split of each variable: studies with PREFS scores of 1, 2, and 3 were described as “low-quality PREFS” and studies with PREFS scores of 4 and 5 as “high-quality PREFS.” Studies with NOS scores lower than 6 were described as “low-quality NOS” and those with NOS scores of 6 or greater were described as “high-quality NOS.” T-tests of binary PREFS and NOS variables and years since publication explored differences in study quality over time. Chi-squared tests explored differences in distribution of high- and low-quality NOS and PREFS scores for each of the methods (BWS, DCE, CA, CV, WTP, WTA). All analyses were conducted using Stata Version 16 (StataCorp. 2019. Stata Statistical Software: Release 16. College Station, TX: StataCorp LLC.).
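
As an illustration of this exploratory analysis, the Python sketch below dichotomizes quality scores at the cutoffs given in the text and computes a Pearson chi-squared statistic for a method-by-quality table. It is not the Stata code used for the review: the study records are invented for demonstration, the helper names are hypothetical, and the p-value (which would come from a chi-squared distribution with the returned degrees of freedom) is deliberately not computed.

```python
# Illustrative sketch only: the study records below are made up.

PREFS_CUTOFF = 4  # PREFS scores of 4 and 5 are "high quality" (from the text)
NOS_CUTOFF = 6    # NOS scores of 6 or greater are "high quality" (from the text)


def binarize(score, cutoff):
    """Label a score as 'high' if it meets the cutoff, otherwise 'low'."""
    return "high" if score >= cutoff else "low"


def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c table.

    `table` is a list of rows of observed counts. All row and column totals
    are assumed to be non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical (method, PREFS score) records.
studies = [("DCE", 5), ("DCE", 4), ("DCE", 3), ("CA", 2), ("CA", 4), ("CA", 3)]

# Cross-tabulate method by binary quality: rows are methods, columns low/high.
methods = sorted({method for method, _ in studies})
table = [
    [sum(1 for m, s in studies if m == method and binarize(s, PREFS_CUTOFF) == label)
     for label in ("low", "high")]
    for method in methods
]

statistic, dof = chi_squared(table)
```

With sparse cells, as would be likely for a method with only two studies, an exact test may be more appropriate than the chi-squared approximation.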

3 Results

Searches yielded 6944 records, and 5022 titles and abstracts to screen after we removed duplicates. After a full text review of 318 articles, 84 manuscripts were eligible and included. The most common reason for exclusion at the full-text review was that the methods did not meet the definition of quantitative preference elicitation (k = 162). See Fig. 1 for the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) diagram. Study characteristics are found in Table 1. Note: k refers to the number of included studies, and n refers to the number of participants within the studies.

Fig. 1 PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) diagram. HIV human immunodeficiency virus, quant. pref. quantitative preference

Table 1 Study characteristics and quality review

3.1 Study Descriptions and Methodologies

3.1.1 Preference Elicitation Methods

The most common preference elicitation methods used were DCE (28, 33%), CA (21, 25%), WTA (18, 21%), and WTP [including CV] (15, 18%). Only two used BWS [23, 24], and no other preference elicitation methods were found. The oldest article was from 1998; it used WTA and WTP direct elicitations to examine women’s preferences for vaginal antimicrobial (sexually transmitted infection [STI]/HIV) contraceptives [25]. There was a clear increase in the number of publications in the past decade, with 83% of the included studies published in 2010 or later; the median publication year was 2016. Of the 15 studies published in 2019 and 2020, 12 were DCE (see Fig. 2a, b).
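
The shares reported above follow directly from the study counts. As a quick check, the short Python sketch below recomputes them from the counts given in the text; the variable and function names are illustrative only.

```python
# Counts of included studies by elicitation method, as reported in the text.
TOTAL_STUDIES = 84
method_counts = {
    "DCE": 28,
    "CA": 21,
    "WTA": 18,
    "WTP (including CV)": 15,
    "BWS": 2,
}


def percent_share(count, total):
    """Return the share of `count` in `total` as a whole-number percentage."""
    return round(100 * count / total)


method_shares = {
    method: percent_share(count, TOTAL_STUDIES)
    for method, count in method_counts.items()
}

# The method counts should account for every included study.
counts_sum = sum(method_counts.values())
```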

Fig. 2 Publications by year and preference elicitation method (a), and publications by year and human immunodeficiency virus prevention method (b). CV contingent valuation, PMTCT prevention of mother-to-child transmission, PrEP pre-exposure prophylaxis, VMMC voluntary medical male circumcision

3.1.2 Attribute Identification

The range in the number of attributes was 1–11 (mean 4.14, median 5; excluding three studies where the number of attributes was unclear). Excluding studies that had only one attribute, i.e., WTP (26, 31%), and studies where the number of attributes was unclear (3), the mean number of attributes was 5.98 and median was 6. A Task Force on Good Research Practices for Preference Elicitation Experiments recommended a variety of strategies (literature review, qualitative formative work, consultation of expert opinion, pre-testing [e.g., face-to-face qualitative interviews or cognitive interviewing on the instrument], and pilot testing [administration of the instrument to a subset of the final sample]) to identify attributes and levels for preference elicitation experiments [26]. Many of the studies (36, 43%) did not report any of these (58% of those were investigating willingness). Only 42% clearly reported conducting literature reviews, 45% formative (usually qualitative) research, 24% expert consultations, 8% pre-testing, and 20% pilot testing (see Fig. 3).

Fig. 3 Attribute identification strategies employed, recommended by ISPOR [26]

3.1.3 Data Analysis

The included studies reported a variety of methodological approaches and statistical analysis methods, and because the various elicitation methods require different analyses, we consider DCE, BWS, CA, and willingness studies separately in this section. Note that appropriateness of statistical tests and measurement of preference heterogeneity are included in the Quality Review section and Table 1.

DCE

Of the 28 studies that employed DCE, several base statistical models were reported, including conditional logit (4, 14%) [27,28,29,30], multinomial logit (5, 18%) [31,32,33,34,35,36], ordinary least-squares regression (2, 7%) [37, 38], and logistic regression (1) [39]. Three (11%) reported relative importance scores [32, 40, 41]. Several studies also conducted mixed-logit (8, 29%) [33, 34, 42,43,44,45,46,47], nested-logit (3, 11%) [35, 36, 48], or random parameters logit (7, 25%) [32, 35, 49,50,51,52,53] (some studies reported more than one model). Most of the DCE studies (24/28, 86%) further measured preference heterogeneity using stratification/subgroup analyses and/or interaction terms (17, 61%), or latent class analysis (6, 21%) [29, 33, 40, 46, 51, 52]. One study used a blocked design with two different versions of the DCE for the two a priori sub-groups (past HIV testers vs non-testers) [47].

BWS

The two BWS studies used different types of analyses. Kim et al. reported relative and mean BWS scores, paired t-tests, and linear regression [23]. Newman et al. used multinomial logit and rank-exploded logit [24]. Both also used stratification to measure preference heterogeneity.

CA

Of the 21 CA studies, analysis methods included individual-specific, mean, and/or relative impact scores (7, 33%) [54,55,56,57,58,59,60]; mean ratings (5, 24%) [61,62,63,64,65]; relative importance scores or interest values (4, 19%) [64,65,66,67]; modified Borda count method (1) [68]; part-worth utilities (4, 19%) [64,65,66, 69]; hierarchical Bayes estimation (2, 10%) [70, 71]; and linear regression (1) [72]. To measure preference heterogeneity, two CA studies used subgroup analyses or stratification [69, 70], and one used latent class analysis [66].

Willingness/CV Studies

There were 33 studies that measured WTP/WTA, including eight that used CV approaches to WTP. A variety of statistical approaches were used in these studies, most commonly logistic regression (13, 39%) [73,74,75,76,77,78,79,80,81,82,83,84,85] and chi-squared tests (including Pearson’s and Fisher’s exact) (12, 36%) [24, 25, 73, 75, 80, 83, 86,87,88,89,90,91]. T-tests (5, 15%) and analysis of variance (4, 12%) [24, 86, 88, 91] were also employed, as well as other types of regression (linear [92, 93], count [94], log binomial [95], ordinary least squares [96], and logit link [97]). Three studies used Mann–Whitney U tests (9%) [81, 93, 98]. One study each reported using the area under a receiver operating characteristic curve [76], maximum likelihood double-bounded, dichotomous choice CV model [99], Wilcoxon signed-rank [100], Kruskal–Wallis [24], and Bernoulli response [97]. Two CV studies [94, 99] and one WTA study [84] used stratification to measure preference heterogeneity.

3.1.4 HIV Prevention Technologies

The most common HIV prevention technologies were PrEP (19, 23%), voluntary testing and counseling (16, 19%), HIV self-testing (14, 17%), vaccines (13, 15%), and topical microbicides (8, 9%). Less commonly investigated were PMTCT (5, 6%), VMMC (4, 5%), and condoms (2, 2%). Two studies were classified as “other”: one was a needle-stick prevention device [74], and one had multiple prevention technologies (external and internal condoms, topical microbicides, VMMC, PrEP, and vaccine) [87]. The highest number of publications in one year was in 2019 (13, 15%), with five of those on PrEP and six on HIV testing. PrEP, PMTCT, and VMMC studies were all published in the past decade, and no vaccine preference studies were published after 2013. See below for a synthesis of the included attributes and main findings.

3.1.5 Populations

Included studies were conducted in 25 countries on five continents. Most were conducted in sub-Saharan Africa (SSA) (45, 54%) and North America (30, 36%). There were ten (12%) in Latin America and the Caribbean, ten (12%) in Asia (six of those in Thailand), and only six (7%) in Europe (UK and Ukraine only). The USA (27, 32%) and South Africa (14, 17%) had the greatest number of studies.

The total number of participants in all studies was n = 61,446, mean n = 768, median n = 374, and range 26–7176 (not double counting the same participants reported in more than one publication [37,38,39, 44, 45, 101]). The total number of participants in the DCE/BWS studies was n = 14,866, mean n = 495.5, median n = 347, and range 26–2033. The total number of participants in the CA studies was n = 9482, mean n = 451.5, median n = 260, and range 27–1790. The total number of participants in the WTA/WTP/CV studies was n = 34,241, mean n = 1037.6, median n = 460, and range 60–7176.

Of the participants, 68% were male (n = 41,562), 31% were female (n = 19,017 cisgender female, and n = 261 transgender female), and 1% of participants did not report sex/gender (n = 603). About half of the studies (45, 54%) included key populations at high risk for HIV (e.g., sex workers, men who have sex with men [MSM], clients of STI clinics); six (7%) included antenatal care clients, and two included both high-risk and general populations. The rest (29, 35%) were conducted among general populations, though 15 (52%) of those studies were in high HIV prevalence countries where segments of the general population are at elevated risk (e.g., young women in Kenya). One study was among healthcare workers (WTP for a needle-stick prevention device) [74], and one among informal caregivers of women living with HIV (WTP for PMTCT-related caregiving, and to be paid for caregiving) [100].

3.1.6 Survey Administration

Thirty-two studies (38%) were paper surveys, 27 (32%) computer-assisted self-interview/personal interviewing software or tablets, 11 (13%) online, and one telephone interview. The remaining studies (13, 15%) did not report this information. The majority (54, 64%) had facilitators present to either conduct the surveys or assist with computer-assisted self-interview/personal interviewing software, 20 (23%) did not, and 11 (13%) did not clearly state this information.

3.1.7 Author Affiliation and Funding

The majority of the studies were conducted by investigators at universities (71, 84%), with others conducted by non-governmental organizations and institutes (8, 9%), hospitals (3, 4%), industry (1), or government health programs (1). Funders of included studies were largely government research institutions [e.g., National Institutes of Health in the USA] (48, 57%) and government development agencies [e.g., Department for International Development in the UK] (13, 15%). Private foundations [e.g., Bill and Melinda Gates Foundation] (10, 12%), research institutes (5, 6%), intergovernmental organizations [e.g., United Nations] (7, 8%), and non-governmental organizations (4, 5%) comprised the remaining funders. Ten studies (12%) did not report funding.

3.2 Synthesis of Included Attributes, Levels, and Main Findings

Presented below is a qualitative synthesis of the attributes as well as the key findings for each HIV prevention modality. We grouped the attributes into three main categories: product design, service design (or program implementation), and willingness/acceptability. See Table 2 for a list of attributes and levels by HIV prevention technology. A handful of studies fell outside these categorizations; see below.

Table 2 Attributes and levels used by included studies in prevention technology

3.2.1 Pre-Exposure Prophylaxis

Nineteen studies examined PrEP [33, 34, 36, 40, 43, 46, 49, 51, 61, 64,65,66, 70, 71, 78, 79, 83, 90, 91, 101].

Product Design

The most common PrEP product design attributes were frequency of dosing (daily, monthly) and product formulation, which necessarily overlap (pills: daily; injections: monthly). Some studies found that dosing frequency was unimportant compared to other attributes [43, 61, 79], while other studies showed preferences for daily pills [33, 40, 65, 71], for monthly or bimonthly injections or implants [51, 70, 91], or for on-demand dosing (with sexual encounters) of rectal gels [64, 79]. Formulation (pills, injections, rings, gel) was found to be variously as important as efficacy [29], even more important than efficacy [64], or secondary to efficacy [91]. In studies that considered the vaginal ring, the ring rated poorly compared with other formulations, and was sometimes the least preferred [33, 34, 51, 64, 83]. Other product attributes included side effects, efficacy, additional protection from pregnancy or STI (pregnancy prevention was valued; STI prevention was valued among female sex workers [FSW]) [46, 49, 83]; and changes to menstruation [51].

Subgroup analyses showed heterogeneity in formulation preferences. Some MSM in the USA and Ukraine preferred injections, others on-demand pills, and others were uninterested in PrEP [40, 66]. In Uganda, while pills were overall preferred, women who used alcohol, experienced intimate partner violence in the past year, and did not make household decisions preferred implants and injections over pills [33]. Both general population women and FSW in South Africa preferred injections, while men cared little about formulation [46].

Service Design

The most common PrEP service design attributes were cost (US$0–250; out of pocket; insurance coverage) and location of dispensing site (HIV clinic, drop-in center, mobile clinic). Cost was less important than formulation to some [64], but it was the most important attribute in other studies [61, 65]. In subgroup analyses of MSM in the USA, cost was important in every group, while the importance of other attributes varied [40]. For dispensing sites, FSW in Malawi most valued family planning clinics or drop-in centers [43], FSW near the US-Mexico border preferred healthcare clinics vs non-governmental organizations [64], MSM in Thailand and the USA preferred pharmacies [40, 71], and people who used drugs in the USA preferred HIV clinics over drug treatment clinics [65]. Dispensing site was the most important attribute in African countries in a multinational study; dispensing at HIV treatment clinics was highly devalued, which the authors hypothesized was related to HIV stigma [70]. Frequency of HIV testing while using PrEP was examined in three studies; semi-annual testing was preferred over more frequent testing in all three [65, 70, 71].

Willingness

Two studies measured WTP for PrEP. Both were conducted in high-income countries (Canada, Scotland). Scottish men were willing to pay £50 monthly for PrEP, but a high proportion of Canadian respondents would only use it if covered by insurance [78, 90].

Risk Compensation

One study measured potential risk compensation during PrEP use among FSW in South Africa (they and/or their clients may be less willing to use condoms or accept/pay more for condomless sex). The authors found that PrEP use may negatively impact willingness to use condoms, and increase difficulties FSW already face in negotiating condom use [34].

3.2.2 Voluntary Testing and Counseling

Fifteen studies investigated combinations of voluntary testing and counseling product and service design [24, 28, 44, 45, 50, 53,54,55, 86, 88, 93, 95, 96, 99, 102].

Product Design

The most common product attribute was fluid sampling method (oral fluid swab, finger prick blood, venipuncture blood). Generally, oral fluid-based sampling was preferred, then blood-based sampling through a finger prick, with venipuncture in the arm the least preferred. In Tanzania samples, however, oral fluid-based sampling was highly disliked, with preference for finger, then venous, blood collection [44, 45]. One study found no significant difference in preferences for finger prick vs arm collection [55], and another found that participants were willing to accept a finger prick over an oral swab if it were cheaper, quicker, and also diagnosed other STIs [86]. One study considered STI testing along with HIV testing, with attributes of accuracy of syphilis results (potential for false positive) and the number of blood draws required (one or two). The number of blood draws mattered, but cost had the greatest impact [54].

Service Design

The most common attribute was cost (US$0–4 in lower income settings, US$ ≤ 30 in high-income settings); generally, participants preferred free testing [28, 50, 54, 55]. Other common attributes involved logistical issues: location (home, clinic) and distance to services [28, 44, 50, 55]; evening or weekend hours [44]; length of the visit [28]; and time to get results [50, 54, 55]. Testing at home and immediate access to results were preferred. A closer distance mattered greatly in Tanzania, where it was more important than confidentiality and sampling method [44]. Two studies assessed alternatives to expand HIV testing: providing testing at dental offices (high acceptability of an oral test, but only for free [88]); and distributing tests at community events [53]. Multiple studies compared men and women, seeking strategies to increase men’s access to HIV testing. These findings suggest more men may be reached with home-based, counselor-administered (vs nurse or self), immediate, and same-site access to antiretroviral therapy, bundling with other medical screenings (diabetes mellitus, tuberculosis), and providing monetary incentives [28, 45, 53].

Willingness

Six studies assessed WTP for testing, or to be paid to test (as an incentive), or to use a free HIV test. Willingness to pay ranged from nothing (willingness to be tested for free only), to medians of US$2–8 (lower income countries), and US$25. Two studies assessed WTA testing in exchange for a payment. In the Tanzanian general population, the mean estimate was about US$3, though there were differences by sex (men were willing to accept testing only at higher payments compared to women) and risk characteristics [99]. In Uganda, being paid to get tested influenced men, but not women, though both groups wanted free testing [28].

3.2.3 HIV Self-Testing

Fifteen studies examined HIV self-testing [29, 30, 32, 35, 41, 47, 68, 75, 77, 80, 82, 84, 89, 97, 98].

Product Design

The most common product design attribute was sampling method (oral vs blood). The preferences were mixed, with some preferring oral sampling, and others finger prick blood sampling [30, 41, 47, 86]. Accuracy was also considered (99% preferred over 95%) [29, 84], the length of the “window period” when one would get a false negative [29, 77, 98], and combining STI testing in the same test cartridge [86].

Service Design

The most common service design attribute was location (facility, mobile, home) [29, 30, 32, 35, 41, 47, 75, 84]. Findings were mixed, with some preferring home-based testing and others clinic-based testing. Provider type (self vs community health worker) and characteristics (age, from the same or different village) [30, 32, 41, 89], and pre- and post-test counseling or support (by phone, leaflet, in person) also mattered, with some preferring tests conducted by or with providers or counselors, others comfortable with testing alone, and others indifferent [30, 41, 47, 68, 77]. Cost mattered, with the strongest preference for free test kits [29, 30, 32, 41, 47, 75, 86]. In sub-group analyses, never-testers preferred oral over finger prick tests, while ever-testers were indifferent to type [30]. A study in MSM (UK) found two main classes of potential self-testers: the majority were comfortable with testing, and preferred provider-based testing; the minority, who had riskier sexual behaviors, preferred self-testing, and for free [29].

Willingness

An additional seven studies investigated willingness/acceptability of self-testing. These studies found high willingness to use a self-test, including if offered at a dental office, mailed home, or when given a monetary incentive (any amount). Two studies examined subgroups, finding that both ever-testers and never-testers were willing to use self-tests, and that men, ever-testers, and people with higher incomes were more willing to use self-tests. One study examined WTP by group, finding that HIV testing clients were willing to pay more (US$8.10) than MSM (US$6.50) and FSW (US$4.80). Dental clients were willing to pay US$10–15, but not more [89]. Two studies also investigated if self-testers would return results to providers (only about half were willing, in one study) [80].

3.2.4 Vaccine

Thirteen studies investigated vaccines; nine were CAs on product design and design of hypothetical or actual randomized controlled trials [31, 56,57,58,59,60, 63, 72, 103], and four on willingness to participate in those trials or pay for a future vaccine [73, 76, 94, 104].

Product Design

Efficacy was in eight of nine studies and was the most important attribute; high efficacy (95–99%) was always in the highest rated vaccine profile [31, 57,58,59,60, 63, 72, 94, 103]. Side effects was a common attribute; lack of side effects was rated highly across studies [31, 56,57,58,59,60, 63, 94, 103]. Several included the risk of vaccine-induced seropositivity (VISP) [false-positive result due to the vaccine], or length of time of VISP (months, years); not experiencing VISP was highly rated [31, 56, 59, 103], and was more important than efficacy in one study [59]. One study assessed the risk of vaccine-induced HIV infection (true positive result), and this was the most important attribute in that study [56]. Duration of protection was also common (10 years vs lifetime; longer preferred) [31, 58,59,60, 63, 94, 103]; number of doses (one to four; fewer weakly preferred) [56,57,58, 60, 63, 103]; and formulation (oral vs injection; not significant) [57, 58, 60, 63].

Trial and Service Design

Studies investigated location of administration [31, 59], proportion of population already vaccinated [31, 59], duration of trial [56], and reimbursement amount for trial participation, but these were not strongly rated either way [56]. Availability of free treatment if seroconversion occurs because of trial participation was important, but less so than vaccine-induced infection and VISP [56]. A few studies also assessed preferences around single or multiple strains (clades) protected against in a vaccine (results mixed) [57, 58, 63, 103], and the cost of a future vaccine (price mattered less compared to other attributes) [31, 57,58,59,60, 63, 72, 103].

Willingness

Four studies investigated willingness to participate in a vaccine trial or to pay for a vaccine [73, 76, 94, 104]. One study asked participants about their willingness to participate at four timepoints leading up to an actual trial; greater stated willingness to participate predicted later enrollment [76]. In two CV studies, respondents in Mexico were willing to pay around US$4670 for a vaccine with lifetime protection, and respondents in Thailand were willing to pay US$610–809 depending on effectiveness (50% vs 95%) [94, 104].

3.2.5 Topical Microbicides

Nine studies investigated vaginal or rectal topical microbicides [25, 36, 62, 67, 69, 85, 105,106,107]. (One was on intravaginal devices, and included microbicides [36].)

Product Design

Attributes included efficacy [36, 62, 105, 106]; side effects [25, 62, 69, 106]; frequency of use/dosing [105, 106]; additional prevention of pregnancy and/or STI [36, 67, 69]; and potential for partner awareness of use (e.g., leaves residue on partner) [25, 36, 67]. Additional attributes were wait time between application and coitus (minutes, hours) [67, 106, 107], using an applicator (yes vs no, reusable vs disposable) [25, 106], and formulation (including shape, color, consistency) [25, 36, 62, 67, 85, 105, 107]. Summarizing across studies, participants would prefer a highly efficacious product that used an applicator, did not require daily use, had no side effects, had shorter wait times after application, prevented pregnancy, and could be used covertly from partners. A highly effective product was important to participants, but there was some willingness to trade lower efficacy for other desirable attributes (e.g., no side effects).

Service Design

The only service-oriented attributes for microbicides were cost (US$0.30–7.60 per dose) [25, 36, 62, 105] and requiring a prescription vs over-the-counter availability [62, 105, 106]. Participants generally preferred a lower cost per dose (US$0.30), but one study found a greater WTP for high effectiveness, and a moderate amount more for a gel over a suppository and for intermittent vs daily use [105]. Findings on prescription vs over-the-counter were mixed across studies.

3.2.6 VMMC

Four studies investigated VMMC; three in South Africa were on different aspects of the same program, and the other was in Tanzania [37,38,39, 52]. None investigated product design (for example, PrePex device), but rather investigated service design [37, 38, 52] and motivations for accessing VMMC and risk compensation (“condom avoidance”) [39].

Service Design

Attributes in both the Tanzanian and South African studies were provider characteristics (e.g., sex), waiting area characteristics (e.g., separation of younger boys and grown men), and HIV testing requirements (opt-out, opt-in) [37, 52]. In South Africa, follow-up visit, low infection rate, less post-operative pain, and outside waiting areas were important [37]. One South African study evaluated preferences for promotion of a VMMC program. Participants preferred television marketing, support from church/school leaders, and endorsement by a national program [38]. In Tanzania, men preferred separate waiting areas for age groups, opt-out HIV testing, male-only service providers, counseling for female partners, and a monetary incentive in the form of a transport voucher vs a lottery [52].

Risk Compensation

Last, one of the studies in South Africa investigated whether risk reduction (“reduces cervical cancer risk in women; reduces STI risk, reduces HIV risk”); risk compensation (“no condom needed”); or other features of VMMC (“improves hygiene, more enjoyable sex”) were motivators for accessing the service. This study found that risk reduction effects were the strongest motivators, with no evidence of motivation by condom avoidance among Black and Coloured populations, but evidence of this among White South Africans [39].

3.2.7 PMTCT

Five studies considered PMTCT: two evaluated product and service design attributes, one evaluated willingness, and two investigated motivations to seek PMTCT services [23, 27, 42, 48, 100].

Product Design

One study investigated product design of PMTCT, including aspects of safer conception (which can be technology assisted). This considered both serodiscordant and seroconcordant couples’ preferences for risk of vertical transmission (to the child) and horizontal transmission (to the partner), conception method, and chance of conception per cycle. Serodiscordant couples (female HIV positive, male negative) expressed decreased utility with risk of any transmission, and preferred home insemination with a syringe to other conception methods. Seroconcordant couples expressed decreased utility with risk of vertical transmission; they preferred clinic-based conception methods over other methods (e.g., condomless sex timed with ovulation) [48].

Service Design

Two studies considered cost, which was important to participants in both [23, 42]. Trust in provider [23] and respectful attitude of provider were strongly preferred attributes, with provider attitude two to three times more important than other attributes [42]. Co-location of other healthcare services was also important, but facility type did not influence participants [42].

Willingness

One study investigated willingness to pay for, and willingness to accept payment to provide, informal care around PMTCT in Malawi. Willingness to pay for such services was US$13, while willingness to accept payment to perform such services was US$30 [100].

Motivators/Values

Two studies evaluated trade-offs of motivations or priorities placed on preventing HIV in participants’ partners and/or children. These included statements such as antiretroviral therapy “keeps me healthy for my family,” “prevents me from getting sick from infections,” and “prevents infant’s illness.” Both studies compared participants in antepartum and postpartum periods, finding slightly different priorities. Living a long life was more highly rated during the antepartum period, while keeping CD4 count high was more important postpartum. However, trust in providers was highly prioritized in both periods [23, 27].

3.2.8 Other/Multiple Technologies

Four studies fell outside of the HIV prevention technology categories summarized above (not in Table 2). One investigated WTP for condoms. It found that people who access free condoms in East and Southern Africa were willing to pay for them, even more than the average cost in their countries, and that brand mattered [92]. Another found that women in the USA were willing to try new types of condoms and lubricants if they increased their sexual pleasure [81]. The only study that considered healthcare workers’ prevention needs investigated WTP for a needle-stick prevention device [74]. Last, one study compared WTA various HIV prevention modalities (topical microbicides, VMMC, PrEP, vaccine, and external and internal condoms) before and after an intervention (an educational presentation on prevention options). External condoms were preferred both pre- and post-intervention, but PrEP, microbicides, VMMC, and internal condoms were more highly rated post-intervention [87].

3.3 Quality Assessment

The mean PREFS score was 3.47/5 (standard deviation: 0.81; range 2–5) [Table 1]. Nearly all studies (86%) included their preference imperative in their explanation of the purpose of the study (“purpose”). Only 14% of studies included an explanation of how the respondents varied from non-respondents (“respondents”). Almost all (99%) sufficiently explained the preference assessment method (“explanation”), 60% of studies met criteria for reporting their findings (“findings”) and 89% used appropriate statistical tests (“significance”).

The mean number of stars awarded for our adapted version of NOS was 4.5/8 (standard deviation = 2.1, range 2–8) (Table 1). For representativeness, 89% of studies received at least one star. For heterogeneity analysis, 82% of studies received at least one star, and 55% of studies received two stars. All the studies received at least one star for exposure assessment, and 43% earned two stars. Only 27% of studies justified the sample size. Furthermore, as also captured by PREFS, only 14% of studies included an explanation of how the respondents varied from non-respondents and 89% used appropriate statistical tests.

Publication year was not significantly associated with either having a high- or low-quality PREFS (p = 0.68) or NOS score (p = 0.64). The proportion of high- and low-quality studies as measured using either PREFS or NOS did not vary among BWS, CA, CV, WTP, or WTA studies (all p > 0.05). Studies using DCE were more likely to have a high-quality PREFS score (p = 0.049) and NOS score (p = 0.001) as compared to other methods.
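As an illustration only, a comparison of this kind (method used vs high- or low-quality score) can be expressed as a test on a 2 × 2 table. The sketch below assumes Fisher's exact test, which is not necessarily the test used in this review, and the counts are invented toy values rather than data from the included studies. The function name `fisher_exact_two_sided` is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows could be, e.g., DCE vs other methods; columns high vs low quality.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one
    # (small tolerance guards against floating-point ties).
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )


# Toy counts (NOT from this review): 8 of 10 studies in one group and
# 1 of 6 in the other are rated high quality.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
print(round(p_value, 3))  # 0.035
```

With real data, the four cell counts would be taken from the quality scores in Table 1 after dichotomizing each study as high or low quality.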

4 Discussion

This systematic review summarizes the application of quantitative preference elicitation techniques to measure preferences for HIV prevention technologies. With 84 included studies covering a range of methods and prevention technologies, this review provides a synthesis of how these methods are used and applied in HIV prevention, trends in attribute identification, survey development, and analysis, as well as findings for HIV prevention program implementers. Next, we present some key findings and situate these in the literature.

Our study overlaps with another recent systematic review by Humphrey et al. [13]. That systematic review assessed preference elicitation methods applied to healthcare workers’ preferences and priorities in SSA, and included 77 articles, more than half of which used DCEs [14]. The present study differs from that review in important ways. First, Humphrey et al. limited the preference methods to DCE, CA, and BWS, while the present study had broader inclusion of elicitation methods. Second, our study specifically targeted HIV prevention technologies (e.g., PrEP, self-testing devices, vaccines), while Humphrey et al.’s review included any HIV-related topic, including prevention, service delivery, and antiretroviral therapy for those living with HIV. Thus, while several studies were included in both reviews (18, 21% of the 84 in this review), each review presents largely unique findings.

4.1 Preference Elicitation Methods

From the list of 22 eligible preference elicitation techniques, only six types were employed in the included studies (CA, DCE, WTA, CV, WTP, and BWS). Three techniques (CA, DCE, and WTA) accounted for nearly 80% of the included studies. There may be trends in the use of these techniques over time. For example, CA appeared only after 2005, 95% of DCEs were conducted after 2013, and 90% of the most recent ten studies used DCEs.

Classifying studies required careful consideration and discussion between authors. For example, one study did not use any of the pre-determined labels, but rather the term “choice elicitation tasks” [105]; for the purposes of this review, we labeled this BWS (case 3). Further, we re-classified some studies that were published as CA. Upon examination of their methods, we labeled them as DCE for this review, given their use of choice tasks comparing two or more profiles [27, 37,38,39,40]. “True” CA studies involved rating or ranking of profiles individually, rather than in head-to-head comparisons. Conjoint analysis has been used as an umbrella term when referring to DCE or other choice-based methods, but they are distinct methods with disparate underlying theoretical and statistical models [18].

Best–worst scaling techniques were employed in only two papers [23, 105]. The lack of BWS in the included studies may be because BWS is commonly used to understand priorities, while this review focused on trade-offs for specific technologies. Although DCEs can be used to generate some information about the relative importance of attributes, large variation in the calculation and context of attribute importance for DCEs complicates the interpretation of these data [108]. Indeed, the two BWS studies did focus on priorities about a prevention technology (PMTCT) rather than product or service characteristics per se. For most of the included studies, the objective was not to do exploratory work understanding priorities but rather to quantify preferences for hypothetical or actual prevention modalities’ characteristics. A takeaway message is that the field within health and medicine is growing, but investigators are not yet utilizing all the possible preference elicitation tools available. For example, it may be that CA gained rather than lost popularity in HIV preference research, and the use of DCE is growing, but other preference elicitation tools such as BWS and direct elicitation approaches may be useful and less complicated for both respondents and researchers to design and analyze [20].

An interesting finding of the current study was the relatively low utilization of rigorous and largely qualitative survey development/attribute identification methods. Fewer than half of studies reported using any of the established methods of survey development including literature review, qualitative/formative interviews, expert consultation, pre-testing, and pilot testing [26]. There is growing pressure in the patient preference community to develop and apply standards to inform the qualitative development of patient preference studies [109]. We would add that in addition to using qualitative methods to inform the development of such studies, authors should also be transparent in their reporting of the survey development methods used, and that this should perhaps be an additional item for a future quality assessment tool in preference studies.

4.2 HIV Prevention Technologies

This review included HIV prevention technologies that were either currently available or in various stages of pre-market development. While a broad range of technologies were eligible for inclusion, the included studies measured preferences for PrEP, HIV testing, vaccines, topical microbicides, VMMC, PMTCT, and condoms. The hypothetical nature of the product varied and the included studies measured preferences for technologies already readily available (condoms); available in some countries but not necessarily where the study was conducted (oral PrEP, HIV self-testing); available in the study location (VMMC, oral PrEP); or in the research pipeline (injectable PrEP, vaccine). With such diversity, there were also different types of attributes that the participants had to consider. We grouped these broadly under product design (e.g., side effects, modality, shape) and service design (e.g., cost, distance, provider type), with some studies examining a combination of these. This synthesis may be useful to both preference researchers designing preference surveys and HIV prevention implementers seeking to design programs for maximum uptake.

The HIV technologies investigated in the included studies were largely focused on PrEP and HIV testing, perhaps reflecting trends in HIV funding and research priorities [110]. Pre-exposure prophylaxis, for example, in its many modalities, such as the oral pill, long-acting injectable, intravaginal device, or topical microbicide gel, accounted for one in three of the studies. Pre-exposure prophylaxis is the newest of the technologies, and the first biomedical intervention for sexual prevention of HIV since condoms. Despite being approved for use in the USA since 2012, uptake has been low, and persistence lower still [111]. Alternative formulations and modalities of PrEP hold promise for overcoming some of the challenges with PrEP as a daily pill, such as adherence and persistence [112, 113]. Preference elicitation methods are well suited to understanding user preferences for PrEP, and these findings could be used to inform tailored interventions.

The included studies indicated heterogeneity in PrEP modality preferences, with some favoring the daily pill, others on-demand gel, and others injectable forms. This is in line with findings from multi-country studies of actual use of various PrEP products [114]. The vaginal ring was not preferred in any of the studies that investigated it compared to other formulations. However, in studies where women tried various products over months, including the ring, participants found the ring comfortable and acceptable [114, 115]. This suggests that users may need to try and gain familiarity with such novel products, rather than rate them hypothetically.

Traditional testing and HIV self-testing accounted for another third of included studies. This reflects interest in (1) user preferences and acceptability around testing through blood or saliva (findings mixed, with some preferring finger prick blood and some the saliva), and (2) alternative implementation strategies to increase testing, particularly among men in SSA (e.g., home and community-based testing; evening and weekend hours). Men in SSA have been less likely to access HIV-related services, are diagnosed later, and are more likely to die even on antiretroviral therapy, compared to women [116, 117]. Thus, it is important to identify strategies that better meet their needs, as well as the needs of other populations at high risk.

Articles on preferences for future vaccines and actual and hypothetical vaccine trials were moderately common, though none has been published since 2013. Two HIV vaccine trials are ongoing, and one was stopped in early 2020, showing no preventive effect [118]. Prevention of mother-to-child transmission, VMMC, and condoms accounted for most of the remaining third, all aimed at understanding user preferences for existing services in order to adapt those services in hopes of boosting uptake. Only one study focused on HIV prevention among healthcare workers, investigating WTP for a needle-stick prevention device. No studies investigated PEP or substance use harm reduction technologies (syringe exchange, MAT).

In the included studies, about two-thirds of participants were men and one-third were women. Gay and bisexual men and other MSM are at an elevated risk of HIV globally [119], and many of the included studies were among these groups. Many of the included women were young and adolescent women, who are at a particularly high risk in some settings (Eastern and Southern Africa), and many others were engaged in sex work, another highly vulnerable group, depending on context [120]. A tiny proportion (0.4%) of the included participants were transgender. This is despite the fact that transgender women who have sex with men have 49 times higher odds of living with HIV than other adults of reproductive age, globally [121]. No studies reported including transgender MSM, another at-risk, but largely invisible, group [122, 123], nor people with other gender-diverse identities (e.g., non-binary people).

People who use drugs were included in only six studies [57, 58, 65, 70, 76, 93], and no drug use-related HIV prevention technologies were examined (e.g., syringe exchange). Multiple studies were conducted among general populations not at risk of HIV. Investigators should recruit from the populations most likely to use and benefit from the technology or service, should it become available, to uphold the principle of justice in human subject research and ensure the findings are relevant to people most in need of HIV prevention. There was limited geographic representation in the included studies. Nearly 90% were in North America and SSA, largely the USA and South Africa. No studies were conducted in Australia, Oceania, or Middle East/North Africa, and few throughout Latin America and the Caribbean, Asia, and Europe. This may be reflective of the researchers’ and funders’ priorities, as most funding was also from US funding institutions. The USA funds a large amount of HIV-related research and programming in SSA through PEPFAR, given the high HIV burden [124].

4.3 Quality Review

The high endorsement we observed across many of the quality metrics used in this study (e.g., 90% representativeness, 90% significance) indicates a need for quality assessment tools that can better distinguish high- from low-quality studies in preference contexts. That only 14% of studies included an appropriate assessment of how study respondents compared to non-respondents is a significant concern for the external validity of these studies. Though the PREFS checklist is a critical appraisal tool for use in the evaluation of stated-preference surveys [22], going into this review we were aware of existing shortcomings of the checklist. These shortcomings include a lack of discrimination across studies and the absence of important quality criteria that have been refined by the US Food and Drug Administration, including patient centeredness and heterogeneity [125]. Adding the NOS measures strengthened the quality appraisal by allowing for the addition of domains such as sample size justification and representativeness of the sample. However, other important criteria are still missing from preference quality appraisal tools, such as reporting on potential scale differences between samples, patient centricity and relevance of the preference-elicitation questions, and rigor of the instrument development processes.

4.3.1 Limitations

This review has limitations. There is no definitive list of quantitative preference elicitation methods, and we would expect the search results to vary with the methods included. For instance, ranking and rating methods were omitted here as they were not explicitly included in Soekhai and colleagues’ list [17]. However, rating and ranking are included in other prominent lists for preferences, including the Medical Device Innovation Consortium Appendix on preference methods [126]. Including those methods would likely have captured more articles, and perhaps a wider variety of HIV prevention methods. Given this review’s focus on quantitative preference elicitation methods, qualitative studies that examined HIV prevention preferences were excluded. Such studies nevertheless provide valuable information about users’ preferences and a more in-depth understanding of facilitators and barriers to product use but were beyond the scope of this review. This review also only included peer-reviewed literature. Thus, we may have missed studies that were programmatically focused and only in the gray literature, and the review may be subject to publication bias. The quality review of this study used the pre-existing NOS for cross-sectional studies, adapting it for relevance to preference studies. However, this adaptation has not been validated and warrants further exploration of its usefulness.

This review summarizes the quantitative preference elicitation methods, attribute identification approaches, and statistical analyses used to measure preferences for HIV prevention technologies globally. While this literature illuminates preferences and may guide product development and adaptation of services, the extent to which such preference studies impact actual services and products is not well known and is an area for future research. Additionally, future work should continue to develop more rigorous quality assessment tools for use in reporting on stated-preference studies.

5 Conclusions

Human immunodeficiency virus prevention options continue to expand to offer a “menu” of biomedical options that can better meet the needs of disparate communities. However, these technologies only work if people at risk for HIV utilize them during periods of risk; and we know from low levels of condom use that simply because a method works does not mean people will use it. Patient-centered preference elicitation methods can gather information about product and service characteristics that may then be used to influence uptake of HIV prevention methods. This review synthesizes extant literature on these topics and can provide practitioners with important information about preferences that may be used to improve prevention products and services. These methods, packaged together in combination prevention interventions [3] that are tailored to different community needs, will ultimately reduce HIV incidence.