Journal of Inherited Metabolic Disease, Volume 35, Issue 1, pp 169–176

Putting a value on the avoidance of false positive results when screening for inherited metabolic disease in the newborn

Authors

  • S. Dixon
    • School of Health and Related Research, University of Sheffield
  • Phil Shackley
    • School of Health and Related Research, University of Sheffield
  • Jim Bonham
    • Sheffield Children’s Hospital NHS Foundation Trust
  • Rachel Ibbotson
    • Centre for Health and Social Care Research, Sheffield Hallam University
Original Article

DOI: 10.1007/s10545-011-9354-0

Cite this article as:
Dixon, S., Shackley, P., Bonham, J. et al. J Inherit Metab Dis (2012) 35: 169. doi:10.1007/s10545-011-9354-0

Abstract

Despite the increase in the number of inherited metabolic diseases that can be detected at birth using a single dried blood spot sample, the impact of false positive results on parents remains a concern. We used an economic approach, the contingent valuation method, which asks parents to give their maximum willingness to pay for an extension in a screening programme, and we measured the degree to which the potential for false positive results diminishes their valuations. A total of 160 parents of a child or children under the age of 16 years were surveyed and given descriptions of the current screening programme in the UK, an extended programme and an extended programme with no false positives. Of these, 148 (92.5%) respondents said they would accept the screen for the five extra conditions in an expanded screening programme, whilst 10 (6.3%) said they would not and two were unsure. When asked to indicate if they would choose to be screened under an expanded screening programme with no false positive results, 152 (95%) said they would, five (3.1%) said they would not, two were unsure, and there was one non-response. In total, 151 (94.4%) said they preferred the hypothetical test with no false positives. The mean willingness to pay for the expanded programme was £178 compared to £219 for the hypothetical expanded programme without false positives (p < 0.05). The results suggest that there is widespread parental support for extended screening in the UK and that the number of false positives is a relatively small issue.

Introduction

With the development of multiplex technologies such as tandem mass spectrometry in the past two decades, the number of inherited metabolic diseases (IMDs) that can be detected at birth using a single dried blood spot sample has increased. This has led many countries to adopt screening tests for many of these disorders. The United States, for example, currently screens routinely for more than 50 diseases (Therrell and Adams 2007), and several European countries have recently extended their screening programmes to incorporate tests for more diseases (Bodamer et al. 2007). However, resistance to extended screening is apparent in several countries, including the United Kingdom (UK), where testing for five diseases remains the norm.

One of the main objections to the implementation of all screening programmes is the possibility of generating false positive results. When screening for metabolic disease, the time interval between the screening result and the report of the confirmatory test is typically limited to one to two weeks. Nevertheless, parental anxiety can be quite pronounced during this short period and this can have longer term effects. Several studies in the United States (US) have shown residual parental stress to increase by clinically significant amounts for several months following a false positive report (Waisbren et al. 2003; Hewlett and Waisbren 2006).

Whilst these results highlight that parental anxiety is an important consideration, this work has not allowed these adverse effects to be compared with the value and benefits of the intervention to parents, for example, reassurance. Furthermore, when considering the overall cost-effectiveness of any such expansion, it is difficult to factor comparative anxiety measures into the assessment.

However, economists have developed techniques for valuing the benefits of health care interventions that allow an overall measure of well-being to be assessed, and which can be used directly within economic evaluations. One such technique is the contingent valuation method (CVM), which measures the overall value to an individual by assessing their maximum willingness to pay for the intervention.

In essence, CVM uses personal expenditure as a measure of intensity of preference – the more we are willing to pay, the more we value the good that we are paying for (all other things being equal). Within a CVM survey, an ‘economic good’ is described (or in this case the proposed extension to the current screening programme) and respondents are then asked how much they would be willing to pay to access the proposed extension. Their value should incorporate all aspects of the extension that the respondent deems important, including possible stress associated with false positives.

We report here a CVM study which has been undertaken in the UK. It was designed so that the value of extended screening to include five additional inherited disorders could be measured, together with the value for the removal of false positives. As such, the study directly measures the benefits of the extension and the adverse effects to the parents. These values could then be set against the costs of the programme to see whether benefits exceed costs.

More specifically, the study answers the question “What value do parents put on the proposed extension to the neonatal screening programme, what are their reasons for their valuation, and what comparative importance is given to the potential for false positives?” Whilst the research question is not targeted at any specific tests/extensions, the context used within the study relates to the current provision of neonatal screening in the UK and the proposed extension. Currently within the UK, neonates are tested for phenylketonuria (PKU), congenital hypothyroidism, cystic fibrosis, medium chain acyl coenzyme A dehydrogenase deficiency and sickle cell disease, whilst an extension to include maple syrup urine disease, isovaleric acidaemia (IVA), pyridoxine unresponsive homocystinuria, long chain hydroxyacyl coenzyme A dehydrogenase deficiency and glutaric aciduria type 1 has been proposed. In addition, due to policy makers’ concern over false positives, we endeavoured to collect data on the reasons for participants refusing the extended screening programme.

Methods

The study is based on a survey of members of the general population. The survey aimed to interview 160 parents in the South Yorkshire region of the UK. Participants were sampled from socio-economically diverse areas of Sheffield in order to procure a sample that is broadly representative of parents of school-aged children. Whilst the process did not involve random sampling, previous studies of this approach within the same geographical area have been successful in gaining broadly representative samples (Shackley and Dixon 2000; Dixon and Shackley 2003).

Participants

Eligible participants were parents of a child or children under the age of 16 years and who were able to read and speak English to a degree that allowed completion of the interview. This was assessed by the potential participant after a description of the interview by the interviewer. Participants needed to be aged 18 or over.

Recruitment and consent

Potential participants were approached in their own homes in the days following the delivery of a courtesy letter informing residents in the area that interviewers would be visiting houses in the forthcoming days. The person answering the door was asked whether the parent of a child under the age of 16 was present within the home and whether they would be willing to talk to the interviewer about their possible involvement in a study. Once the potential participant was present, the interviewer described the study topic, the eligibility criteria and the nature of the interview. This was supported by an information sheet that the participant was able to keep regardless of their response, and which included contact details of the investigator.

The participant was then asked whether they were willing to participate within the study. Their verbal response was interpreted as their consent or otherwise to start the interview. The use of verbal consent was accepted by the Departmental Research Ethics Committee as sufficient for a low risk study of this nature.

Interview structure

The interview was in five sections. In section 1, a description was given of the current screening programme in terms of the conditions that are tested for, the likely impact on the child if positive for the identified conditions but not tested, the likely impact on the child if positive for the identified conditions and identified by the screen, and how common the conditions are. The impact of the false positive was described as, “Unfortunately, the screening test is not perfect and sometimes what first appears to be a positive result is actually negative. Because of this, all positive screening results are tested again using a more accurate test to identify whether they really are positive (known as a ‘true positive’) or actually negative (known as a ‘false positive’). The time from parents being told their screening result is positive to the results of the more accurate test being available is usually between 1 and 2 weeks”. The description was handed to the participant on a laminated A4 card and read out to them by the interviewer. On the reverse of the card was a table summarising the information (see columns 1 and 2 of Table 1 for the description of PKU testing used on the card). The participant was then asked to imagine that they were in a situation where they were offered screening, and whether they would choose to have it. Reasons for their response were collected as open text. The current programme was used to engage respondents with the topic and with the parameters used to describe the different disorders and test characteristics within the interview. Respondents were not asked their willingness to pay (WTP) for the current programme.
Table 1

Extracts of descriptions relating to individual tests within the current, anticipated and zero false positive screening programmes

| Aspect of the scheme | Extract from description of current scheme | Extract from description of extended scheme | Extract from description of extended scheme with zero false positives |
|---|---|---|---|
| Condition | PKU | IVA | IVA |
| Expected cases per 700,000 live births each year in England and Wales | 58 | 7 to 12 | 7 to 12 |
| Expected number of false positive cases | 10 | 10 | 0 rather than 10 |
| Outcomes if not detected early | Severe mental retardation | Poor feeding and lack of energy leading to coma in newborns. Delayed development and mental disability in childhood. | Poor feeding and lack of energy leading to coma in newborns. Delayed development and mental disability in childhood. |
| Treatment if detected early | Special diet and blood monitoring. | Special diet and drugs. | Special diet and drugs. |
| Outcomes if detected and treated early | Normal intelligence. | Normal growth and development, but some children still suffer delayed development. | Normal growth and development, but some children still suffer delayed development. |

In section 2, the additional tests that are being proposed in the UK were described using the same process as before (see columns 1 and 3 for the description of IVA testing used on the card). Descriptions of the interventions were piloted on members of the public to test readability.

The participant was then asked whether they would choose to have these additional tests. If they would choose to have them, they were asked to say how much they would be willing to pay to have them. Possible payments were shown to the respondents on individual cards, after being shuffled in front of the respondent to highlight their random ordering. The values of the cards ranged from a lower limit of £5 to an upper limit of £250. The upper limit was chosen to be well in excess of the costs of nuchal fold scans (an antenatal test that is frequently paid for privately and which may be used as a ‘reference price’) and the price of the additional screening tests in the United States (typically around $199 or £120). The upper limit was also in excess of ‘open’ responses elicited without reference to the cards from the piloting of the questionnaire.

Participants then moved on to either section 3 or section 4. Sections 3 and 4 were virtually identical, but with slightly different wording depending on whether participants in section 2 would want the additional tests or not. These sections described hypothetical tests for the same conditions as in section 2, but without any false positives being associated with them (see columns 1 and 4 for the description of IVA testing used on the card). Again, the participant was asked whether they would choose to have these tests. If they would choose to have them, they were asked to say how much, in addition to their previous amount, they would be willing to pay to have them.
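The two-step elicitation described above can be illustrated with a minimal Python sketch. It is not the study's instrument: the individual card values are hypothetical (the text gives only the £5 to £250 range), the rule of taking the highest accepted card is one common way to read a payment card response and is assumed here, and the function names are invented for illustration.

```python
import random

# Hypothetical card values: the study reports only the range (5 to 250 pounds).
CARD_VALUES = [5, 10, 20, 50, 100, 150, 200, 250]


def elicit_max_wtp(would_pay, cards=CARD_VALUES, seed=None):
    """Return the highest card value the respondent accepts (0 if none).

    Assumption: maximum WTP is read off as the largest accepted card.
    """
    deck = list(cards)
    random.Random(seed).shuffle(deck)  # cards are shown in a random order
    accepted = [value for value in deck if would_pay(value)]
    return max(accepted, default=0)


def total_wtp_no_false_positives(base_wtp, additional_wtp):
    """Sections 3 and 4 ask for an amount on top of the section 2 answer."""
    return base_wtp + additional_wtp


# Illustrative respondent who would pay any amount up to 120 pounds.
base = elicit_max_wtp(lambda amount: amount <= 120, seed=1)
total = total_wtp_no_false_positives(base, additional_wtp=25)
print(base, total)
```

Because the second question asks for an additional amount, the value of the programme without false positives in this sketch is the sum of the two answers rather than a separately elicited figure.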

Section 5 collected demographic information relating to the participant, as well as information on their previous decision(s) relating to the ‘heel prick’ test for their child(ren). The participant was then allowed to make any other comments relating to the interview, before being thanked for their help.

Outcome measures

The primary outcome measure for the study is the proportion of the sample that would choose to participate in the extended screening programme.

The secondary outcome measures are:
  i. The proportion of the sample that would choose an extended programme that had no false positives.
  ii. The mean willingness to pay for the extended programme.
  iii. The mean willingness to pay for the extended programme that had no false positives.
  iv. The reasons for the acceptance, or otherwise, of the tests.

The willingness to pay responses were to be summarised as means, and the difference between the extended programme and the extended programme that had no false positives was to be tested using a t-test if normally distributed, or a Mann-Whitney test otherwise.

The validity of responses was to be tested in two ways. Firstly, respondents should prefer the extended programme that had no false positives to the actual extended programme. Responses to this, and reasons for any contrary opinions were examined. Secondly, it is expected that WTP can be explained by respondent characteristics (e.g. income), and this was tested by multivariate analysis of WTP responses to the first WTP question.

Results

1374 courtesy letters were sent out in the areas where sampling took place. In order to achieve the target of 160 completed interviews, 254 eligible households were approached. Thus 94 eligible respondents refused to be interviewed, giving a response rate of 63%. The majority of respondents were female and for five interviews the interview was undertaken by a couple (Table 2). The median age of the sample was 36 years, with 73% either married or living with a partner and 56% having more than one child under the age of 16 years (Table 2).
Table 2

Characteristics of the sample interviewed

| Characteristic | Number | Percentage of sample |
|---|---|---|
| Gender | | |
| Male | 24 | 15.0 |
| Female | 130 | 81.2 |
| Couple | 5 | 3.1 |
| Missing | 1 | 0.6 |
| Age | | |
| 19-29 | 33 | 20.6 |
| 30-39 | 70 | 43.8 |
| 40 and over | 57 | 35.6 |
| Marital status | | |
| Married | 79 | 49.4 |
| Living with partner | 38 | 23.8 |
| Widowed/widower | 2 | 1.3 |
| Divorced/separated | 13 | 8.1 |
| Single/living alone | 27 | 16.9 |
| Missing | 1 | 0.6 |
| Number of children under 16 years old living with the parent | | |
| 0* | 2 | 1.3 |
| 1 | 69 | 43.1 |
| 2 | 70 | 43.8 |
| 3 | 14 | 8.8 |
| 4 | 5 | 3.1 |
| Annual income | | |
| Under £10,000 | 21 | 13.1 |
| £10,000 - £19,999 | 39 | 24.4 |
| £20,000 - £29,999 | 33 | 20.6 |
| £30,000 - £39,999 | 26 | 16.3 |
| £40,000 - £49,999 | 16 | 10.0 |
| £50,000 or more | 17 | 10.6 |
| Don’t know or missing | 8 | 5.0 |

* Two parents had children under the age of 16 that were not currently living with them

When asked to imagine they were in a position to be offered the current newborn screening, 158 (99%) respondents said they would choose to have it, with the remaining two being unsure. When asked to imagine their likely reaction to receiving a positive screening result, 25.5% indicated they would be extremely concerned, 35.6% indicated moderate concern and 16.9% indicated mild concern. The remaining respondents indicated they would not be concerned or were not clear about their reaction.

In response to being asked to indicate if they would choose to be screened for the five extra conditions in an expanded screening programme, 148 (92.5%) respondents said they would, 10 (6.3%) said they would not and two were unsure. Examination of the reasons given by those who would refuse revealed that six respondents cited the false positives as the key reason, three felt that false positives and/or the rarity of the condition were salient, whilst another respondent refused because they believed that the conditions would be detected anyway without any health effects on the child (Table 3).
Table 3

Reasons given for refusing the extended test

“Because I don't know or can't relate to these conditions and the false positive tests. To me, that is not clear or something I would not want to put myself through. The statistics are not clear enough - too many that could be wrong.”

“The conditions would not bother me because I would find out in the end.”

“Not enough clarity they all could be false. Would need more certainty. Have not heard of the conditions anyway.”

“I don't think the diseases are very common and I don't think they are in my family and therefore not necessary to me.”

“Don't seem to have refined the test enough.”

“The results are so poor you could get a positive reading and it could be wrong and nobody is perfect so you deal with it.”

“Things I have never heard of and taking into account the false positives you could be distressed and anxious for no reason.”

“If the false rate is that high it is a pointless test.”

“Given the statistics - how few would be shown up and the high rate of false positives, e.g. 7 out of 700,000 births - such small numbers wouldn't worry me.”

“I can see that the numbers are very small and the numbers of false positives are high so I would rather not put myself through it.”

When asked to state if they preferred the version of the expanded screening with no false positive results to the version with the possibility of false positives, 151 (94.4%) respondents said they did prefer it, seven (4.4%) said they did not, one was unsure and there was one non-response. Among the reasons given for not preferring the screening with no false positive results, four felt that having the subsequent definitive test would produce a more certain diagnosis, two felt that the screen would prepare them for potentially bad news, with the remaining respondent feeling that the results did not matter.

When asked to indicate if they would choose to be screened under an expanded screening programme with no false positive results, 152 (95%) said they would, five (3.1%) said they would not, two were unsure, and there was one non-response. Respondents refused because they either felt that the test was not worth it to them (four respondents) or they did not want to know if there was anything wrong with their child (one respondent).

Table 4 shows the mean WTP values for the two versions of the expanded screening programme (with and without false positives). The difference in means of £41.01 is statistically significant (95% CI of the difference £20.69 to £61.34), indicating a willingness on the part of respondents to pay more for the version of expanded screening with no false positives.
Table 4

Mean willingness to pay (WTP) for the screening programmes and difference in means

| Expanded screening | Mean WTP (£) | SE of mean | n | Difference in means (£) | 95% CI of difference (£) |
|---|---|---|---|---|---|
| With false positives | 177.65 | 24.12 | 143 | 41.01 | 20.69 to 61.34 |
| Without false positives | 218.66 | 26.30 | 143 | | |
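As a rough check on the figures in Table 4, the following Python sketch works backwards from the reported confidence interval. It assumes the interval was built as the difference plus or minus 1.96 standard errors (a normal approximation; the authors may have used a t-based multiplier), so the implied standard error and p-value are approximations rather than values reported in the study.

```python
import math

# Figures copied from Table 4 (pounds sterling).
mean_with_fp, se_with_fp = 177.65, 24.12
mean_without_fp, se_without_fp = 218.66, 26.30
ci_low, ci_high = 20.69, 61.34

diff = mean_without_fp - mean_with_fp

# Assumption: the interval was built as diff +/- 1.96 * SE.
se_diff = (ci_high - ci_low) / (2 * 1.96)
z = diff / se_diff
p_two_sided = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value

# For comparison only: the SE the difference would have if the two sets of
# responses came from independent samples rather than the same respondents.
se_if_independent = math.hypot(se_with_fp, se_without_fp)

# Share of the "no false positives" value retained by the anticipated programme.
ratio = mean_with_fp / mean_without_fp

print(f"difference in means: {diff:.2f}")
print(f"implied SE of difference: {se_diff:.2f}")
print(f"z statistic: {z:.2f}")
print(f"SE if samples were independent: {se_if_independent:.2f}")
print(f"ratio of means: {ratio:.0%}")
```

Under these assumptions the implied standard error of the difference is far smaller than the value two independent samples would give, which is consistent with each respondent providing both valuations. The ratio of the two means reproduces the 81% figure quoted in the Discussion.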

  
A multivariate analysis was undertaken to examine possible determinants of reported WTP. The responses had a considerable right skew and so were log-transformed, which produced a near-normal distribution. Two variables were identified as being statistically significantly related to log-WTP: gender and number of children under the age of 16 (Table 5). Female respondents were, on average, willing to pay 52% less than males, whilst the relationship with number of children under the age of 16 was less clear. The relationship with income was not clear, with only a single income band being significantly different from the lowest band.
Table 5

Results from the general linear model of log-transformed willingness to pay

| Independent variable | Category | n | Deviation from first category | p-value of deviation | p-value of variable |
|---|---|---|---|---|---|
| Intercept | - | - | - | - | <0.001 |
| Gender | Male | 22 | - | - | 0.016 |
| | Female | 127 | -0.725 | 0.004 | |
| | Couple | 5 | -0.537 | 0.297 | |
| Number of children under 16 years of age | 0 | 2 | - | - | 0.038 |
| | 1 | 66 | 1.335 | 0.070 | |
| | 2 | 67 | 1.359 | 0.067 | |
| | 3 | 14 | 0.729 | 0.337 | |
| | 4 | 5 | 2.090 | 0.018 | |
| Income | Under £10,000 | 20 | - | - | 0.350 |
| | £10,000-£19,999 | 39 | 0.557 | 0.040 | |
| | £20,000-£29,999 | 33 | 0.354 | 0.207 | |
| | £30,000-£39,999 | 25 | 0.267 | 0.383 | |
| | £40,000-£49,999 | 16 | 0.349 | 0.331 | |
| | £50,000 or more | 17 | 0.559 | 0.110 | |
| | Don’t know | 4 | 1.032 | 0.063 | |
| Interviewer | 1 | 44 | - | - | 0.827 |
| | 2 | 21 | 0.089 | 0.749 | |
| | 3 | 15 | -0.056 | 0.850 | |
| | 4 | 16 | 0.300 | 0.308 | |
| | 5 | 19 | -0.108 | 0.685 | |
| | 6 | 39 | 0.178 | 0.479 | |
| Paid for previous tests | Yes | 14 | - | - | 0.674 |
| | No | 140 | -0.128 | 0.674 | |
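Because the model in Table 5 is fitted to log-transformed WTP, each deviation can be read as a proportional change relative to the first category. The sketch below shows that conversion; it assumes a natural-log transformation, which the text does not state explicitly, and the helper name is illustrative.

```python
import math


def pct_change_from_log_coef(beta: float) -> float:
    """Percentage change in WTP implied by a coefficient on the log scale.

    Assumption: WTP was transformed with the natural logarithm.
    """
    return (math.exp(beta) - 1.0) * 100.0


# Deviations copied from Table 5 (relative to the first category).
coefficients = {
    "Female vs male": -0.725,
    "Couple vs male": -0.537,
    "Income £10,000-£19,999 vs under £10,000": 0.557,
}

for label, beta in coefficients.items():
    print(f"{label}: {pct_change_from_log_coef(beta):+.0f}%")
```

Under the natural-log assumption, the female coefficient of -0.725 corresponds to roughly 52% lower WTP than males, matching the figure quoted in the Results.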

Discussion

Our finding that only 10 of 160 respondents (6.3%) stated that they would choose not to have the expanded screening indicates that there was a high level of support for the screening in our sample, even with the possibility of false positive results. The mean value placed on this intervention by the respondents was high and exceeds the economic cost of service provision by at least an order of magnitude. Unsurprisingly, most respondents (94.4%) stated that they would prefer a version of the expanded screening with no false positives to that with false positives if this were possible, thus indicating an aversion within our sample to the possibility of false positive results. The extent of this aversion is indicated by the magnitude of the WTP responses. The WTP increment to avoid false positives was modest when compared with the value placed by the respondents on the availability of an expanded screening programme. The mean WTP for the test with the anticipated rate of false positives was 81% of the mean for the hypothetical test with no false positives.

It is reasonable to conclude from these findings that the possibility of false positive results should not be a barrier to screening but that these results should be avoided where possible and should be carefully managed where this is inescapable. Recent reports support the view that careful communication by well informed practitioners can reduce the negative impact on parents of equivocal results or carrier status for inherited disorders following newborn screening (Kai et al. 2009). Well organised services with access to rapid confirmatory testing and the availability of specialist practitioners are likely to reduce the modest but significant potential negative impact of false positive results arising from screening on affected families.

The impact of false positives on parents has been investigated in several other studies, typically through self-reported quality of life measurement (Waisbren et al. 2003; Hewlett and Waisbren 2006). Such studies can suffer from one or more drawbacks when trying to ascertain whether the screening is worthwhile. Firstly, the impact is limited to the domains of health within the chosen questionnaires; reassurance through the provision of information may not be captured by some measures. Secondly, they do not include negative effects of not having the screen, for example, regret if the child is found to have a particular disorder in the future. Thirdly, it is not possible to combine the quality of life measure with the costs of the scheme to produce an assessment of value for money in a cost-benefit analysis.

One previous study has examined parental willingness to pay to avoid false positives and estimated the mean value to be $159 in parents who had experienced a false positive screen in a child and $544 in parents who had not experienced a false positive (Prosser et al. 2008). Our study differs from this in several important ways. Most importantly, it directly answers the policy question of whether a specific package of additional tests (which includes benefits and risks) is supported by parents, as opposed to whether a false positive in its own right has a significant value. There are also differences in the way in which the WTP values were elicited. Our study was designed using a validated payment card approach (Ryan et al. 2004), includes internal tests of validity and used face-to-face interviews with the latter being a key recommendation of good practice guidelines (Arrow et al. 1993).

Importantly, however, the difference in values between parents reported by Prosser and colleagues suggests that anticipated stress associated with a false-positive is perhaps greater than that which actually occurs.

Our study uses a different approach that can ameliorate some of these problems. A CVM survey describes the process and outcomes of the screen, then allows respondents to factor into their response any issue that they feel is important to them. This means that any effect is not automatically limited to a specific set of health domains, nor excludes negative effects associated with not having a screen. It also makes an evaluation of value for money much easier, as the benefits of the screen are measured in the same units as its cost and so can be directly compared.

An additional benefit of this approach is that it does not need to be limited to parents who have a child that is a positive or a false positive, both of which are very rare. The key factor to make an informed assessment of screening, we would argue, is that the respondents are parents and as such understand the emotional bond with a child and the rigors of decision making on their behalf. This belief was the rationale for the sampling frame being restricted to parents with a child under the age of 16. The extent to which this inclusion criterion may distort our conclusions is not known.

However, there are a number of possible drawbacks to this approach. Firstly, the validity of the WTP technique has been called into question by some (e.g. Cookson 2003), with one particular problem being the insensitivity of responses to the size of the health effect described (Olsen et al. 2004). The results of this study, however, show that WTP is related to the size of the health effect as measured by the false-positive rate. Despite passing this test of validity, sceptics may argue that the absolute WTP estimates may be unreliable. However, the relative size of the estimates for the proposed programme and the hypothetical ‘no false-positives’ programme provides added assurance that parents consider the benefits of screening to far outweigh its drawbacks. This test of validity was missing from the WTP estimates reported previously (Prosser et al. 2008). Secondly, it is usually recommended that the cost-effectiveness of health care interventions should be measured using quality adjusted life years (Gold et al. 1996; National Institute for Health and Clinical Excellence 2008). The quality adjusted life year (QALY) approach is much more focused on health effects and so we would argue that its merits are somewhat diminished when looking at a single aspect of process that would be difficult to capture by the quality of life instruments used to generate QALYs.

Finally, we must recognise that the way in which the scenarios were presented, and their content, will affect the results. Whilst we took care to write the scenarios in a neutral manner and capture all relevant issues within our descriptions, we cannot rule out the possibility that we have introduced a bias. This is difficult to overcome as the presentation of more information can result in cognitive overload, with participants then using mental ‘shortcuts’ to make imprecise judgements (Lloyd 2003).

This study question, or anything similar, has not been answered previously. Several US studies have examined the health-related quality of life (QoL) impacts of extended screening; however, the US extended programme is quite different from that being proposed in the UK. Additionally, these studies did not measure the value of extended screening in a way that can be directly used to assess whether it is worthwhile. CVM allows us to assess whether there is likely to be a positive impact overall once benefits and disbenefits are aggregated. The survey can also produce information that will be useful to decision makers, such as reasons for supporting or opposing the extended screening programme, if the WTP results are considered problematic.

From a policy perspective, we must recognise that WTP rarely plays a significant role in publicly funded health care. Whilst the policy goal of maximising health is more important to policy formulation and evaluation, it is increasingly recognised that well-designed studies can provide valuable information relating to specific non-health issues (Ryan and Farrar 2000). With complex policy questions, multiple issues are evaluated and then used in the policy making process in a deliberative manner. We feel that this study provides a valuable contribution to the policy debate surrounding neonatal screening in the UK. However, whilst the results suggest that there is widespread parental support for extended screening and that the number of false positives is a relatively small issue, we would recommend that an ex post evaluation of these effects be part of any further study of extended screening.

Funding

The research was funded by the South Yorkshire Collaboration for Leadership in Applied Health Research and Care (CLAHRC), which in turn is funded by the National Institute for Health Research (NIHR). The authors confirm independence from the sponsors; the content of the article has not been influenced by the sponsors.

Copyright information

© SSIEM and Springer 2011