Why is it important to get the diagnosis of UTIs right?

We need to diagnose urinary tract infections quickly

Urinary tract infections (UTIs) are common in childhood and cause a considerable health burden. Though many children have mild symptoms and are easily treated, some present severely unwell with urosepsis. UTIs may be associated with renal scarring and, in severe cases, with hypertension and renal impairment. They should be treated promptly to relieve symptoms, and ideally within three days in infants < 2 years of age to reduce their risk of developing permanent kidney scars [1, 2]. Virtually all authors, including the American Academy of Pediatrics (AAP) [3, 4] and the UK National Institute for Health and Clinical Excellence (NICE) [5], emphasise the importance of rapid antibiotic treatment in the very young to reduce sequelae.

And we need to diagnose them accurately

Whilst the case for identifying children with UTIs is clear, it is also important not to overdiagnose them. Not every child with dysuria has a UTI; some may simply have vulvitis or balanitis, and others may have a febrile illness and a poor fluid intake resulting in them passing concentrated urine that stings. Diagnosing these children as having UTIs will lead to the misuse of antibiotics and may result in children having unnecessary investigations, which is burdensome and wasteful.

Why can diagnosing UTIs be difficult?

Making any clinical diagnosis involves a complex integration of information, including the child’s prior clinical history, known risk factors such as age and sex, and the probability of a particular symptom or symptom complexes being caused by a particular illness, as well as the results of laboratory tests. Indeed, the power with which a positive or negative laboratory test result can rule in or rule out a diagnosis depends on its prior probability as well as the test’s sensitivity and specificity, as illustrated by leaf plots [6]. False diagnoses can only be confidently ruled out by tests that are highly sensitive, and true diagnoses can only be confidently confirmed by tests that are highly specific. An ideal test is one which can be calibrated to be both highly sensitive and highly specific.
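The way a test result's diagnostic power depends on prior probability can be made concrete with Bayes' theorem, which is the calculation that a leaf plot draws across all priors. The following is a minimal sketch; the function name and the example values (prior 0.10, sensitivity 0.90, specificity 0.80) are illustrative assumptions and are not taken from any study discussed here.

```python
def post_test_probability(prior, sensitivity, specificity, positive):
    """Probability of disease after a test result, by Bayes' theorem.

    Inputs are probabilities between 0 and 1. Degenerate inputs that make
    the denominator zero (for example prior = 0 with specificity = 1 and a
    positive result) are not handled in this sketch.
    """
    if positive:
        true_pos = sensitivity * prior
        false_pos = (1 - specificity) * (1 - prior)
        return true_pos / (true_pos + false_pos)
    false_neg = (1 - sensitivity) * prior
    true_neg = specificity * (1 - prior)
    return false_neg / (false_neg + true_neg)


# Hypothetical example values, chosen only for illustration
after_positive = post_test_probability(0.10, 0.90, 0.80, positive=True)
after_negative = post_test_probability(0.10, 0.90, 0.80, positive=False)
print(f"after a positive test: {after_positive:.3f}")
print(f"after a negative test: {after_negative:.3f}")
```

Evaluating the function over a range of priors from 0 to 1 would give the two curves of a leaf plot, one for positive and one for negative results.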

The possible urine tests for UTIs can be divided into those that detect the presence of bacteria themselves and those which detect associated changes, such as increased white blood cell (WBC) numbers or nitrite concentrations. Unfortunately, it has always been known that urinary WBC numbers have low diagnostic power [7]; they are neither sufficiently sensitive (they may disappear rapidly [8]) nor specific (they are often present in many children with fever from other causes [9]), and recently it has been shown that urine nitrite sticks miss 77% of UTIs in infants [10]. We are therefore left dependent upon identifying bacteria in urine to make a reliable laboratory diagnosis of UTI in children. First, I will deal with urine culture and then with point-of-care microscopy.

How to create a robust laboratory definition of UTI by bacterial culture?

The primary problem of contamination

There would be no difficulty in diagnosing UTIs by urine culture if uropathogens were rarely found elsewhere and if urine was a fastidious culture medium because then any bacterial colonies identified could be assumed to be pathological. However, this is far from the case, and it is only too common for urine from uninfected children to become contaminated with peri-urethral or skin bacteria. Kass introduced diagnostic bacterial quantification 60 years ago as a strategy to distinguish contamination from urine infection [11, 12], but the best ways to collect children’s urine samples and interpret their culture results remain controversial.

How to define sterile urine and rule out a UTI?

It seems obvious that if a urine sample from a child who is not taking an antibiotic does not grow any uropathogenic bacteria (including coliforms, Escherichia coli, Enterococcus, Proteus, or Klebsiella), then it can be concluded that they do not have a UTI so long as the collection was not made using inappropriate skin antisepsis. In standard clinical laboratory practice, 1 μl of urine is inoculated onto a petri dish, so “sterile” urine with no bacterial growth will be reported if the urine contains < 10³ colony forming units (cfu) per millilitre.

It follows that if a child has two or more urine samples collected in immediate sequence and at least one of them was sterile, they do not have a UTI. It further follows that if another urine did grow bacteria, either the child had developed a UTI at that very instant (extremely unlikely) or the specimen was contaminated. Therefore, in publications which report discordant serial cultures, I conclude that one sterile sample excludes a UTI, and the positives indicate contamination regardless of the bacterial count, species, or urine collection method. Similarly, two serially collected urine samples containing different species are evidence that both were contaminated.

Many authors have interpreted data without using this logical and consistent approach and have drawn illogical conclusions according to their prior expectations. For example, there are six studies of children where paired samples collected by suprapubic aspiration (SPA) and clean voiding have shown discordant results in both directions, where logic indicates that some of the SPA urines must have become lightly contaminated. However, in each case, the authors (who appear to believe that SPA samples cannot become contaminated) have argued that the pairs with void-positive/SPA-negative results demonstrate urethral contamination, whilst SPA-positive/void-negative pairs indicated that the “UTI had been missed” in the urethral specimen [13,14,15,16,17,18]. Similarly, when only the first flow from a urethral catheter sample (where the authors expected contamination might occur) grew organisms, it was called contamination, but when only the subsequent stream was culture-positive, it was assumed that the first flow sterility was due to “non-uniform bacterial excretion in different urine phases” or the “potential randomness of growth of some bacteria” [19].

How to define a highly probable UTI?

Because it is possible for apparently scrupulously collected urine samples from children to become contaminated, there is no scientific way to determine whether any particular single specimen that grows a pure uropathogen indicates a UTI or contamination. However, the probability that it is a genuine UTI increases sharply if exactly the same result is found in a second sample, because the chances of a random false-positive error will be reduced by its square, for example, from 10 to 1% or from 5 to 0.25%. Although it is clearly not possible to reach 100% certainty, imposing a diagnostic criterion of ≥2 identical cultures from published reports greatly increases the confidence (in these examples to 99% or 99.75%) of a genuine UTI, so I have used this “gold standard” method to analyse the literature and have excluded less rigorous studies.
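The squaring argument can be written out explicitly. This is a sketch that assumes the two samples are contaminated independently, each with the same probability p of yielding a false-positive pure growth; the symbol p is introduced here purely for illustration.

```latex
\[
P(\text{both samples falsely positive}) = p \times p = p^{2},
\qquad
0.10^{2} = 0.01 \;(1\%),
\qquad
0.05^{2} = 0.0025 \;(0.25\%).
\]
```

The corresponding confidence in a genuine UTI is then 1 − p², which gives the 99% and 99.75% figures quoted above.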

How to determine the best diagnostic threshold for childhood UTIs?

By accepting two identical positive cultures as evidence of a UTI and rejecting all other results, it is possible to independently determine the diagnostic sensitivity and specificity for a range of possible colony count thresholds and to see if this varies with the urine collection method. This approach avoids the bias inherent in most studies where each child has a single sample taken and a particular culture cut-off is pre-determined, and its validity is then judged by seeing how well it seems to separate children according to the clinician’s prior assessment (or worse, where different pre-determined thresholds are allocated to different urine collection methods).

It is important to recognise that routine quantitative culture testing has a limited range which does not reach the true bacterial numbers in most UTIs, which was shown to be between 10⁷ and 10⁸ bacteria/ml in adults by Kass [11]. This means that a typical adult sample bottle will contain around one billion organisms and that 1 μl of urine inoculated onto a petri dish will contain approximately 50,000 viable organisms, where the nutrient agar is only sufficient to sustain about 500 colonies, so only about 1% will grow. Thus, without a pre-culture dilution step, it is only possible to measure bacterial concentrations between 10³/ml and about 5 × 10⁵/ml (usually reported as ≥ 10⁵ cfu/ml). Kass warned that contamination during voiding frequently produced urinary colony counts of ≥ 10⁵/ml [11, 12] and would therefore generate false-positive results in adults. Some of the published paediatric studies did use pre-dilution to detect high bacterial concentrations, and some inoculated larger urine volumes to identify very low counts.
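The arithmetic behind the limited counting range can be reproduced in a few lines. This is a sketch only: the concentration of 5 × 10⁷ organisms/ml is an assumption inferred from the figure of 50,000 organisms per microlitre, and the variable names are illustrative.

```python
# Values quoted or implied in the text above
ORGANISMS_PER_ML = 5 * 10**7  # assumed; lies within the 10^7 to 10^8 per ml range
INOCULUM_UL = 1               # microlitres of urine plated
PLATE_CAPACITY = 500          # colonies that one plate can sustain
UL_PER_ML = 1000

# Organisms delivered to the plate, and the share that can form colonies
organisms_plated = ORGANISMS_PER_ML * INOCULUM_UL // UL_PER_ML
fraction_growing = PLATE_CAPACITY / organisms_plated

# One colony from the inoculum marks the lowest measurable concentration;
# a full plate marks the highest measurable without prior dilution.
lower_limit_cfu_per_ml = 1 * UL_PER_ML // INOCULUM_UL
upper_limit_cfu_per_ml = PLATE_CAPACITY * UL_PER_ML // INOCULUM_UL

print(f"organisms plated: {organisms_plated}")
print(f"fraction able to grow: {fraction_growing:.0%}")
print(f"measurable range: {lower_limit_cfu_per_ml} to {upper_limit_cfu_per_ml} cfu/ml")
```

Changing INOCULUM_UL to 100 shows why studies that plated larger volumes could report counts below 10³ cfu/ml.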

Performing meta-analyses to determine the best colony count UTI diagnostic thresholds for voided, SPA, and catheter urine samples

Collecting the material

I searched for papers where children had paired urine samples cultured quantitatively by reviewing those referenced in the AAP and NICE childhood UTI guidelines [3,4,5], by undertaking a MEDLINE (1946 to March 2019) search for “urinary tract infection or bacteriuria” in children (≤ 18 years) and by following up earlier papers not included in computerised databases. I excluded children already known to have structural urinary tract abnormalities and included studies which compared urine collection methods in healthy individuals as well as those with a clinical suspicion of UTI. I also reanalysed the data our group has previously published on bacterial quantitation [20] to look for sex or age effects.

Analysis methods

I used the urine culture “gold standard” described to define children with a UTI as having ≥ 2 urine cultures with the same pure uropathogen and those without a UTI if they had at least one urine culture with < 10³ cfu/ml. I have plotted colony counts on log₁₀ axes, used <> symbols to denote concentrations only reported in ranges, and compared their geometric means using unpaired t tests. The standard clinical colony count ranges of 10³ to ≥ 10⁵ cfu/ml are shaded in coded colours, higher values obtained by prior dilution are shown in grey, and counts of < 10³ cfu/ml produced by culturing 100 μl of urine are shown below the main plots. I used either the χ² or Fisher’s exact test to test for differences in the pooled data, according to the dataset size. I plotted leaf plots to determine the effects of their sensitivity and specificity data on the predictive values of positive or negative test results for all levels of pre-test probability [6].

Meta-analysis 1: voided urines

High-quality quantitative data for voided urines

There were 13 valid papers [13, 20,21,22,23,24,25,26,27,28,29,30,31] including one with two sub-studies [23], giving 14 reports on 1270 children, of whom 106 had UTIs and 1164 did not (Table 1). Most groups studied children suspected of having UTIs, one collected a clean catch sample before catheterising children for micturating cystograms [31], one screened for UTIs in healthy school children [22], and two compared different voided collection techniques in healthy children in hospital [30] or at home [29]. Some studies were rejected as they had insufficiently detailed data [14,15,16,17,18, 32,33,34].

Children with UTIs

The quantitative culture results for voided urines are plotted on salmon pink columns in Fig. 1, with children with UTIs shown by red diamonds. The three right-hand plots show their total study data because these groups only collected voided sample pairs. In the other studies, one sample was voided and its pair was collected by either SPA or urethral catheter as indicated below, and these results are plotted in Fig. 2. The urine colony counts for children with UTIs were similar regardless of the urine collection method, so I have analysed them together. Of the 212 infected urines, 210 (99%) had colony counts of ≥ 10⁵ cfu/ml and two were in the range 10⁴–10⁵ cfu/ml, which gave a sensitivity of 1.00 at a threshold of 10⁴ cfu/ml and of 0.99 at 10⁵ cfu/ml. In the 150 samples where higher concentrations could be counted after dilution, the range was much higher, giving a sensitivity of 0.93 at a threshold of 10⁶ cfu/ml and 0.58 at 10⁷ cfu/ml. The two children whose first samples contained 10⁴–10⁵ cfu/ml both had > 10⁷/ml in their subsequent specimens [23], consistent with them both developing UTIs. A reanalysis of our previously published colony count data [20] showed that the bacterial concentrations were unaffected by sex or age, comparing infants < 2 years, children aged 2 to 9 years, and older children.

Fig. 1
Concentrations of uropathogens cultured from voided urines, plotted on a log scale. Red symbols = both urines grew the same organism (UTI). Blue circles = sample grew a uropathogen, but its pair was sterile (contaminant). Yellow squares = mixed bacteria. The salmon boxes indicate standard laboratory culture count limits, and the grey boxes show the extended limits in that laboratory. The <> symbols indicate data that were reported in concentration ranges (e.g., > 10⁵/ml or 10³–10⁴/ml). When more than 10 data points fall together, the number is indicated below. The number of sterile urines is shown in a box at the bottom. The starred data point was the first sample from a patient whose subsequent colony counts were > 10⁷ cfu/ml. The technique used to collect the paired urine for each study is indicated at the bottom

Fig. 2
Concentrations of uropathogens cultured by SPA (blue columns) or catheter sampling (green columns). Red symbols = both urines grew the same organism (UTI). Blue circles = sample grew a uropathogen, but its pair was sterile (contaminant). Yellow squares = mixed bacteria. The blue and green boxes indicate standard laboratory culture count limits, and the grey boxes show the extended limits in that laboratory. The <> symbols indicate data that were reported in concentration ranges (e.g., > 10⁵/ml or 10³–10⁴/ml). For clarity, the number of data points is indicated below in some cases. The number of sterile urines is shown in boxes. The starred data point was the first sample from a patient whose subsequent colony counts were > 10⁷ cfu/ml. The technique used to collect the paired urine for each study is indicated at the bottom (mostly voided)

Children without UTIs

Most of the 1583 voided urine samples from children without UTIs were sterile or had mixed organisms, but 303 (19.1%) were contaminated with a single uropathogen (Fig. 1, blue circles). These would be indistinguishable from genuine UTIs if just one sample was collected per child, giving a specificity of 0.809. Many had high counts, giving the following false-positive rates (specificities): 13.8% (0.862) at 10⁴ cfu/ml, 9.3% (0.907) at 10⁵ cfu/ml, and 1.0% (0.990) at 10⁶ cfu/ml. Urines contaminated with mixed bacterial species including a uropathogen (Fig. 1, yellow squares) had similar colony counts.
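The headline sensitivity and specificity figures follow directly from the pooled counts reported above. The short sketch below recomputes them; the variable names are illustrative, and only the counts stated in the text are used (the per-threshold false-positive counts are not given, so those rates are not recomputed).

```python
# Pooled counts reported in the text
infected_urines = 212
infected_at_or_above_1e5 = 210
infected_at_or_above_1e4 = 212  # the other two were in the 10^4 to 10^5 range

uninfected_voided_urines = 1583
uninfected_single_uropathogen = 303

# Sensitivity: share of infected urines at or above each threshold
sensitivity_1e4 = infected_at_or_above_1e4 / infected_urines
sensitivity_1e5 = infected_at_or_above_1e5 / infected_urines

# Specificity if any pure growth of a uropathogen were called a UTI
false_positive_rate = uninfected_single_uropathogen / uninfected_voided_urines
specificity_any_growth = 1 - false_positive_rate

print(f"sensitivity at 10^4: {sensitivity_1e4:.2f}")
print(f"sensitivity at 10^5: {sensitivity_1e5:.2f}")
print(f"false-positive rate: {false_positive_rate:.1%}")
print(f"specificity: {specificity_any_growth:.3f}")
```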

Defining the best clinical diagnostic threshold for a single voided urine sample

Setting an ideal diagnostic threshold depends upon the clinical circumstances and typically involves compromise: higher cut-offs risk missing genuine UTIs, whilst lower ones risk making false diagnoses in children with contaminated samples. In addition, the diagnostic impact of test results is greatly influenced by the clinical probability that the child had a UTI in the first place. These judgements are made easier to visualise by using a leaf plot [6] to compare the impacts of using different thresholds at all levels of prior diagnostic probabilities. Figure 3 shows that at a diagnostic threshold of 10³ or 10⁴ cfu/ml, a negative result completely excludes a UTI at all levels of clinical suspicion, but a positive one leaves a lot of doubt, especially when the child’s clinical circumstances are not compelling and overdiagnosis will be common. Raising the threshold to 10⁵ cfu/ml reduces the false-positive rate a little, but increases the risk of missing genuine UTIs among children whose clinical likelihood of having one is considered fairly high.

Fig. 3
Leaf plot showing the diagnostic values of using colony count thresholds between 10³ and 10⁶ cfu/ml on a child’s single voided urine culture result (calculated from the sensitivity and specificity data shown). The vertical increase in height of the red line above the diagonal (vein of the “leaf”) shows the impact that a positive culture result has on the probability of that child having a UTI. The vertical drop from the “vein” to the blue line indicates the effect of a negative urine culture on excluding a UTI

Raising the cut-off to 10⁶ cfu/ml would dramatically increase the power to rule in UTIs at all levels of prior probability, though it would also further increase the risk of missing genuine cases compared with using a lower threshold. However, this option would require modification of standard laboratory methods, so the best available cut-off to deliver diagnostic balance is ≥ 10⁵ cfu/ml, which delivers a sensitivity of 0.99 and a specificity of 0.907. Routinely collecting two samples would improve these to 1.000 and 0.991, respectively, which produces a marked increase in diagnostic efficacy as shown by comparing their leaf plots (Fig. 4, top row), but would cause considerable extra inconvenience in clinical practice. This analysis indicates that lower targets substantially worsen the false-positive diagnosis rates and rejects the notion that they would be preferable [4, 5, 35].

Fig. 4
Leaf plots comparing the efficacy of various urine testing strategies at all levels of pre-test probability (see Fig. 3 for guidance to interpretation). Top row, clean-voided urine collections of one or two samples. Second row, SPA and catheter collection methods. Third row, comparing the usefulness of nitrite stick testing in infants < 2 years and older children. Bottom row, phase-contrast microscopy screening of one or two samples

Meta-analysis 2: SPA sampling

Background to SPA usage

Low-level bacterial contamination was recognised to occur from the first use of SPA sampling in women [36] and children [16, 37], so colony count thresholds were originally set above 10³ cfu/ml to avoid false-positive results. Despite this, Hellerstein cited these publications [38] as if they supported the notion that “any growth” of Gram-negative bacilli from an SPA in a child indicates a UTI, and his “table of recommendations” is the “evidence” that the AAP and NICE guidelines use to justify including this policy in their guidelines [3, 5].

High-quality quantitative data for SPA urines

Twelve studies [13,14,15,16,17, 25,26,27,28, 32, 33, 37] included 709 children, of whom 114 had UTIs and 595 did not (Table 1), but half of these only reported how many cases exceeded various pre-determined colony counts, leaving six that provided sufficient quantitative data for full analysis [13, 25,26,27,28, 37]. Nearly half the subjects and two-thirds of those with UTIs were aged < 2 years. Two groups only enrolled infant boys [25, 33].

Table 1 Paediatric studies of paired-urine cultures that could be used for three meta-analyses of voided, suprapubic (SPA), and catheter urine collection methods, presented in date order

Children with UTIs

All 23 children with UTIs who had quantitative SPA data [13, 25,26,27,28, 37] had colony counts ≥ 10⁵ cfu/ml (Fig. 2), and three of the five children tested for higher bacterial concentrations had counts ≥ 10⁶ cfu/ml.

Children without UTIs

Most of the 594 children without UTIs had sterile urines, despite some using detection limits of just 10 [13, 25, 37] or 100 cfu/ml [16, 28, 33]. Unlike voided urines, there were no cases with high-concentration contamination, but 33 (5.5%) specimens had small numbers (< 2000/ml) of single uropathogens [13, 17, 25, 32, 37]. The SPA specificity is therefore 1.00 at a 10⁵/ml threshold, but would fall and produce false positives by AAP and NICE criteria (Figs. 1 and 2).

How could SPA samples become contaminated?

The fact that blood cultures are commonly lightly contaminated [39] suggests that skin bacteria may be carried forward on the tip of a sampling needle, and the fact that the flora that contaminated the SPA samples were typically skin commensals [15, 25, 37, 40] and faecal organisms [15, 17, 40] which colonise the nappy area suggests that they may have become contaminated in the same way. Estimates of the bacterial numbers that could fit onto the leading edge of a needle tip are consistent with the reported urine colony counts.

Defining the best clinical diagnostic threshold for SPAs

SPA samples are 99% sensitive and 100% specific for diagnosing UTIs at the same 10⁵ cfu/ml threshold that is the best for voided urines, so it is sensible to use the same values. Choosing to perform a single SPA would therefore avoid the approximately 10% false-positive risk seen with voided urines and almost guarantees to correctly diagnose a UTI, as shown in a leaf plot (Fig. 4).

It is unclear how the assertion that “any growth” in SPA urine (≥ 10³ cfu/ml in standard laboratories) is diagnostic of a UTI was adopted without any scientific basis. This concept has been sustained by studies with biased designs and illogical analytic methods [13,14,15,16,17] and continues to enjoy wide support [3,4,5, 35, 38].

Meta-analysis 3: urethral catheter sampling

Background to urethral catheter sampling

As with SPAs, the AAP recommends urethral catheterisation to avoid bacterial contamination [3], but in this case they advise using a 10⁵ cfu/ml threshold of a single uropathogen to diagnose UTIs [38]. Four other thresholds have been advocated on little evidence, namely 5 × 10⁴ cfu/ml [19], 10⁴ cfu/ml [19, 34, 41], 10³ cfu/ml [18, 35, 42,43,44,45], and 10 cfu/ml [46]. In addition, the AAP now also endorses discarding the first drops of catheter urine [4] with little evidence.

High-quality quantitative data for catheter urines

Six eligible publications [21,22,23,24, 31, 37] containing seven studies enrolled 508 children (Table 1), and all but one [22] cultured the “mid-stream” urine flow after discarding the first few millilitres. I rejected 15 studies because they introduced a selection bias [41,42,43,44, 46,47,48], used discrepant culture thresholds for catheter and non-catheter samples without providing quantitative data [18, 34, 43, 46, 49,50,51], or provided incomplete data [52].

Children with UTIs

Forty-nine of the 50 children with UTIs grew ≥ 10⁵ cfu/ml (Fig. 2); the one child whose first catheter sample grew 10⁴–10⁵ cfu/ml and next voided sample grew > 10⁷/ml has already been described. As with SPA and voided urines, most catheter samples tested for higher bacterial concentration ranges had counts > 10⁶ cfu/ml.

Children without UTIs

The catheter urines from the uninfected children grew fewer contaminants than the voided specimens (Fig. 1), but more than the SPA samples (Fig. 2). The specificity for 473 specimens from 458 children was 0.829 at a threshold of 10² cfu/ml, 0.960 at 10³ cfu/ml, 0.983 at 10⁴ cfu/ml, and 0.998 at 10⁵ cfu/ml.

Should the initial catheter urine stream be discarded?

Two groups showed that the first drops of catheter urine were more likely to be contaminated than the subsequent stream [37, 53], but these cultures were clinically irrelevant skin commensals and very light growths of uropathogens (all < 10⁴/ml). A more recent study that recommends discarding the first flow of catheter urine had multiple shortcomings including only reporting culture positivity rates at fixed threshold ranges and using biased analysis [19].

Defining the best clinical diagnostic threshold for catheter samples

For catheter samples (collected without discarding the first flow of urine), the threshold of 10⁵ cfu/ml delivers 99% sensitivity and 99.8% specificity, so again it would be sensible to use the same diagnostic cut-off for all sampling methods. The leaf plot in Fig. 4 shows that a single catheter sample is marginally better at ruling UTIs in than collecting two voided samples and marginally worse than using an SPA, but no better at ruling them out than a single voided urine sample.
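One compact way to compare the ruling-in and ruling-out power of the sampling strategies is through likelihood ratios. These are a standard summary computed from sensitivity and specificity, not the leaf-plot method used in this review, and the sketch below is offered only as an illustration. The sensitivity and specificity pairs are those quoted in the text at the 10⁵ cfu/ml threshold; the function and dictionary names are assumptions.

```python
import math


def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-); LR+ is reported as infinite when specificity is 1."""
    false_positive_rate = 1 - specificity
    if false_positive_rate == 0:
        lr_positive = math.inf
    else:
        lr_positive = sensitivity / false_positive_rate
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


# (sensitivity, specificity) pairs quoted in the text
strategies = {
    "single voided": (0.99, 0.907),
    "two voided": (1.000, 0.991),
    "single catheter": (0.99, 0.998),
    "single SPA": (0.99, 1.00),
}

ratios = {name: likelihood_ratios(*values) for name, values in strategies.items()}
for name, (lr_positive, lr_negative) in ratios.items():
    print(f"{name:16s} LR+ = {lr_positive:8.1f}   LR- = {lr_negative:.4f}")
```

A larger LR+ means a positive culture raises the probability of UTI more strongly, and an LR- closer to zero means a negative culture excludes it more firmly. The ordering of LR+ reproduces the ranking described above: SPA, then catheter, then two voided samples, then a single voided sample.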

Conclusions about which diagnostic colony count to use

Sensitivity

When I only analysed high-quality paired-sample publications without making prior assumptions, rather than relying on single-sample cultures in case series that also relied upon clinical opinion, a clear picture emerged that children with UTIs have similarly high concentrations of bacteria in their urine as Kass showed for adults over 60 years ago [11, 12]. Most had between 10⁵ and 10⁸ bacteria per ml, which translates to ≥ 10⁵ cfu/ml in standard laboratory reports. In over 200 reported high-quality childhood UTI cases, only two children had lower values (10⁴–10⁵), which subsequently rose to 10⁷/ml, giving a sensitivity of 0.99 regardless of the urine sampling method used. This study shows that adopting a universal diagnostic threshold of 10⁵ cfu/ml will correctly diagnose almost all genuine UTIs and that there is no evidence to support the need for using lower thresholds under any circumstances, such as for SPAs [3,4,5].

Specificity

The false-positive rate of childhood urine culture is markedly dependent upon the collection technique used, with the specificity being lowest for single voided urine samples, making it more difficult to confidently rule out UTIs and thereby potentially leading to children being treated and investigated unnecessarily. Leaf plots demonstrate that trying to increase the specificity by raising the diagnostic colony count above 10⁵/ml will lead to children with genuine UTIs being missed. How else can the false-positive rate be reduced? We have shown that this can be achieved by using more invasive urine collection methods or by collecting paired samples; a third approach could be by integrating point-of-care diagnostic screening. Here, I consider these in turn.

Invasive urine collection

Both SPA and catheter samples (without any need to discard the first drops) are relatively easy to collect in hospital settings and have excellent specificity when a diagnostic threshold of 10⁵ cfu/ml is used, at 1.00 and 0.998, respectively. Hence, meaningful answers can be almost guaranteed from a single invasively collected specimen (Fig. 4). These methods allow samples to be obtained promptly so long as the child has some urine present in the bladder, which may be important in ill babies who need an urgent infection screen before antibiotic therapy is commenced. Urethral catheterisation can be challenging in very low–birthweight babies, but using firmer feeding tubes or umbilical arterial lines may be helpful [40, 47], and SPAs are often hard to obtain without ultrasound guidance, making them less convenient to perform. In very small babies, failed catheter sampling is sometimes followed by successful SPA collections, and vice-versa [40, 47]. However, invasive collection methods are impractical outside of clinical settings, and many paediatricians prefer not to use them because they may be traumatic [54].

Collecting paired voided samples

Although routinely collecting a second voided urine sample greatly improves diagnostic efficacy (as shown in Fig. 4), practical considerations mean it is unlikely to be adopted widely (though it is helpful that gentle suprapubic stimulation with a cold fluid–soaked gauze makes babies void more promptly [55]). However, we have found that advice to collect a second sample in selected cases after immediate point-of-care screening is consistently followed.

Incorporating point-of-care screening

Unfortunately, nitrite sticks, which are the most convenient point-of-care urine test to detect UTI in older children, miss three-quarters of cases in children aged < 2 years, so they cannot be used to exclude the diagnosis in the most vulnerable children [10]. Figure 4 shows leaf plots for both age groups.

In Newcastle, specially trained paediatric nurses undertake immediate phase-contrast microscopy which reliably identifies bacteria in fresh unspun urine with 100% sensitivity but a lower specificity of 0.686 [56] (Fig. 4, bottom row). In non-emergency situations, we collect a single voided mid-stream or nappy pad urine sample according to age and use phase-contrast microscopy to rule out two-thirds of the uninfected children at once, which enables us to discard those samples (with cost-savings) and inform the families of this result immediately. We then collect second samples from those children where bacteria were present, some due to contamination and some due to UTIs (these typically have hundreds of identical organisms per high-power field, equivalent to about 10⁶–10⁷/ml), which most families accept as worthwhile because the first result is uncertain. This increases the specificity to 0.90. Children with bacteria in both specimens can then be presumed to have a UTI and started on antibiotics. If we remain suspicious of contamination in both samples because of scant organisms or an excess of epithelial cells, we may collect further samples after very careful washing.
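The rise in specificity from 0.686 to 0.90 is what would be expected if a child is only called positive when both samples show bacteria. The sketch below works through this for a hypothetical cohort of 1000 uninfected children; it assumes that false-positive microscopy results in the two samples are independent, which is an assumption made here for illustration, and the names are illustrative.

```python
MICROSCOPY_SPECIFICITY = 0.686  # single-sample specificity quoted from [56]


def two_sample_specificity(single_specificity):
    """Specificity when both of two samples must be positive.

    Assumes false positives in the two samples occur independently.
    """
    return 1 - (1 - single_specificity) ** 2


cohort = 1000  # hypothetical number of uninfected children screened

cleared_at_first_sample = cohort * MICROSCOPY_SPECIFICITY
needing_second_sample = cohort - cleared_at_first_sample
still_positive_after_second = cohort * (1 - MICROSCOPY_SPECIFICITY) ** 2
combined_specificity = two_sample_specificity(MICROSCOPY_SPECIFICITY)

print(f"cleared at first sample: {cleared_at_first_sample:.0f}")
print(f"asked for a second sample: {needing_second_sample:.0f}")
print(f"positive in both samples: {still_positive_after_second:.0f}")
print(f"combined specificity: {combined_specificity:.2f}")
```

Under this assumption, roughly two-thirds of uninfected children are cleared immediately and only about one in ten would remain falsely positive after the second sample.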

Future research

For future research in this area to add value, it is critical that studies use rigorous methods to define UTIs, whether those involve invasive or paired sampling. They must also culture all their samples identically, publish raw quantitative data to allow independent analysis, and not choose different diagnostic colony count thresholds for different collection methods. Even very large studies cannot contribute knowledge if they use single voided samples and a diagnostic threshold of 10³ cfu/ml in some cases and count some mixed growths as positive [57, 58].