Idiopathic normal pressure hydrocephalus (iNPH) is a cerebrospinal fluid (CSF) shunt-responsive syndrome of the elderly involving gait disturbance, dementia and urinary incontinence without antecedent disorders. Hakim and Adams first reported improvement of NPH symptoms by removal of 15 ml CSF using a lumbar tap [1]. Wikkelsø et al. reported that the tap test (TT) with removal of 40-50 ml CSF was useful for diagnosis and the prediction of shunt response in NPH patients [2]. Since then, a number of studies have used removal of CSF via a lumbar tap to predict shunt effectiveness in iNPH patients [3-10]. Because it is easy to perform in neurosurgical and neurological clinics, the Japanese guidelines for management of iNPH recommended the TT as an initial invasive test [11, 12]. The specificity and sensitivity are reportedly high, but reports disagree on this point [3-6]. Continuous lumbar drainage for several days with removal of a large CSF volume has been reported to have high sensitivity and specificity [13-17], but it is more invasive for elderly patients who have difficulty in gait, cognition and/or urination. From a clinical standpoint, the effort of performing a TT to increase the predictability of shunt effectiveness is worthwhile, but there has been no prospective validation study in a large number of iNPH patients. In this study, the predictive value of the TT was investigated in patients with iNPH using data from a multicenter, prospective study named "Study of idiopathic normal pressure hydrocephalus on neurological improvement" (SINPHONI) [18]. Special attention was paid to sensitivity and specificity for a number of variables measured before and after the TT.

This study is registered under the number NCT00221091.

Materials and methods


In 2004, a multicenter, prospective study of idiopathic normal pressure hydrocephalus (SINPHONI) was conducted in Japan [17]. Briefly, it was designed to validate the diagnostic importance of high-convexity tightness in coronal-section MRI [19] against the results of shunt surgery using a programmable valve. The entry criteria were as follows: (1) age 60 to 85 years, (2) one or more of the NPH triad symptoms, (3) ventriculomegaly (Evans Index > 0.3), (4) high-convexity tightness in coronal-section MRI, and (5) no antecedent disorders. The study consisted of one year of registration and one year of follow-up, and was completed in 2006. Data were obtained from 100 patients. The study was conducted in compliance with the Guidelines for Good Clinical Practice and the Declaration of Helsinki (2002) of the World Medical Association. The institutional review board at each site approved the study protocol, and all participants (or their representatives when applicable) gave written informed consent for participation.

Tap test

A lumbar tap with removal of 30 ml of CSF was performed in all patients. CSF pressure (CSFP) was measured at the site of puncture. Before and after the tap, all patients were evaluated using the iNPH grading scale (iNPHGS; abbreviated GS in variable names) [8], the Mini-Mental State Examination (MMSE) and the 3-meter timed up-and-go test (TUG). The iNPHGS is a clinician-rated scale that rates separately the severity of each of the triad symptoms of iNPH (disturbances of gait, cognition and urination). The score of each domain ranges from 0 to 4. Grade 0 indicates normal and grade 1 indicates subjective symptoms but no objective disturbance. Grades 2, 3 and 4 indicate mild, moderate and severe disturbances, respectively. The change of gait was evaluated 1 or 2 days after the tap, while the change of cognition and urination was evaluated at one week. Assessment was done by neurosurgeons in most cases. Response to the TT was pre-defined on three major scales: iNPHGS, TUG and MMSE. An improvement of one point or more on the iNPHGS (each domain and their total), an improvement of more than 10% in TUG time, or an improvement of more than 3 points on the MMSE was regarded as TT-positive. An additional variable, Tap-any, was defined as positive when any of the iNPHGS total score, TUG or MMSE improved. The sensitivity and specificity of these pre-defined variables as predictors of a response to shunt surgery were calculated. Furthermore, to increase the predictability of responders in clinical practice, a decision tree analysis was applied.
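The pre-defined response criteria above can be sketched in code. This is only an illustration under stated assumptions, not the study's actual data handling: the class and field names are hypothetical, the iNPHGS total is assumed to be the sum of the three domain grades, and the thresholds follow the wording of this section ("more than 10%", "more than 3 points"). The variable labels used later in the analysis (TUG ≥ 10%, MMSE ≥ 3) suggest that inclusive thresholds may have been intended.

```python
from dataclasses import dataclass


@dataclass
class TapAssessment:
    """One assessment, taken before or after the tap (field names are illustrative)."""
    gs_gait: int        # iNPHGS gait domain, 0 (normal) to 4 (severe)
    gs_cogn: int        # iNPHGS cognition domain, 0 to 4
    gs_urin: int        # iNPHGS urination domain, 0 to 4
    mmse: int           # MMSE score, higher is better
    tug_seconds: float  # TUG completion time, lower is better

    @property
    def gs_total(self) -> int:
        # Assumption: the total score is the sum of the three domain grades.
        return self.gs_gait + self.gs_cogn + self.gs_urin


def tap_test_flags(pre: TapAssessment, post: TapAssessment) -> dict:
    """Return the TT-positive flag for each pre-defined variable."""
    # Relative shortening of the TUG completion time after the tap.
    tug_gain = (pre.tug_seconds - post.tug_seconds) / pre.tug_seconds
    flags = {
        "GS-Gait-change": pre.gs_gait - post.gs_gait >= 1,
        "GS-Cogn-change": pre.gs_cogn - post.gs_cogn >= 1,
        "GS-Urin-change": pre.gs_urin - post.gs_urin >= 1,
        "GS-Total-change": pre.gs_total - post.gs_total >= 1,
        "TUG-change": tug_gain > 0.10,            # "more than 10%" (assumed strict)
        "MMSE-change": post.mmse - pre.mmse > 3,  # "more than 3 points" (assumed strict)
    }
    # Tap-any: positive if the iNPHGS total, TUG or MMSE criterion is met.
    flags["Tap-any"] = any(
        flags[name] for name in ("GS-Total-change", "TUG-change", "MMSE-change")
    )
    return flags
```

For a hypothetical patient whose gait grade falls from 2 to 1 and whose TUG time shortens from 20 s to 17 s, the gait, total, TUG and Tap-any flags would be positive, while an MMSE gain of a single point would not count.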

Shunt surgery

A ventriculo-peritoneal shunt with a Codman-Hakim programmable valve™ (Codman, Johnson and Johnson, Raynham, MA, USA) was installed in all patients within two months after registration, with the initial pressure setting determined from a quick reference table [20]. The modified Rankin scale (a scale for measurement of disability) [21] was used as the primary outcome measure, and iNPHGS, TUG and MMSE as secondary outcome measures. Assessment was performed before surgery and repeated at 3, 6, and 12 months after surgery to determine which patients were shunt responders. A shunt responder was defined as a patient who showed an improvement of one point or more on the modified Rankin scale over 12 months.

Data analysis and statistics

Statistical analysis was performed using JMP statistical software version 9 (SAS Institute, Cary, USA). Shunt responders and non-responders were compared on baseline data and on the pre-tap state of iNPHGS, TUG and MMSE (Table 1). Baseline variables included age, Evans index, and CSFP. Pre-tap variables included scores of the three iNPHGS domains (GS-Gait-pre, GS-Cogn-pre, GS-Urin-pre) and their total score (GS-Total-pre), MMSE scores (MMSE-pre), and TUG completion times (TUG-pre). These variables were compared between shunt responders and non-responders using the chi-squared test. TT-positive patients were counted for each of the variables (GS-Gait-change, GS-Cogn-change, GS-Urin-change, GS-Total-change, MMSE-change and TUG-change), and their sensitivity (%) and specificity (%) were calculated using contingency tables. Positive predictive values were not calculated, since they would have been affected by the high prevalence of iNPH in the patient group. Furthermore, a decision tree analysis was performed to determine a practical method for selecting shunt responders with higher sensitivity and specificity. The variables included age, Evans index, CSFP, GS-Total-change, TUG ≥ 10% and MMSE ≥ 3. The first three variables were treated as continuous data and the last three as nominal data. The level of statistical significance was set to p < 0.05.
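For reference, the quantities named above can be written out with the standard definitions. The symbols are introduced here only for illustration: TP and FN are the TT-positive and TT-negative shunt responders, TN and FP are the TT-negative and TT-positive non-responders, and p is the proportion of shunt responders in the group.

```latex
\[
\text{Sensitivity} = \frac{TP}{TP + FN},
\qquad
\text{Specificity} = \frac{TN}{TN + FP}
\]
\[
\text{PPV} = \frac{\text{Sensitivity}\cdot p}
                  {\text{Sensitivity}\cdot p + (1-\text{Specificity})\,(1-p)}
\]
```

Sensitivity and specificity are each computed within one outcome group and so do not depend on p, whereas the positive predictive value rises with p. This is the sense in which a high proportion of responders in the cohort would affect it.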

Table 1 Baseline characteristics in shunt responders and non-responders measured before tap test.

Results

Among the complete patient group, gait disturbance was noted in 91%, cognition disturbance in 80% and urination disturbance in 60%. There were 58 males and 42 females, and the median age was 75 years (interquartile range 75-78). The median Evans index was 34.6 (range 32.7-38) and the median CSFP at lumbar tap was 12 (range 9-14) cm H2O.

In this study on the diagnostic performance of the TT in a total of 100 patients, 80% were shunt responders during the one-year follow-up. Among the 80 shunt responders, improvement of one, two, three or four points on the modified Rankin scale was found in 43, 27, 8 and 2 patients, respectively. Comparison of the preoperative variables between shunt responders and non-responders showed a statistically significant difference only for CSFP, which was higher in shunt responders (p < 0.05; Table 1). There were no significant differences in Evans index, severity of GS symptoms, TUG or MMSE. The incidence of severe adverse events (SAE) was significantly higher in the non-responders (p < 0.005). Among the non-responders, pneumonia was noted in three patients and surgery-related complications (shunt malfunction and bowel injury) in two. Among the responders, vascular events, including cerebral and cardiac infarction, occurred in three patients and femoral fracture in two (Table 1).

The sensitivity and specificity for each of the variables were calculated from the number of true positives, true negatives, false positives and false negatives (Table 2). The highest sensitivity was for Tap-any at 92.5%, but its specificity was low at 20%. The highest specificity, 85%, was noted for GS-Cogn-change and GS-Urin-change; however, their sensitivity was below 40%. Thus, the sensitivity and specificity varied with the variable used. Improvement of the total iNPHGS score (GS-Total-change), with a sensitivity of 71.3% and a specificity of 65%, was the most promising among the pre-defined variables.
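As a worked check of the GS-Total-change figures, the calculation can be sketched as follows. The counts are not copied from Table 2, which is not reproduced here. They are reconstructed from numbers stated in the text and in the Figure 1 legend (100 patients, 80 shunt responders, and 57 responders among the 64 patients with GS-Total-change), and the function name is illustrative.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple:
    """Return (sensitivity, specificity) as fractions from 2x2 contingency counts."""
    sensitivity = tp / (tp + fn)  # share of shunt responders who were TT-positive
    specificity = tn / (tn + fp)  # share of non-responders who were TT-negative
    return sensitivity, specificity


# Reconstructed counts for GS-Total-change (see the assumptions above).
responders, non_responders = 80, 20
tt_positive = 64                 # patients with GS-Total-change
tp = 57                          # shunt responders among them
fp = tt_positive - tp            # 7 non-responders who improved on the iNPHGS
fn = responders - tp             # 23 responders without iNPHGS improvement
tn = non_responders - fp         # 13 non-responders without improvement

sens, spec = sensitivity_specificity(tp, fn, tn, fp)
```

Here `sens` is 57/80 = 0.7125 and `spec` is 13/20 = 0.65, consistent with the reported 71.3% and 65%.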

Table 2 Numbers of patients, sensitivity and specificity for each of variables examined.

To increase the predictability of the TT, a decision tree analysis was applied using the variables of age, Evans index, CSFP, GS-Total-change, TUG ≥ 10% and MMSE ≥ 3 (Figure 1). GS-Total-change was selected as the first node, followed by CSFP ≥ 15 cm H2O as the second node for differentiating the remaining patients. Using this two-step rule, the sensitivity was 82.5% and the specificity was 65%.

Figure 1

Decision tree analysis for selecting shunt responders. At the first step, the 64 patients with improvement of any domain in iNPHGS [GS-Total-change (+)] were selected as positive cases; 57 of them were shunt responders (SR). At the second step, nine SR were selected from the 36 patients without improvement in iNPHGS [GS-Total-change (-)] using the variable of CSFP greater than 15 cm H2O. This resulted in 82.5% of the 80 SR patients being identified in two steps.
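The two-step rule in Figure 1 can be expressed as a small function, and the reported sensitivity follows from the counts in the legend. This is a sketch under stated assumptions rather than the fitted JMP model. The function name is hypothetical. The threshold is treated as inclusive (the text gives CSFP ≥ 15 cm H2O, the legend "greater than 15"). The absence of non-responders selected at the second step is inferred from the unchanged specificity of 65%, not stated directly.

```python
def predicted_shunt_responder(gs_total_improved: bool, csfp_cm_h2o: float,
                              csfp_threshold: float = 15.0) -> bool:
    """Two-step rule: iNPHGS improvement first, then CSF pressure."""
    if gs_total_improved:                   # first node: GS-Total-change (+)
        return True
    return csfp_cm_h2o >= csfp_threshold    # second node (inclusive: assumption)


# Counts taken from the Figure 1 legend and the cohort description.
responders, non_responders = 80, 20
step1_true_pos = 57                  # responders with GS-Total-change (+)
step1_false_pos = 64 - 57            # non-responders with GS-Total-change (+)
step2_true_pos = 9                   # responders added by the CSFP node
step2_false_pos = 0                  # inferred from the unchanged specificity

tree_sensitivity = (step1_true_pos + step2_true_pos) / responders
tree_specificity = (
    non_responders - step1_false_pos - step2_false_pos
) / non_responders
```

With these counts, `tree_sensitivity` is 66/80 = 0.825 and `tree_specificity` is 13/20 = 0.65, matching the reported values.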

Discussion

The response to a lumbar tap test (TT) is considered useful for predicting a favourable response to shunt surgery, particularly in iNPH patients. In previous studies, the volume of CSF removed has varied from 30 ml [6, 8] and 40 ml [4, 7] to 50 ml [2], or CSF was removed until the pressure was lowered to zero [5]. In the present study, 30 ml of CSF was selected because it was less invasive for elderly patients. One of the purposes of the SINPHONI study was to clarify the sensitivity and specificity of the removal of 30 ml CSF for predicting the response to shunt surgery. The present study was designed to detect the change of symptoms as efficiently as possible: one or two days after the TT for gait, and one week after the TT for cognition and urination. Improvement of gait after removal of CSF was most commonly seen, and it could be observed within one or two days after the tap. Recently, Virhammar et al. recommended assessment of gait within 24 hours [10]. Improvement of cognition and/or urination is usually more delayed, as we experienced in our preliminary studies, including the report by Kubo et al. [8]. One disadvantage of the study design was that assessment of iNPHGS, TUG and MMSE was not performed by the same person throughout. This may have caused some inconsistency in the results. This is in contrast to the report by Kubo et al. [8]. The MMSE alone would not have been adequate to assess the response to the TT. However, it is widely used for assessment of cognition in general, and examining its prognostic value was one of the objectives of the SINPHONI study.

The sensitivity and specificity of the TT have been reported previously as ranging from 72% to 100% for the former and from 33% to 100% for the latter [3, 5, 7, 8]. The specificity of the TT was reported to be high with low sensitivity [3, 7, 8], but another report was contradictory [5]. In the present study, the specificity of the gait domain was 80% but its sensitivity was 51.3%. The cognition and urination domains both showed a specificity of 85%, but a low sensitivity of 25% and 37.5%, respectively. Thus, the present study revealed a high specificity with low sensitivity in each domain of the iNPHGS, which agrees with previous reports [3, 7, 8]. In contrast to the individual domains, the total GS score showed a higher sensitivity of 71.3% but a lower specificity of 65%. Among the pre-defined variables, the calculated variable Tap-any showed the highest sensitivity of 92.5%, but its specificity was only 20%. Thus, the sensitivity and specificity of the TT depended on the variable under consideration. In clinical practice, a higher sensitivity would be preferable for a diagnostic test, although higher specificity is also important to reduce the number of false positive cases. To increase both sensitivity and specificity, a decision tree analysis was applied in the present study, which revealed a first node of GS-Total-change. Among the remaining patients, a CSF pressure of 15 cm H2O was the best threshold for differentiation. This increased the sensitivity to 82.5%, while the specificity remained at 65%. This suggests that patients with higher CSF pressure would be shunt responders even if their symptoms did not improve by one point or more on the iNPHGS after the TT.

In contrast with the TT, continuous CSF drainage has been reported to provide higher sensitivity and specificity, ranging from 50% to 100% and from 60% to 100%, respectively [7, 14-16]. As Marmarou stated, the advantage of continuous CSF drainage is increased sensitivity [13]. Drainage of a larger CSF volume more closely simulates the intracranial situation following CSF shunt surgery. However, it must be highlighted that most studies involving larger volume drainage defined shunt responders by symptomatic improvement [7, 14-16], not by improvement in activities of daily living. In SINPHONI, 89% of patients were shunt responders on the iNPHGS, i.e., on a symptom basis, in contrast with 80% on the modified Rankin scale, i.e., on a functional basis [18]. Thus, caution is needed when comparing the present data with those obtained after larger volume drainage. Although complications were reportedly very rare with larger volume drainage [14, 15, 19], there is potentially more risk of complications in elderly patients with a greater or lesser degree of disturbance in gait, cognition, and/or urination.

The SINPHONI study revealed high achievement in the treatment of iNPH patients without support of the TT [18]. The SINPHONI study showed the high predictability and diagnostic importance of MRI features of tight high convexity and enlarged Sylvian fissure with ventricular dilatation, which was designated as "Disproportionately Enlarged Subarachnoid-space Hydrocephalus (DESH)" [18]. However, Iseki et al. reported there were asymptomatic people with MRI features of iNPH in their population-based study [22]. They may have been potential candidates for developing iNPH in the future. Because NPH symptoms are often difficult to differentiate from those of other senile disorders, it is important to see the changes of symptoms after the TT or larger volume drainage. To increase the sensitivity of the TT, further effort is necessary.

Conclusions

The value of the TT for predicting shunt effectiveness was investigated in iNPH patients using the SINPHONI data. The sensitivity and specificity varied with the variable used, and improvement of the total iNPH grading scale score showed a sensitivity of 71.3% and a specificity of 65%. A decision tree analysis revealed that any improvement on the iNPHGS, followed by inclusion of patients with CSFP higher than 15 cm H2O, increased the sensitivity to 82.5% without a decrease in specificity. Thus, the TT is valid as an initial invasive test to predict the response to shunt surgery in elderly patients with disturbances of gait, cognition and/or urination.