Introduction

Diagnosis of anterior cruciate ligament (ACL) injury has greatly improved in recent decades, thanks to the development of imaging modalities for musculoskeletal diseases. Nonetheless, it is still strictly based on clinical examination and on functional tests that measure knee laxity and assess the functional competence of the ACL. Among the various tests, the Lachman test and the pivot shift test are the most accurate for diagnosing ACL insufficiency, in both acute and chronic conditions [23].

Quantification of anterior tibial translation (ATT) has not only diagnostic value in ACL injuries but is also significant for the outcome assessment of reconstructive surgery [11]. For this reason, various devices called “arthrometers” have been developed since the 1980s. These include their progenitor, the KT-1000 (MEDmetric Corporation, San Diego, CA, USA), the Rolimeter (Aircast, Summit, NJ, USA), the GNRB (Genurob, Laval, France) and many others. Numerous studies have evaluated the reliability and accuracy of these devices, in both clinical and experimental settings. A recent consensus [22] has encouraged their use to increase the reliability and validity of the assessment of anterior knee laxity, especially for follow-up evaluation of surgical treatments.

However, the reliability and diagnostic accuracy of knee arthrometers can be affected by several factors, such as knee position, the examiner’s experience and the patient’s compliance [19]. In addition, some devices are unsuitable for the outpatient setting because of their size or cost. An instrumented laxity testing device should address all these issues to be as accurate and reliable as possible and suitable for every condition of use.

The purpose of the present study was to evaluate intra- and inter-observer reliability and diagnostic accuracy of a new portable device for testing ATT, which, unlike the devices already on the market, is suitable for both clinical and experimental settings. The hypothesis of the study was that the new arthrometer has good validity in terms of both reproducibility and diagnostic accuracy.

Material and methods

The local IRB and Ethics Committee approved the study protocol (Prot. No. 4767, University of Brescia, ASST Spedali Civili). The study was conducted according to the principles of good clinical practice and of the Declaration of Helsinki and its updated version (Tokyo 2004). The study was designed as a prospective observational study with a control group. The guidelines established by the QAREL checklist for reliability studies were followed [10]. In addition, the standards for reporting diagnostic accuracy studies (STARD) were adopted to assess diagnostic accuracy [3].

Study population

All patients undergoing knee surgery for ACL reconstruction or other intraarticular surgical procedures (meniscal or cartilage treatments) were considered eligible for the study. Inclusion criteria were age older than 18 years and consent to enter the study. The patients were divided into two groups according to the presence or absence of an ACL rupture. The ACL injury group consisted of patients with a clinically and MRI-confirmed ACL injury who were scheduled for ACL reconstruction surgery. The control group consisted of subjects undergoing knee arthroscopy for other injuries (meniscal and/or cartilage) in whom ACL integrity was documented preoperatively. In all cases, enrolment was confirmed at the time of arthroscopy, when ACL status was definitively established. Exclusion criteria were osteoarthritis of one or both knees documented on preoperative radiographic examinations, a history of trauma or previous surgery to the contralateral knee, and inflammatory or neurologic diseases (systemic or local).

Testing device

Preoperative assessment of anterior knee laxity was performed using a new device for quantification of ATT (BLU-DAT; FGP srl, Dossobuono, VR, Italy).

The BLU-DAT testing device is designed to measure the anterior translation of the tibia with respect to the femur in the sagittal plane. Displacement is measured by means of a magnetic linear encoder whose mobile part is applied to a sliding rod enveloped in a guide (the probe), whereas the feeler is fixed to the arthrometer body (Fig. 1). The measured anterior tibial translation relative to the femur is shown on the device display. The resolution of the ATT measurement is 0.1 mm. The device is also equipped with sensors that evaluate the degree of knee flexion during the test, thus allowing the examiner to check that the knee flexion angle is appropriate for the clinical test being performed (i.e., Lachman test and anterior drawer test) (Fig. 2).

Fig. 1

A The BLU-DAT laxity testing device. Displacement on the sagittal plane is measured by a magnetic linear encoder whose mobile part is applied to a sliding rod enveloped in a guiding probe, which is attached to the body of the arthrometer. B Correct position with the upper support positioned on the patella, the probe on the anterior face of the tibia, and the lower support at the level of the distal tibia, with the knee at 30° of flexion

Fig. 2

The examiner can control the variables during the examination on the digital display, where the knee flexion angle, anterior tibial translation expressed in mm and applied force expressed in kilograms can be viewed

The arthrometer has two supports: the proximal one should be placed at the level of the patella and the distal one on the distal tibia. The device is correctly located when the probe rests on the anterior aspect of the proximal tibia. The system can be connected by Bluetooth to an accessory dynamometer that quantifies the applied force (Fig. 3). This extension makes it possible to combine the displacement data with the force applied while performing the test.

Fig. 3

A The dynamometer, connected via Bluetooth to the system, allows the applied force to be recorded. B The dynamometer is placed on the examiner's hand applying anterior traction to the tibia

Evaluation

All patients underwent a preoperative assessment of anterior knee laxity by measuring ATT with the BLU-DAT. The Lachman test was used to assess ATT, as it has proven to have high diagnostic accuracy [7]. Measurements in the ACL-injured group were performed by two investigators: an orthopaedic surgeon experienced in sports medicine (examiner A) and a second examiner (examiner B). Examiner B repeated the measurements on the ACL-injured group after three weeks and acquired the measurements in the control group. Both examiners were trained in the use of the device before the start of the trial.

All measurements were performed in a standardized setting, with the knee at 30 degrees of flexion and in neutral rotation, with the aid of a semi-rigid wedge placed at the level of the popliteal fossa. Once the arthrometer had been correctly applied, as previously described, the display was reset (ATT = 0 mm) and the Lachman test was performed. The Lachman test was performed at three different loading conditions: 7 kg (69 N), 9 kg (88 N), and the manual maximum test (MMT). Traction forces were chosen based on prior validation studies for knee arthrometers [1].
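The newton values quoted for the two fixed loads can be checked with a short conversion. The sketch below assumes the loads are kilograms-force converted with the conventional standard gravity; the function name is illustrative and is not part of the device software.

```python
# Conventional standard acceleration of gravity, in m/s^2.
STANDARD_GRAVITY = 9.80665


def kgf_to_newtons(kilograms_force: float) -> float:
    """Convert a load expressed in kilograms-force to newtons."""
    return kilograms_force * STANDARD_GRAVITY


# The two fixed loading conditions used for the Lachman test.
for load_kg in (7, 9):
    print(f"{load_kg} kg -> {kgf_to_newtons(load_kg):.1f} N")
```

Rounded to the nearest newton, these give the 69 N and 88 N reported in the text.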

Three measurements of ATT, expressed in millimeters (mm), were collected for each measurement series and at every loading condition on the affected and contralateral healthy knee in both groups. The mean value of three consecutive measurements was calculated for each measurement series and ATT was expressed as a difference in mm between the mean ATT of the affected and contralateral knee (side-to-side difference).
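As a minimal sketch of the calculation described above, the side-to-side difference for one measurement series can be computed as follows. The readings are hypothetical values chosen for illustration, not study data, and the function name is an assumption.

```python
from statistics import mean


def side_to_side_difference(affected_mm, contralateral_mm):
    """Mean ATT of the affected knee minus mean ATT of the contralateral knee (mm)."""
    return mean(affected_mm) - mean(contralateral_mm)


# Hypothetical series: three consecutive readings per knee at one loading condition.
affected_readings = [9.1, 8.7, 9.4]
contralateral_readings = [5.0, 5.3, 4.9]

ssd = side_to_side_difference(affected_readings, contralateral_readings)
print(f"Side-to-side difference: {ssd:.1f} mm")
```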

At the first evaluation, the order in which the two examiners assessed each patient followed a randomized sequence. This limited the risk of assessment bias arising from a potential decline in patient compliance over successive instrumented assessments and from potential differences in skill between the examiners (examiner bias) [18]. Each investigator was blinded to the results obtained by the other investigator.

Outcome measures

The primary outcomes of the study were the intra- and inter-observer reliability of the ATT measurements in the ACL-injured subjects. Reliability measurements for every loading condition were expressed by the intraclass correlation coefficients (ICCs). The ICC values vary between 0 and 1 (assuming perfect reliability for values of 1) and reliability was interpreted as follows: ICC < 0.5 = poor, ICC between 0.50 and 0.75 = moderate, ICC between 0.75 and 0.90 = good and ICC > 0.9 = excellent.
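The interpretation bands above can be written as a small lookup. This is only a sketch: the text leaves the handling of the exact boundary values open, so assigning 0.75 and 0.90 to "good" is an assumption, as is the function name.

```python
def interpret_icc(icc: float) -> str:
    """Map an ICC value to the reliability category used in the text.

    Boundary handling is an assumption: 0.50 counts as moderate,
    0.75 and 0.90 count as good.
    """
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.9:
        return "good"
    return "excellent"


for value in (0.41, 0.72, 0.83, 0.945):
    print(value, interpret_icc(value))
```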

The secondary outcome of the study was diagnostic accuracy. We estimated the ability of the instrument to detect a difference in ATT between the two groups (ACL-injured and healthy knees). A side-to-side difference in ATT of 3 mm was taken as the threshold considered pathologic and diagnostic for an ACL rupture [16, 24].

Statistical analysis

Data were analyzed using statistical software (IBM SPSS Statistics 25; IBM, Armonk, NY, USA). For the measurement of intra-observer reliability, a two-way mixed-effects ICC model for average measures (k = 2) and absolute agreement was used. For inter-observer reliability, we calculated the ICC using a one-way random-effects model for average measures (k = 2) and absolute agreement (ICC(1,2)) [13]. In addition, the accuracy of the measurements was established by calculating the standard error of measurement (SEM) between the observations from the obtained ICC and the standard deviation of each measurement series.
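For readers who want to see what these quantities involve, the sketch below computes a one-way random-effects ICC for average measures from its ANOVA mean squares, and an SEM using the commonly used formula SD × √(1 − ICC). The analysis in the study was run in SPSS; this code is not that implementation, the SEM formula is assumed rather than stated in the text, the two-way mixed-effects model is not shown, and the data are hypothetical.

```python
from math import sqrt
from statistics import mean


def icc_one_way_average(ratings):
    """ICC(1,k): one-way random effects, average of k measurements.

    ratings: one list per subject, each holding that subject's k measurements.
    """
    n = len(ratings)
    k = len(ratings[0])
    grand_mean = mean([x for row in ratings for x in row])
    subject_means = [mean(row) for row in ratings]
    # Between-subjects mean square.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-subjects mean square.
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, subject_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / ms_between


def standard_error_of_measurement(sd: float, icc: float) -> float:
    """SEM from a standard deviation and an ICC (assumed formula)."""
    return sd * sqrt(1.0 - icc)


# Hypothetical side-to-side differences (mm): [examiner A, examiner B] per subject.
example = [[4.0, 4.4], [6.1, 5.7], [2.9, 3.3], [7.5, 7.0], [5.2, 5.6]]
icc = icc_one_way_average(example)
print(f"ICC(1,2) = {icc:.3f}")
print(f"SEM for SD = 2.0 mm: {standard_error_of_measurement(2.0, icc):.2f} mm")
```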

Diagnostic accuracy was assessed by comparing the side-to-side differences of the two groups at every loading condition by means of Student’s t test. The data of the two groups were then dichotomized using the threshold value of 3 mm as pathological ATT (positive test), and 2 × 2 contingency tables were used to assess diagnostic accuracy in terms of sensitivity, specificity, negative and positive likelihood ratios (NLR, PLR), positive and negative predictive values (PPV, NPV), area under the curve (AUC), diagnostic odds ratio (DOR), Youden index, and overall accuracy. Statistical significance was set at p values < 0.05. Ninety-five percent confidence intervals were provided for each statistical test. Sample size was calculated based on the primary outcomes of the study (intra- and inter-observer reliability) and established in accordance with Walter et al. [26] for reliability studies by calculation of the ICC based on two observations. A sample size of 39 cases was appropriate based on a reliability coefficient equal to 0.5 for the null hypothesis (R0) and equal to 0.7 for the alternative hypothesis (R1), given α equal to 0.05 and a power (1 − β) equal to 0.80 [2].
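The dichotomization and the contingency-table measures can be sketched as follows. The example values are hypothetical, not study data; treating a difference exactly equal to 3 mm as positive is an assumption; AUC and the Youden index are left out for brevity; and ratios whose denominator is zero are returned as None.

```python
def contingency_counts(ssd_values_mm, has_acl_rupture, threshold_mm=3.0):
    """Count TP, FP, FN, TN after dichotomizing side-to-side differences.

    A test is positive when the difference is >= threshold_mm (assumption).
    """
    tp = fp = fn = tn = 0
    for ssd, injured in zip(ssd_values_mm, has_acl_rupture):
        positive = ssd >= threshold_mm
        if positive and injured:
            tp += 1
        elif positive:
            fp += 1
        elif injured:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def diagnostic_metrics(tp, fp, fn, tn):
    """Diagnostic accuracy measures from a 2 x 2 contingency table."""
    sensitivity = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "plr": ratio(tp * (tn + fp), fp * (tp + fn)),  # sens / (1 - spec)
        "nlr": ratio(fn * (tn + fp), tn * (tp + fn)),  # (1 - sens) / spec
        "dor": ratio(tp * tn, fp * fn),
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
    }


# Hypothetical side-to-side differences (mm) and arthroscopic ACL status.
differences = [4.2, 1.0, 3.0, 0.5, 2.5]
ruptured = [True, False, True, False, True]
counts = contingency_counts(differences, ruptured)
metrics = diagnostic_metrics(*counts)
print(counts)
print(metrics)
```

In this toy example there are no false positives, so the PLR and DOR have a zero denominator and come back as None.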

Sample size was confirmed to be adequate also for the secondary outcome (diagnostic accuracy). Based on a pilot sample of the first 20 cases analyzed with a definite diagnosis of ACL injury, the mean side-to-side difference value for ATT observed was 4.34 ± 5.24 mm. Given a minimal clinically important difference equal or greater than 3 mm between the ACL-injured group and the control group [16, 24], an effect size of 0.57 and a sample size of 39 cases per group was obtained based on a two-way alternative hypothesis, given a value of α equal to 0.05 and a power (1 − β) equal to 0.80.
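The reported effect size appears consistent with dividing the 3 mm difference by the standard deviation of the pilot sample (5.24 mm). This reading is an inference from the figures given, not a calculation spelled out in the text, and the function name below is illustrative.

```python
def standardized_effect_size(difference_mm: float, sd_mm: float) -> float:
    """Difference of interest divided by the standard deviation."""
    return difference_mm / sd_mm


# Values taken from the paragraph above: 3 mm difference, pilot SD of 5.24 mm.
effect_size = standardized_effect_size(3.0, 5.24)
print(round(effect_size, 2))  # 0.57
```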

Results

Seventy-eight subjects were consecutively recruited for the present study and divided into two groups of 39 patients each (Fig. 4). The ACL-injury group consisted of 33 males and 6 females (mean age: 29.8 ± 13.2 years); the control group consisted of 25 males and 14 females (mean age: 38.3 ± 11.9 years).

Fig. 4

Study flowchart

Descriptive data for all measurement series on ACL-injured patients are reported in Table 1.

Table 1 Descriptive data for all measurement series on ACL-injured patients

Intra-observer reliability was good for 7-kg and 9-kg measurements and excellent for MMT. Inter-observer reliability was moderate for 7-kg measurements and good for 9-kg and MMT. Intraclass correlation coefficients for intra- and inter-observer reliability are shown in Table 2. SEMs are reported in Table 3.

Table 2 Intraclass correlation coefficients for intra- and inter- observer reliability
Table 3 Standard error of measurements for intra- and inter-observer reliability

Comparison between ACL-injured and control groups for the side-to-side difference in ATT showed a significant difference between the two groups at every loading condition. Mean difference between groups ranged from 3.38 mm for 7-kg measurement to 4.55 mm for MMT (Table 4).

Table 4 Comparison between groups for anterior tibial translation (mean ± SD) at different loading conditions

The contingency tables for diagnostic accuracy are reported in Table 5. As no false positives were reported at any loading condition, PLR, AUC, DOR and Youden index could not be calculated. Overall accuracy ranged from 84.6% for 7-kg measurements to 98.7% for MMT. Detailed diagnostic accuracy measures are reported in Table 6.

Table 5 Contingency table for diagnostic accuracy of ATT at different loading conditions
Table 6 Diagnostic accuracy measures for ATT at different loading conditions

Discussion

The main findings of the present study are that the BLU-DAT proved to be a valid instrument for testing anterior knee laxity and that the higher the load, the better the intra- and inter-observer reliability of the ATT measurements.

Since the introduction of the KT-1000, the progenitor of arthrometers, numerous studies have analyzed the ability of these instruments to diagnose ACL injury reproducibly. The KT-1000 and its successor, the KT-2000, are certainly the most analyzed. Comparing reliability results is not easy: studies in the literature report discordant reliability data, from poor to excellent, especially for inter-observer reliability, with ICCs ranging from 0.41 to 0.92 [20, 27], and they often used different reliability assessment methods.

A recent study by Runer et al. [17] showed that the reproducibility of different arthrometers (KT-1000, Rolimeter, KLT and Kira) was satisfactory for intra-observer reliability but unsatisfactory for inter-observer reliability, and concluded that anterior laxity is best assessed by the same operator. In another study, Murgier et al. [15] analyzed the agreement of measurements between different arthrometers (KT-1000, Rolimeter, GNRB and Telos) and found that different devices are difficult to compare in terms of side-to-side difference.

As mentioned above, several confounding factors can undermine the reliability of these devices. These include the type of test used, the experience of the examiner, the applied force, the knee flexion angle [9], tibial rotation and the patient’s compliance. These limitations led to the development of devices that reduce the role of the examiner, like the GNRB, whose reproducibility and validity have been tested with optimal results [4, 12]. This automated arthrometer uses a linear jack to exert thrust forces chosen by the examiner, with the lower limb positioned in a rigid leg support and the knee at 0° of rotation. In addition, surface electrodes are applied to the back of the thigh to control hamstring relaxation of the tested knee (feedback effect). Several studies have evaluated its reproducibility; the most recent evaluation was performed by Smith et al. [21], who found moderate to good intra-rater reliability (ICC = 0.72–0.83) and good inter-rater reliability (ICC = 0.76–0.81). Unfortunately, the automation required to minimize the role of the confounding factors results in a loss of handiness and an increase in costs, which makes the device difficult to use in routine outpatient practice and therefore not comparable to portable devices like the Rolimeter. The validity of the Rolimeter has been tested in several studies [5, 17]. Hatcher et al. [6], in a setting very similar to that of our study, analyzed its reliability and reported excellent intra- and inter-observer reliability (ICC = 0.912 and 0.945, respectively) during the Lachman test at 30° of knee flexion. However, extreme handiness can prevent control of some of the variables involved and can impair the standardization of the testing setup.

The BLU-DAT can be a good compromise between manageable arthrometers like the Rolimeter and more complex and expensive devices such as the GNRB and KT-1000: it retains excellent handiness while maintaining objective control of knee flexion and of the applied force, thus allowing standardization of the test in routine outpatient activity as well. In support of this, an evaluation of its reliability in a cadaver study was recently published [14]. Specifically, inter-rater reliability was evaluated under the same loading conditions as in the present study, and the ICC for average measurements was very good at all loads (0.89, 0.85 and 0.88 at 7 kg, 9 kg and MMT, respectively). In addition, agreement between the BLU-DAT and a gold standard (stress radiographs), analyzed with the Bland–Altman method, was good, with a mean difference between the two methods of 0.83 mm ± 2.1 mm (95% CI 0.55–1.11).

The second outcome analyzed in our study was diagnostic accuracy. The BLU-DAT proved to be very good in terms of accuracy for diagnosis of ACL rupture, with a diagnostic accuracy of 98.7%. Also, for this outcome, the best results were observed when performing the test at MMT.

This agrees with previous results reported with other devices. In a meta-analysis by van Eck et al. [25], several arthrometers (KT-1000, Stryker Knee Laxity Tester and Genucom Knee Analysis System) showed better sensitivity, specificity and overall accuracy at MMT than at lower loads.

A similar finding was reported by Klouche et al. [8] with the GNRB, where accuracy was proportional to the force applied, except at the maximum force they applied (250 N). As explained by the authors, this may have been caused by the patients exceeding their pain threshold. In our test series, this type of reduction in compliance due to excessive traction never occurred.

In terms of diagnostic accuracy, a new automatic knee arthrometer (AKA) recently introduced by Niu et al. [16] is noteworthy because it minimizes the examiner's bias by means of an automatically performed thrust. The authors found higher accuracy than with the KT-2000 (sensitivity: 86% vs 83%; specificity: 95.5% vs 88.5%), using a threshold of 3 mm as in the present study. Wu et al. [28] recently tested a new automatic knee arthrometer (Ligs Innomotion). At a load of 150 N (comparable to the MMT in the present study), the authors found the maximum AUC (0.857) and a sensitivity and specificity of 0.87 and 0.73, respectively.

The present study has some limitations. First, the study lacks a direct comparison with another arthrometer that has already been validated and studied. Second, we did not assess ATT in a postoperative setting; therefore, the external validity of the study cannot be extended to the use of the device for the assessment of surgical treatment at clinical follow-up. Finally, the investigators were blinded to the measurements of the other investigator but, for practical reasons, they were aware of the group to which the individual patients belonged.

The BLU-DAT can be routinely used in the outpatient setting to confirm the suspicion of an ACL tear.

Conclusions

The BLU-DAT has proven to be an instrument with good intra- and inter-observer reliability and very good accuracy in the diagnosis of ACL injuries in the outpatient setting. MMT loading condition provided the best reliability and accuracy.