1 Introduction

Pompe disease, or acid maltase deficiency (AMD), is also referred to as glycogen storage disease type II (GSD II) or glycogenosis II. It is an autosomal recessive disease caused by deficiency of the enzyme acid alpha-glucosidase (GAA), which leads to accumulation of lysosomal glycogen, especially in the heart, skeletal and smooth muscle, and the nervous system [1]. Glucose, the simple sugar that supplies the energy for human activities, is obtained from many kinds of food and is predominantly stored as glycogen in the liver, adipose tissue, and muscles. Normally, cells use the glucose from food for adenosine triphosphate synthesis, and excess glucose is stored in the form of glycogen. When GAA activity is abnormal, lysosomal glycogen cannot be broken down and accumulates persistently within the lysosomes of multiple tissues, particularly the liver and muscles, resulting in hepatomegaly and muscular weakness. Clinically, the two classical phenotypes are the infantile and late-onset types, distinguished by the age of onset and the impact on critical organ functions. The infantile type presents in early childhood. In contrast, the late-onset type has a heterogeneous presentation that makes an early and precise diagnosis rather difficult. Although genomic analysis can identify late-onset AMD, the absence of symptoms during childhood and the uncertainty of the future onset time make regular, routine follow-up necessary [1].

Treatment for both types of AMD is currently based on enzyme replacement therapy (ERT) and supportive care. ERT has been available since 2006; it uses an external supply of alglucosidase alfa (Myozyme) [2] so that glycogen can be broken down to glucose, resolving the persistent accumulation of glycogen. ERT is effective in relieving the clinical symptoms of AMD patients [3]. It has a greater effect in patients with less severe symptoms, so it is suggested that treatment start early, before muscle damage becomes possibly irreversible. In other words, the earlier ERT starts, the better the prognosis. Therefore, early and precise diagnosis of AMD is important [4, 5].

In Taiwan, the health insurance system has provided ERT for AMD patients since 2005. In addition to genetic screening of parents for Pompe disease during the mother’s pregnancy, neonatal blood screening is available for early detection of infantile-type babies [6, 7]. However, for late-onset patients, clinical confusion with other muscle-related diseases may lead to poor efficacy or even a lack of ERT. Therefore, from the clinical viewpoint, it is essential to monitor muscle function non-invasively through active and continuous tracking, and to establish an easy-to-use approach for achieving early, precise detection and treatment.

Rapid advances in medical imaging techniques such as MRI [8], CT [9] and ultrasonography, combined with medical computer science, have greatly strengthened both the diagnostic detection and the dynamic monitoring of tiny muscle lesions and of the progression of human diseases [9,10,11,12]. Among these imaging resources, ultrasound is the safest, least invasive, least expensive and most readily available tool for investigating muscle disorders [12,13,14,15]. Therefore, ultrasonographic examination of skeletal muscle has the potential to assess muscle disorders, especially for diagnostic screening of infants and children susceptible to Pompe disease.

Creatine kinase, also known as creatine phosphokinase or phospho-creatine kinase, is mainly distributed in skeletal muscle, the myocardium and brain tissues. Phosphocreatine is a phosphorylated creatine molecule that serves as a mobilizable reserve of high-energy phosphates in skeletal muscle and brain tissues. In other words, phosphocreatine plays an important role in tissues that have high, fluctuating energy demands, such as muscle and brain. Under normal circumstances, most creatine kinase is stored in brain, skeletal muscle and heart tissues. However, large amounts of creatine kinase are released into the blood when muscle is damaged [16, 17]. Therefore, the creatine kinase level in the blood may be used to gauge the severity of muscle damage.

Ultrasound can provide real-time diagnostic information on the anatomical region of interest (ROI) and help to identify or validate the diagnosis of many skeletal muscle related conditions. Ultrasonography, first used to investigate muscle in 1980, has evolved into a valid technique for the diagnosis of muscle diseases [10]. The literature shows that neuromuscular diseases, which affect the nerves that control voluntary muscles, change the thickness or appearance of the muscles under ultrasonography [13, 14, 18,19,20,21]. Some studies have compared skeletal muscle ultrasound images to identify the presence and specific patterns of AMD pathology and to measure disease severity. These studies also measured the brightness of the ultrasound signal [22] or calculated the echo intensity from a standard histogram [23] to evaluate skeletal muscle ultrasound. Furthermore, Zaidman et al. [22] reported that both qualitative (Heckmatt rating scale) and quantitative (backscatter) ultrasound measures increased with decreasing strength and function in AMD patients.

To sum up, the main purpose of this study is to assess the ultrasound skeletal muscle images of AMD and non-AMD subjects and to establish an ultrasonography-based qualitative and quantitative evaluation approach for Pompe disease.

2 Materials and Methods

2.1 Subjects and Ultrasound Image Acquisition

The institutional review board at Taipei Veterans General Hospital (Taiwan) approved the study (IRB No. 2015-8-008B), and written informed consent was obtained from all subjects. From July 2014 to December 2015, a total of 37 subjects aged 1 to 19 years were enrolled; their age distribution is displayed in Fig. 1. The first group was the control group (Group I, n = 14), with 14 healthy subjects ranging from 1 to 15 years of age; the second group (Group II, n = 8) comprised 8 consecutive patients, aged 1 to 7 years, who were confirmed by genetic testing to have late-onset AMD but had no clinical symptoms (i.e. without ERT); the third group (Group III, n = 15) comprised 15 consecutive patients, aged 1 to 19 years, who were also confirmed to have AMD and were diagnosed with abnormal α-glucosidase levels, muscle weakness and other clinical symptoms. Among the latter, 2 cases were late-onset type and 13 cases were infantile type. All subjects in Group III were treated with ERT as soon as their clinical symptoms were diagnosed. All subjects were evaluated by the same expert pediatrician. Although the myocardium is the first affected organ, myocardial examination is relatively aggressive for young patients and is not usually performed. After the myocardium, involvement of the lower limb is the most obvious. The quadriceps is important for controlling lower-limb walking behavior and is easy to scan non-invasively, so the rectus femoris muscle was selected as the main object of this study. The rectus femoris muscle is one of the four quadriceps muscles [24] and is located in the middle of the front of the thigh, so ultrasound scans of the contralateral muscle can easily be obtained for comparison.
The rectus femoris muscle is superficial, which makes it easy for the radiologist to assess and to scan in clinical practice. Physiologically, it is responsible for knee extension and for hip and thigh flexion: it flexes the leg at the hip joint and extends it at the knee joint. It is also the main muscle that can flex the hip. Pompe disease is a metabolic myopathy with a wide spectrum of clinical presentation. Ultrasound has recently become the imaging tool of choice for assessing musculoskeletal diseases because of its accessibility, availability, lack of radiation exposure, and low cost [25,26,27,28,29,30,31]. B-mode ultrasound is particularly well suited to the evaluation of neuromuscular disorders because it can be performed at the bedside, can rapidly evaluate multiple body regions, and is well tolerated by both children and adults. Furthermore, some studies have shown that the intra- and inter-rater reliability of echo intensity measurements of the rectus femoris muscle is high [27,28,29,30]. Therefore, it is important to assess the characteristics of the rectus femoris muscle in relation to lower-limb walking behavior, especially for patients with musculoskeletal diseases.

Fig. 1

Age distribution of the 37 enrolled subjects

Transverse ultrasound images were obtained with a B-mode US imaging device (ACUSON S3000 Ultrasound System, HELX Evolution, Siemens Medical Solutions, Mountain View, CA, USA) with a 9L4 linear array transducer. The acquired image matrix size was 1024 × 768. The acquisition parameters of the B-mode US imaging device were as follows: mechanical power index of 1.0; frame rate of 53–46 frames per second; gain dynamic range of 65 dB; central frequency of 9.0 MHz; and focal zone at 3.5–4.5 cm depth. Because the ultrasound grayscale images are the main subject of study, the scan parameters were kept the same for each scan; only the scan depth setting could be adjusted to capture the appropriate size of the muscle. Ultrasound image quality is associated with the subject's muscle state (i.e. contraction/relaxation), so it is critical to ensure that all subjects are relaxed during image acquisition. The ultrasound examination was carried out with each subject relaxed in a supine position, with the arms and thighs relaxed and extended along both sides of the body. The transducer was positioned perpendicular to the longitudinal axis of the quadriceps femoris. One radiologist placed the probe gently and vertically on the proximal end of the thigh and scanned to the top of the knee to observe the entire rectus femoris muscle on both thighs of each subject. All ultrasound images were stored in tagged image file format (TIFF), and 74 images were evaluated in this study. Muscle strength, as expressed by the ultrasound echo intensity, changes with patient posture. A limitation of this study is therefore that the AMD patient must be relaxed in a quiet room at a comfortable temperature, and a short period of rest should precede the ultrasound examination.
The flowchart of the qualitative and quantitative analysis in this study is shown in Fig. 2. All images were evaluated qualitatively using the proposed four-grade grading (listed in Table 1, with reference to the Heckmatt criteria) and quantitatively using the echo intensity and the segmented area of the region of interest (ROI).

Fig. 2

The flowchart of the processes of establishing the ultrasonography-based qualitative and quantitative evaluation approach for Pompe disease

Table 1 Criteria for proposed qualitative grading in this study

2.2 Image Preprocessing

Anatomically, the rectus femoris muscle is the most superficial of the four quadriceps muscles and is situated in the middle of the front of the thigh. To avoid the influence of the high echo intensity of the epimysium (i.e. the sheath of connective tissue around a muscle) on the image analysis, the rectus femoris muscle was extracted manually from the original scanned images as the ROI using the polygon selection function of the Java-based image processing program ImageJ (National Institutes of Health, Bethesda, MD, USA). To conform to the shape of the muscle as closely as possible, each ROI had to contain at least 70 and up to 80 selected points. Because this study used retrospective ultrasound image data, some of the images had been adjusted to the optimum diagnostic quality with the time gain compensation (TGC) or the total gain. To avoid analyzing images acquired with different gains under unequal conditions, it was necessary to use the echogenicity reduction factor (ERF) [32] to convert the echo intensity from the different gains to the same reference value. We scanned the same location of an ultrasound tissue-equivalent phantom while varying the gain from − 20 to 20 dB and obtained a total of 41 grayscale images.

Aperture focusing is characterized by the f-number, which is defined as the ratio of the focal length to the size of the aperture along the row of crystal elements. Optimal focusing occurs with an f-number of 2. Many manufacturers employ combinations of time-delayed firing and a changing beam aperture to optimize focusing at different depths. The usual trade-off is a decrease in frame rate unless parallel processing is used. In this study, the focal depth in the phantom experiments was fixed at 2 cm (i.e. close to the depth of the rectus femoris muscle in our AMD patient ultrasound examinations), and the gain was adjusted from − 20 to 20 dB to acquire 41 images. We selected the lesion region in each image as the ROI [i.e. lesions in the breast phantom (CIRS Breast Elastography Phantom, phantom model 059, Zerdine® Hydrogel, Computerized Imaging Reference Systems, Norfolk, VA, USA)] and calculated the average intensity of each ROI, as demonstrated in Fig. 3. The aperture of the ultrasound phased array was about 3.65 cm. The f-number in the phantom experiments was about 1.825. We used the average ROI intensities of the 41 images to determine the curve fitting equation for the conversion intensity. To account for the effect of different gains on the B-mode echo intensity, we selected 0 dB as the reference gain and used the curve fitting equation to convert each patient image intensity (acquired at different depths and gains) into the equivalent intensity at 0 dB.

Fig. 3

Ultrasound images for different values of gain in the phantom experiments. The regions enclosed by the yellow lines were the ROIs used for determining the curve fitting equation. The focal depth in all phantom experiments was set at 2 cm. a − 20 dB, b − 15 dB, c − 10 dB, d − 5 dB, e 0 dB, f 5 dB, g 10 dB, h 15 dB, and i 20 dB. The average intensities of the ROIs for a–c, d–f, and g–i were 0.111, 4.672, 14.220; 32.283, 50.853, 68.195; and 93.466, 115.978, 135.236, respectively

We calibrated each subject’s ultrasound image with the phantom grayscale curve fitting equation, which describes the relationship between the gain and the average grayscale of the phantom ultrasound images. A total of 41 images were collected, and the average grayscale followed a nonlinear curve. The cubic curve fitting equation of the original nonlinear grayscale curve is

$${\text{I}}\left( {\text{G}} \right) = - 0.00155 \times {\text{G}}^{3} + 0.04427 \times {\text{G}}^{2} + 4.026 \times {\text{G}} + 49.89$$

where G is the gain of image (Fig. 4).

Fig. 4

The average intensity from the phantom experiments and the conversion intensity calculated by the curve fitting equation. The correlation coefficient between these two intensities was 0.9994

Here, we use 0 dB as the conversion reference value in this study, and the ERF for each gain value was \({\text{ERF}} = \frac{{{\text{I}}_{{0{\text{dB}}}} }}{{{\text{I}}_{\text{IN}} }}\). With this formula, the grayscale values under the different ultrasound scanned conditions (i.e. different gains) can be converted back to the equivalent intensity at 0 dB.
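The gain conversion described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that I_IN is the fitted intensity I(G) at the acquisition gain and that a measured mean gray value is converted by multiplying it by the ERF. The function names are hypothetical; the coefficients come from the cubic fit above and the phantom averages from the Fig. 3 caption.

```python
# Coefficients of the cubic gain-to-intensity fit reported in the text (G in dB).
A3, A2, A1, A0 = -0.00155, 0.04427, 4.026, 49.89


def fitted_intensity(gain_db: float) -> float:
    """Evaluate I(G) = A3*G^3 + A2*G^2 + A1*G + A0."""
    return A3 * gain_db**3 + A2 * gain_db**2 + A1 * gain_db + A0


def echogenicity_reduction_factor(gain_db: float) -> float:
    """ERF = I_0dB / I_IN, taking I_IN as the fitted intensity at gain_db (assumption)."""
    i_in = fitted_intensity(gain_db)
    if i_in <= 0:
        # The cubic fit dips slightly below zero near -20 dB, where the ratio is meaningless.
        raise ValueError("fitted intensity is non-positive; ERF is undefined at this gain")
    return fitted_intensity(0.0) / i_in


def to_reference_intensity(mean_gray: float, gain_db: float) -> float:
    """Convert a mean gray value measured at gain_db to its 0 dB equivalent (assumption)."""
    return mean_gray * echogenicity_reduction_factor(gain_db)


# Phantom ROI averages quoted in the Fig. 3 caption (gain in dB -> mean gray value).
PHANTOM_MEANS = {-20: 0.111, -15: 4.672, -10: 14.220, -5: 32.283, 0: 50.853,
                 5: 68.195, 10: 93.466, 15: 115.978, 20: 135.236}

# Residuals between the nine published phantom averages and the cubic fit.
residuals = {g: m - fitted_intensity(g) for g, m in PHANTOM_MEANS.items()}
max_abs_residual = max(abs(r) for r in residuals.values())
```

With this convention an image acquired at 0 dB is left unchanged (ERF = 1), while images acquired at higher gains are scaled down.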

We also analyzed the effects of changes in ultrasound systems and parameter settings, such as alterations in gain, calibration curves, and transducer frequency, and variations in ROI size at either fixed or varying depth. This was done by measuring the grayscale of multiple images across a spectrum of gains from − 20 to 20 dB.

2.3 Qualitative Analysis

With reference to the grading scale of Heckmatt et al. [11], which is based on changes in the muscle and bone echo of patients with neuromuscular disease, we compared the echo intensity of the rectus femoris muscle (i.e. echo intensity is proportional to the muscle strength) with that of its surrounding subcutaneous fat and proposed a qualitative grading method based on this image-based grayscale comparison (as shown in Table 1). The acquisition parameters were kept constant for every ultrasound scan. The qualitative grading was divided into four grades (Fig. 5). The brightness of each dot in the image corresponds to the echo strength. Grades 0 and 1 were considered normal, whereas grades 2 and 3 indicated abnormal muscle. All subject images were graded in random order with the grader blinded to the diagnosis.

Fig. 5

Qualitative ultrasound assessment using the proposed qualitative grading (Table 1). Abbreviations: subcutaneous fat (SF) and rectus femoris muscle (RFM). Four grades of ultrasound images are displayed. a Grade 0; b Grade 1; c Grade 2; and d Grade 3. Grade 0: echo intensity of the rectus femoris muscle lower than that of its subcutaneous fat; Grade 1: no distinct difference between the echo intensities of the rectus femoris muscle and its subcutaneous fat; Grade 2: echo intensity of the rectus femoris muscle slightly higher than that of the subcutaneous fat; Grade 3: echo intensity of the rectus femoris muscle significantly higher than that of the subcutaneous fat. Grades 0 and 1 were normal, and grades 2 and 3 were diagnosed as abnormal muscle

Figure 5 shows the four grades of ultrasound images. Grade 0 was defined as an echo intensity of the rectus femoris muscle lower than that of its surrounding fat; Grade 1 as no distinct difference between the echo intensities of the rectus femoris muscle and its surrounding fat; Grade 2 as an echo intensity of the rectus femoris muscle slightly higher than that of the subcutaneous fat; and Grade 3 as an echo intensity of the rectus femoris muscle significantly higher than that of the subcutaneous fat. Grades 0 and 1 were normal, and grades 2 and 3 were diagnosed as abnormal muscle. The qualitative grading of all ultrasound images was performed by the same radiologist.

2.4 Quantitative Analysis

The quantitative analysis focused on the area ratio of the image changes and the overall echo intensity values. It was done by measuring the quantity of abnormal echo signal reflected back to the transducer to predict local or overall pathological changes of the muscle.

2.4.1 Quantitative Image Area Ratio

First, the suspicious areas must be segmented from the images. In medical image processing, many automatic image segmentation algorithms are in common use, such as region-based segmentation, edge-based segmentation, and deformable models. Because this study needed to extract regions of anomalous intensity from the entire image, the approach was similar to region-based segmentation, as shown in Fig. 6.

Fig. 6

The flowchart of the proposed quantitative analysis in this study. a Polygon selection outlining the rectus femoris muscle, with the ROI selected in the axial plane; b the background (outside the region of the rectus femoris muscle, i.e. black) set to a grayscale value of zero; c initial seed point (white circle), i.e. the start of growing a region; d growing region (white square block), i.e. the growing process after a few iterations; e the region growing algorithm; f segmented region after region growing; g the high echogenicity region with grayscale ranging from 110 to 254 (red zone); and h the analysis region obtained by removing the region with grayscale ranging from 110 to 254 (red zone) for quantitative analysis

The transverse ultrasound image plane of the rectus femoris muscle was extracted manually from the original ultrasound images as the ROI (Fig. 6a). The region enclosing the rectus femoris muscle was selected manually using polygon selections with at least 90 points in the software program “ImageJ”. The grayscale value outside the region covering the rectus femoris muscle was then set to zero (Fig. 6b), so that the quantitative analysis focused on the echo intensity of the rectus femoris region. Every B-mode image underwent the same image processing (ROI selection). Region growing is a region-based image segmentation method. It is also classified as a pixel-based image segmentation method because it involves the selection of initial seed points. This approach examines the neighboring pixels of the initial seed points (Fig. 6c) and determines whether each neighbor should be added to the region (Fig. 6d). Furthermore, region-based image segmentation creates regions by grouping together similar pixels, with higher accuracy than pixel-based segmentation algorithms. In this study, we used a 4-connected neighborhood to grow from the seed points (the algorithm is displayed in Fig. 6e). The growing criterion was equality of pixel value: adjacent pixels with the same intensity value as the seed points were added to the region. The process was iterated, in the same manner as general data algorithms, until there was no change between two successive iterations.
After region growing (Fig. 6f), the ROI still included the high echo intensity reflected from the perimysium (i.e., a sheath of connective tissue), which caused the segmented area of the ROI to be overestimated. To reduce this influence, we used image thresholding (i.e., image segmentation processing) to remove the overestimated area of high echo intensity from the perimysium (Fig. 6g, h).
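The region growing step described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the code used in the study: the image is a plain list of rows of gray values, a single seed is used, and a neighbor joins the region only when its value equals the seed value, as stated in the text. The function name and the toy image are hypothetical.

```python
from collections import deque


def region_grow(image, seed):
    """Grow a region from `seed` over 4-connected neighbors that share the seed's gray value.

    image: list of rows, each a list of integer gray values.
    seed:  (row, col) tuple of the starting pixel.
    Returns the set of (row, col) coordinates in the grown region.
    """
    rows, cols = len(image), len(image[0])
    seed_row, seed_col = seed
    target = image[seed_row][seed_col]
    region = {(seed_row, seed_col)}
    frontier = deque([(seed_row, seed_col)])
    # Iterate until no new pixel is added, i.e. the region stops changing.
    while frontier:
        r, c = frontier.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):  # 4-connected neighborhood
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region and image[nr][nc] == target:
                region.add((nr, nc))
                frontier.append((nr, nc))
    return region


# Toy 5 x 5 image: zero background, muscle-like value 40, two bright pixels (120).
toy_image = [
    [0, 0, 0, 0, 0],
    [0, 40, 40, 120, 0],
    [0, 40, 40, 40, 0],
    [0, 120, 40, 40, 0],
    [0, 0, 0, 0, 0],
]
grown = region_grow(toy_image, seed=(1, 1))
```

In the toy image the region contains the seven connected pixels with value 40 and stops at the background and at the two bright pixels. A tolerance-based criterion would be a natural variation, but the text specifies equality of pixel values.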

To exclude this influence from the area of abnormal intensity, we removed the high-echogenicity area of the perimysium. As a rule of thumb, the perimysium was removed with a threshold setting of 110. To avoid including the background, which was 255, only grayscale values up to 254 were computed. The true abnormal region was the area after region growing minus the area whose grayscale was between 110 and 254 (Fig. 6h). The proportion of abnormal area extracted from the ROI was regarded as the degree of variation: the larger the value, the greater the abnormal area; the smaller the value, the smaller the abnormal echo area.
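As a rough sketch of the area ratio described above, the following Python function discards grown pixels whose gray value lies between 110 and 254 and expresses the remainder as a percentage of the ROI. The data layout (a dictionary of ROI pixels and a set of grown coordinates), the percentage scaling, and the function name are assumptions made for illustration; the text does not give the exact formula.

```python
def abnormal_area_ratio(roi, grown, low=110, high=254):
    """Percentage of ROI pixels kept after removing high-echogenicity grown pixels.

    roi:   dict mapping (row, col) -> gray value for every pixel inside the manual ROI.
    grown: set of (row, col) coordinates produced by region growing (subset of roi).
    low, high: inclusive gray range treated as perimysium (110 to 254 in the text).
    """
    if not roi:
        raise ValueError("ROI is empty")
    kept = [p for p in grown if not (low <= roi[p] <= high)]
    return 100.0 * len(kept) / len(roi)


# Hypothetical one-row ROI of 10 pixels: four at 60, two bright at 150, four at 20.
example_roi = {(0, c): v for c, v in enumerate([60, 60, 60, 60, 150, 150, 20, 20, 20, 20])}
example_grown = {(0, c) for c in range(6)}  # the grown region covers the first six pixels
ratio = abnormal_area_ratio(example_roi, example_grown)  # 4 kept pixels out of 10
```

Here two of the six grown pixels fall in the 110 to 254 range and are removed, leaving 4 of 10 ROI pixels.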

2.4.2 Quantitative Intensity of ROI

Because the characteristics within the segmented area were about the same, the echo intensities within that region were similar. However, the pathological changes of muscle in AMD patients are often diffuse or patchy and non-uniform. Therefore, this study also expresses the degree of variation through the overall echo intensity of the ROI: the larger the value, the greater the internal echo changes; the smaller the value, the smaller the internal echo changes.

3 Results

Seventy-four ultrasound images of the bilateral rectus femoris muscles from 37 subjects (including the control group of 14 healthy subjects) were evaluated. We used a general linear model to describe the relation between patient age and the image gray level, scan depth, and focal depth variables and to decide whether each relationship was statistically significant. For the total image set (37 subjects with 74 images, one image of each thigh per subject), regression analysis revealed that age had a statistically significant effect on the gray level (P = 0.001) and the scan depth (both P < 0.01), but not on the focal depth (P = 0.08), as displayed in Table 2. Both gray level and scan depth increased with age. To test the differences in gray level and scan depth between groups (control and Pompe disease), a general linear model was used with age and group as the independent variables. After removing the age effect, both gray level and scan depth differed significantly between groups in the total image analysis. The gray level in the Pompe group was significantly larger than that in the control group (P < 0.001), while the scan depth in the Pompe group was significantly smaller than that in the control group (P < 0.001). When the left and right thighs were analyzed separately, the differences remained significant. The detailed information is displayed in Table 3.

Table 2 Statistical information with image gray level, scan depth, and focal depth variables
Table 3 Statistical information with image gray level, scan depth, and focal depth variables

The qualitative results are shown in Table 4: Group I (n = 14) had the highest distribution in Grade 1 and no images in Grade 2 or Grade 3; Group II (n = 8) was mostly in Grade 1; and Group III (n = 15) had the highest distribution in Grade 2 and the lowest in Grade 0.

Table 4 Qualitative grading of 37 subjects with 74 ultrasound images of the rectus femoris muscles

For the segmented area, Group III (AMD patients undergoing ERT, n = 15) had the largest and most widely distributed values (43.37 ± 27.92, ranging from 5.48 to 91.60); Group II (AMD patients without clinical symptoms, n = 8) had 19.04 ± 16.90, ranging from 3.76 to 57.54; and Group I (control group, n = 14) had the smallest (13.42 ± 12.45), ranging from 3.81 to 64.46.

We used nonparametric tests for two independent samples to evaluate the statistical differences among the three groups. There were significant differences between Group II and Group III and between Group III and Group I (i.e. the control group) (p value < 0.05 for each). There was no statistical difference between Group II and Group I (p value = 0.526).

Among the 74 images [Group I (n = 14); Group II (n = 8); Group III (n = 15)], Group III (i.e. AMD patients undergoing ERT) had the strongest and widest-ranging echo intensity of the segmented ROI, 49.68 ± 16.23 (range 16.44 to 85.13); Group II had 41.33 ± 12.69 (range 21.56 to 61.35); and the control group had 37.17 ± 10.58 (range 22.12 to 63.35).

We also analyzed the statistical differences in ROI intensity among the three groups with nonparametric tests for two independent samples. The ROI intensities of Group III and Group I were statistically different (p value < 0.05). There was no significant difference between Group II and Group III (p value = 0.08) or between Group II and Group I (p value = 0.38).

Furthermore, there was a weak correlation between the intensity of ROI and creatine kinase levels (r = 0.33, p value = 0.025); a moderate correlation between the intensity of ROI and the cumulative months of enzyme treatment (r = 0.398, p value = 0.029); a strong correlation between segmented area and the intensity of ROI of all the images (r = 0.878, p value < 0.0001) (Fig. 7); but no correlation between size and intensity of the segmented area (p value > 0.05).

Fig. 7

Correlation between segmented area and the echo intensity (gray value) of ROI of all the images

To predict the condition of the disease from the size and intensity of the segmented area, we used logistic regression analysis. The regression equation was Y = 0.067*X1 + 0.206*X2 − 16.685, where X1 is the proportion of the segmented area, X2 is the intensity of the segmented area, and Y indicates the condition of the disease. The Chi square value of the model was 36.536 (p value < 0.05). The odds ratios of X1 and X2 were 1.070 (95%CI 1.026–1.115) and 1.229 (95%CI 1.080–1.400), respectively.
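The link between the regression coefficients and the reported odds ratios can be checked with a short Python sketch. It assumes that Y is the log-odds (logit) of disease, which is the usual reading of a logistic regression equation but is not stated explicitly in the text; the function names are hypothetical. An odds ratio is the exponential of its coefficient, so exp(0.067) and exp(0.206) should reproduce 1.070 and 1.229 up to rounding of the published coefficients.

```python
import math

# Coefficients reported in the text.
B_AREA = 0.067        # per unit of segmented-area proportion (X1)
B_INTENSITY = 0.206   # per unit of segmented-area intensity (X2)
INTERCEPT = -16.685


def linear_predictor(area_proportion: float, intensity: float) -> float:
    """Y = 0.067*X1 + 0.206*X2 - 16.685, interpreted here as log-odds (assumption)."""
    return B_AREA * area_proportion + B_INTENSITY * intensity + INTERCEPT


def predicted_probability(area_proportion: float, intensity: float) -> float:
    """Map the linear predictor to a probability with the logistic function."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(area_proportion, intensity)))


def odds_ratio(coefficient: float) -> float:
    """Odds ratio for a one-unit increase in a predictor."""
    return math.exp(coefficient)


or_area = odds_ratio(B_AREA)            # close to the reported 1.070
or_intensity = odds_ratio(B_INTENSITY)  # close to the reported 1.229
```

Both values agree with the published odds ratios to within 0.001; the small discrepancy for X1 is consistent with the coefficient having been rounded to three decimals.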

ROC curves were constructed for these methods and are shown in Fig. 8. Youden’s index was used to determine the best cut-off value of each quantitative measure for differentiating Group III (AMD patients undergoing ERT) from Group I (control group). The cut-off value for the intensity of ROI was 48.6; with an echo intensity equal to or greater than 48.6 classified as abnormal, the sensitivity was 53.3%, the specificity was 89.3%, the positive predictive value was 84.2%, and the negative predictive value was 64.1%. The cut-off value for the segmented area was 16.9; with a proportion of segmented area equal to or greater than 16.9 classified as abnormal, the sensitivity was 80.0%, the specificity was 82.1%, the positive predictive value was 82.8%, and the negative predictive value was 79.3%. The sensitivity of the qualitative analysis was 63.3%; the specificity was 100%; the positive predictive value was 100%; and the negative predictive value was 71.8%.
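The diagnostic measures above follow from a 2 × 2 table, and the cut-off search with Youden's index (J = sensitivity + specificity − 1) is a simple scan over candidate thresholds. The Python sketch below illustrates both. The image counts are not given in the text: they are back-calculated here on the assumption of 30 Group III images and 28 Group I images (two per subject), and they happen to reproduce the reported percentages. Function names and the toy data are hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, PPV, NPV and Youden's J from 2 x 2 counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "youden_j": sensitivity + specificity - 1.0,
    }


def best_cutoff(values, labels):
    """Return (cutoff, J) maximizing Youden's J, scoring value >= cutoff as positive.

    labels are 1 for diseased and 0 for control; both classes must be present.
    """
    best = None
    for cut in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if v >= cut and y == 1)
        fn = sum(1 for v, y in zip(values, labels) if v < cut and y == 1)
        tn = sum(1 for v, y in zip(values, labels) if v < cut and y == 0)
        fp = sum(1 for v, y in zip(values, labels) if v >= cut and y == 0)
        j = tp / (tp + fn) + tn / (tn + fp) - 1.0
        if best is None or j > best[1]:
            best = (cut, j)
    return best


# Counts inferred (not reported) for the two quantitative cut-offs.
intensity_metrics = diagnostic_metrics(tp=16, fn=14, tn=25, fp=3)  # intensity >= 48.6
area_metrics = diagnostic_metrics(tp=24, fn=6, tn=23, fp=5)        # area >= 16.9

# Toy, perfectly separable data to show the cut-off scan.
toy_cut, toy_j = best_cutoff([10, 20, 30, 40, 50, 60], [0, 0, 0, 1, 1, 1])
```

With these inferred counts, the intensity cut-off gives 53.3%, 89.3%, 84.2% and 64.1%, and the area cut-off gives 80.0%, 82.1%, 82.8% and 79.3%, matching the values in the text.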

Fig. 8

ROC curves of the quantitative and qualitative evaluations. a The area under the curve was 0.738 for the intensity of ROI (green line) and 0.854 for the proportional segmented area (blue line). b The area under the curve for the qualitative evaluation was 0.817

4 Discussion

4.1 Qualitative Analysis

In 1982, Heckmatt et al. [11] developed a qualitative grading scale based on changes in the muscle and bone echo of patients with neuromuscular disease. In this study, we proposed an ultrasonography-based qualitative approach [i.e. using the intensity difference between the rectus muscle and the subcutaneous fat (Table 1)] to monitor the severity of muscle destruction in AMD patients. The subjects with and without AMD in this qualitative study ranged in age from 1 to 19 years; only two AMD patients were older than 15 years, and the others were younger than 10 years. Thus, the approach appears practical for early detection of AMD and subsequent personalized therapy, not only for patients with the infantile type but also for those with the late-onset type. In most cases, the course of the disease had not yet affected the echo of the femoral bone. Therefore, it was necessary to redefine the assessment criteria in this study by comparing the echo of the muscle with that of the surrounding fat. After the expert’s classification, the sensitivity of the qualitative analysis was 63.3%: 63% of the images of Group III (AMD patients undergoing ERT) and 31% of the images of Group II (AMD patients without ERT, i.e. AMD without clinical symptoms) were identified as abnormal, whereas all images of Group I (the control group) were classified as normal. Among the abnormal images, a slight rise in grayscale was classified as Grade 2, of which AMD patients undergoing ERT accounted for 73% and AMD patients without clinical symptoms for 27%; a severe rise in grayscale was classified as Grade 3, of which AMD patients undergoing ERT accounted for 89% and AMD patients without clinical symptoms for 11%.

Inter- and intra-rater reliability of the ultrasound image grading was determined with Cohen’s kappa (κ). In this study, the κ values for inter-rater reliability (sometimes called interobserver agreement) and intra-rater reliability in the qualitative analysis were 0.48 and 0.74, respectively. The inter-rater agreement for qualitative grouping (i.e. categorical items) is therefore moderate, and the intra-rater agreement is substantial. The agreement of the qualitative evaluation deteriorated when the images were analyzed by an inexperienced observer (kappa 0.43). Overall, the results indicated that the grayscale differences between muscle and surrounding fat can be used to monitor the changes of the muscle in AMD patients.
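To make the agreement statistic concrete, the following is a minimal sketch of how Cohen's kappa can be computed for two sets of categorical grades. It is not the analysis code used in this study. The function names and the example grades are hypothetical, and the verbal bands follow the Landis and Koch convention, which is assumed here because it matches the labels "moderate" and "substantial" used above.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length sequences of categorical grades."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of images that received the same grade.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: sum over grades of the product of marginal frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    p_chance = sum(freq_a[g] * freq_b[g] for g in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used one identical grade throughout
    return (p_observed - p_chance) / (1.0 - p_chance)


def landis_koch_label(kappa):
    """Verbal band for kappa (Landis and Koch convention, assumed here)."""
    if kappa < 0.0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"),
             (0.60, "moderate"), (0.80, "substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical grades (1 to 4) for eight images; not data from this study.
rater_1 = [1, 1, 2, 2, 3, 3, 4, 1]
rater_2 = [1, 2, 2, 2, 3, 4, 4, 1]
kappa = cohen_kappa(rater_1, rater_2)
print(round(kappa, 2), landis_koch_label(kappa))
```

Under this convention, the reported values of 0.48 (inter-rater) and 0.43 (inexperienced observer) both fall in the moderate band, and 0.74 (intra-rater) falls in the substantial band.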

In our previous study [33], we retrospectively evaluated 43 Pompe disease patients with muscle ultrasound to establish a severity grading system and compared it with clinical evaluations [i.e. serum levels of alanine transaminase (ALT), aspartate transaminase (AST), creatine kinase (CK), and lactate dehydrogenase (LDH)]. Thirty-five of these patients initially had low acid alpha-glucosidase (GAA), and their muscle ultrasound features were compared with GAA gene mutation, activity/performance, muscle biopsy pathology, and laboratory findings. We found that the proposed muscle ultrasound grading of Pompe disease patients correlated well with serum levels of muscle-injury biochemical markers. In the current study, all ultrasound images were evaluated visually with reference to the Heckmatt criteria for qualitative analysis and evaluated quantitatively with computer-assisted echo intensity analysis of the rectus femoris muscle. In other words, we not only compared the ultrasound image brightness of the rectus femoris muscle with that of its surrounding fat to assign one of four grades following the Heckmatt grading concept, but also used the echo intensity distribution of the segmented region encircling the rectus femoris muscle to characterize the diagnostic phase of AMD patients with and without enzyme replacement therapy (ERT).

Neuromuscular diseases are known to manifest on ultrasound as increased echo intensity of the muscle tissue, and the alterations in the muscles on the images are often inhomogeneous or speckled [34]. It is therefore frequently difficult to make fair judgments with a qualitative method. Classification norms exist, but each rater brings a different degree of accumulated experience and subjective judgment, so the ratings cannot be unified [35,36,37]. To overcome this weakness, it was reasonable for us to use computer-assisted assessment methods to establish a quantitative image evaluation approach in this study.

4.2 Quantitative Analysis

The small number of cases investigated in this study comprised 14 healthy subjects (aged 1 to 15 years), 8 consecutive patients (aged 1 to 7 years) confirmed by genetic testing to have late-onset AMD but without clinical symptoms, and 15 consecutive patients (aged 1 to 19 years) diagnosed with AMD and treated with ERT, designated group I, group II, and group III, respectively. When age was included as a variable, it had a statistically significant effect on the gray level for the total images (both sides). The p values for gray level were 0.001 with age and 0.0001 without age (Tables 2 and 3). The gray levels for both sides, the left side, and the right side were statistically significant whether or not age was considered. The distribution of gray level with age for the left and right sides is shown in Fig. 9. Age had a statistically significant effect on the gray level for the left-side images but not for the right-side images. The scatter plot shows that the gray level tends to rise with age on each side. In some situations, the p values differed among both sides, the left side, and the right side.

These discrepancies may stem from the following factors. First, the small number of cases (n = 37, in three groups) would substantially affect the statistical results with and without the age variable. Second, the age ranges of the three groups differ: group I (1 to 15 years), group II (1 to 7 years), and group III (1 to 19 years). Only two subjects were older than 15 years (one aged 16 and the other 19), and both had higher gray levels on the left side than on the right side, which may explain why the age effect was statistically significant on the left side but not on the right. This influence can also be observed in Fig. 9 (see also Tables 2 and 3). Third, with a fixed scan depth the scan plane could not always cover the entire rectus femoris muscle on either side. To cover the rectus femoris muscle completely, we adjusted the scan depth when acquiring the ultrasound image, so the scan depths could differ between the two sides of the same subject. This difference in scan depth may affect the statistical results for either side. In the vertical direction, the original ultrasound image in this study contains skin, subcutaneous fat, the rectus femoris muscle, the vastus intermedius muscle, the femur, and the acoustic shadow of the bone. Muscles generally become larger with age, and in theory the larger the rectus femoris muscle, the deeper the scan depth required. However, in some cases the scan plane could not be limited to the rectus femoris muscle alone, because the distance between the rectus femoris muscle and the muscle posterior to it was not kept constant during the ultrasound examinations. In this study, the scan depth ranged from 3.5 to 6 cm, as shown in Fig. 10, and the focal depth varied from 1 to 3 cm, as displayed in Fig. 11.

Fig. 9 Distribution of gray level against age in all participants (n = 37) for the left and right sides

Fig. 10 Distribution of scan depth against age in all participants (n = 37) for the left and right sides

Fig. 11 Distribution of focal depth against age in all participants (n = 37) for the left and right sides

This study found that the area of muscle change in the ultrasound images differed among AMD patients, reflecting heterogeneous alterations very similar to those reported by Maurits et al. when quantifying the heterogeneity of Duchenne muscular dystrophy [38]. A significant difference in the area of muscle alteration was found between AMD patients undergoing ERT and AMD patients without clinical symptoms, as well as between AMD patients undergoing ERT and the control group. This may indicate that comparing the size of the muscle alterations across the population can be used to assess the status of AMD and thereby to identify affected patients early.

This ultrasonography-based quantitative evaluation approach confirmed that the overall echo intensity of ultrasound images is significantly increased in AMD patients. This is similar to the qualitative findings on ultrasonography-based muscle alterations in other studies of neuromuscular diseases [39, 40]. Importantly, the quantitative segmented area differed significantly between Group I (healthy subjects) and Group III (AMD undergoing ERT), whereas there was no difference between Group I (healthy subjects) and Group II (AMD without clinical symptoms). Furthermore, the ultrasound images of AMD patients were affected by factors such as muscle inflammation, fascia inflammation, non-fracture muscle damage, and muscle tension, which introduce noise. We found that the intensity of Group II (AMD without clinical symptoms, i.e. without ERT) increased significantly, which may result from the loss of GAA activity. GAA deficiency results in neuromuscular and strength disorders, and the absence of GAA is the major cause of the profound muscle damage observed in patients with Pompe disease. Thus, the muscle damage in Group II (the AMD patients without clinical symptoms) may be insignificant owing to the limited GAA activity, which would explain why there was no statistically significant difference between Group I (healthy subjects) and Group II (AMD without clinical symptoms).

The distribution of GAA deficiency affects the distribution of the muscle damage [1, 22, 41]. Echo intensity on ultrasonography images of skeletal muscle is generally believed to reflect the condition of the muscle. With the ultrasonography-based quantitative intensity approach, we noted that the ultrasound echo intensity of AMD patients was associated with the distribution of the muscle damage. Thus, higher echo intensity may indicate more severe muscle damage.

We were interested in whether a standard level exists for healthy people under the ultrasound-based quantitative approach. In 2004, Maurits et al. [38] reported that healthy children had fixed values for the average pixel value (which they defined from image density) and for uniformity based on the images. This conclusion is consistent with the results calculated in the current study.

The higher the creatine kinase level, the more creatine kinase has been released into the blood, which represents more severe muscle damage [16, 17]. AMD patients without clinical symptoms had creatine kinase levels within the normal range, which can be explained by the fact that in Taiwan, besides behavioral assessment with the Peabody Developmental Motor Scales, the creatine kinase level is still used as a criterion for determining whether onset has occurred [6, 7]. Accordingly, we were interested in whether the segmented area or the intensity of ROI from this ultrasonography-based quantitative evaluation approach could serve as an index of muscle damage in AMD patients. The correlation analysis showed that creatine kinase values were moderately correlated with the size of the segmented area but only weakly correlated with the intensity of ROI, suggesting that the extent of glycogen deposition was more relevant to the depth of the effect on muscle. The quantitative size of the segmented area may therefore predict muscle damage better than the quantitative intensity of ROI.

There was also a high level of correlation between the proportion of segmented area and the intensity of ROI (r = 0.878). At the cut-off selected with Youden’s index, the sensitivity of the quantitative intensity of ROI was 53.3% and the specificity was 89.3%, whereas the sensitivity of the segmented area was 80.0% and the specificity was 82.1% (Fig. 7). A reasonable explanation is that as the proportion of the area with anomalous echo increases, the echo intensity of the ROI also increases. The data also showed collinearity between the segmented area and the intensity of ROI: the results obtained with the segmented area alone did not differ significantly from those obtained with the combination of segmented area and ROI intensity.
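The cut-off selection with Youden's index can be sketched as follows. This is an illustrative sketch rather than the procedure actually run in this study: the function names are hypothetical, the rule "positive when the score is at or above the cut-off" is an assumption, and the example scores are synthetic. Youden's index is J = sensitivity + specificity − 1, and the cut-off with the largest J is chosen.

```python
def youden_j(sensitivity, specificity):
    """Youden's index J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


def youden_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity, J) with the largest J.

    labels: 1 for diseased, 0 for healthy. A case is called positive when
    its score is >= cutoff (an assumption made for this sketch).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    best = None
    for cutoff in sorted(set(scores)):
        sens = sum(s >= cutoff for s in positives) / len(positives)
        spec = sum(s < cutoff for s in negatives) / len(negatives)
        j = youden_j(sens, spec)
        if best is None or j > best[3]:
            best = (cutoff, sens, spec, j)
    return best


# J implied by the sensitivity/specificity pairs reported in the text.
j_intensity = youden_j(0.533, 0.893)  # intensity of ROI
j_area = youden_j(0.800, 0.821)       # segmented area
print(round(j_intensity, 3), round(j_area, 3))

# Synthetic scores for illustration only; not data from this study.
demo_scores = [10, 12, 15, 20, 22, 25]
demo_labels = [0, 0, 0, 1, 1, 1]
print(youden_cutoff(demo_scores, demo_labels))
```

With the reported values, J is about 0.426 for the intensity of ROI and about 0.621 for the segmented area, which agrees with the segmented area being the better discriminator.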

Pillen et al. [23] reported a sensitivity of 87% by calculating the muscle echo intensity using a standard histogram. In this study, the sensitivity was 53.3% for image echo intensity but 80.0% for the segmented area. Muscle intensity may be increased focally or diffusely, so the proportion of the changed area can be used as a measure of the degree of echo change. Its sensitivity (80.0%) is higher than that of the overall image echo intensity (53.3%), and it is much easier to calculate because no standard normal value is required.

ERT is currently the major therapeutic strategy for AMD patients. This study found moderate correlations of the months of treatment with the proportion of segmented area (r = 0.489) and with the intensity of ROI (r = 0.398), as shown in Fig. 7. This indicates that the disease was still progressing and that ERT may only reduce cell disruption and slow the progression.

There was no correlation between the proportion of segmented area and the intensity of segmented area. Because the segmented areas were the outcome of a series of image segmentation steps that retain similar grayscale properties, the echo intensities of the segmented areas were necessarily similar to one another, whereas their sizes differed. We used logistic regression analysis to determine the weights of the proportion and the intensity of the segmented area for the probability of incidence. The odds ratios for the proportion of the segmented area and for the intensity of the segmented area were 1.070 and 1.229, respectively. The probability was calculated using the formula \(P = \frac{1}{1 + e^{-\left(0.067 \times \text{area(reg.)} + 0.206 \times \text{Intensity(reg.)} - 16.685\right)}}\), and this result is only a tentative estimate based on the limited samples in this pilot study. We strongly believe that this formula could be calibrated and its accuracy improved with an enlarged sample size in the near future.
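The relationship between the fitted coefficients, the odds ratios, and the predicted probability can be sketched as follows. The coefficients (0.067, 0.206, −16.685) are taken from the formula above; the function names and the input values are hypothetical and serve only as illustration. In a logistic model, the odds ratio for a one-unit increase in a predictor is the exponential of its coefficient.

```python
import math

# Coefficients copied from the formula in the text.
COEF_AREA = 0.067        # per unit of segmented-area proportion
COEF_INTENSITY = 0.206   # per unit of segmented-area intensity
INTERCEPT = -16.685


def amd_probability(area, intensity):
    """Logistic probability for given area proportion and intensity."""
    z = COEF_AREA * area + COEF_INTENSITY * intensity + INTERCEPT
    return 1.0 / (1.0 + math.exp(-z))


def odds_ratio(coefficient):
    """Odds ratio for a one-unit increase in the predictor."""
    return math.exp(coefficient)


# exp(0.067) and exp(0.206) reproduce the reported odds ratios (1.070 and
# 1.229) up to the rounding of the published coefficients.
print(round(odds_ratio(COEF_AREA), 3), round(odds_ratio(COEF_INTENSITY), 3))

# Hypothetical inputs for illustration only; not measurements from this study.
print(amd_probability(area=20.0, intensity=75.0))
```

Because both coefficients are positive, the predicted probability rises monotonically with either the proportion or the intensity of the segmented area.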

The ultrasonography-based quantitative approach established here, in addition to the typical creatine kinase measurement, seems to contribute significantly to early detection and treatment of AMD. AMD patients without clinical symptoms had normal creatine kinase levels, whereas the qualitative results showed 31% to be abnormal; the quantitative results showed 25.0% to be abnormal with the intensity of ROI at a cut-off of 48.60 and 43.8% to be abnormal with the segmented area at a cut-off of 16.90 (Fig. 7). Whether qualitative or quantitative, the ultrasonography-based evidence indicated that AMD patients without clinical symptoms have a rising echo, which means that the change in muscle may occur earlier, or be detected more sensitively, than the accumulation of creatine kinase in the blood; the cumulative amount was simply not yet enough to produce abnormal blood values. For these patients, the most critical issue is when to start treatment. Changes on ultrasound images may appear earlier than changes in blood values or clinical behavior and can therefore help determine whether treatment is needed. For AMD patients undergoing ERT, the muscle destruction caused by glycogen accumulation is irreversible, and enzyme treatment must be continued throughout life. Our results suggest that alterations in the ultrasound image can be used to monitor muscle changes in these patients and that ultrasonography can serve, to a certain degree, as an evaluation approach for AMD development and severity.

Last but not least, the use of the quantitative segmented area to identify changes in the ultrasound image seemed to offer the greatest benefit for early detection and prediction of AMD. Previous studies of patients with neuromuscular diseases have shown that quantitative echo intensity can be more sensitive than qualitative analysis [42, 43], and a similar result was found in the current study. Furthermore, the quantitative analysis modules in this ultrasonography-based evaluation approach were divided into the intensity of ROI and the segmented area. The areas under the curve for the intensity of ROI, the segmented area, and the overall qualitative analysis were 0.738, 0.854, and 0.817, respectively. All of them had good diagnostic value, and the quantitative segmented area showed the best discrimination.
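For readers less familiar with the area under the ROC curve, the following minimal sketch shows one common way to compute it: as the fraction of diseased/healthy pairs in which the diseased case has the higher score, with ties counted as one half. This is an assumed illustration, not the software used in this study; the function name and the example scores are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for diseased, 0 for healthy. Higher scores are assumed to
    indicate disease.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # diseased case ranked above healthy case
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Synthetic scores for illustration only; not data from this study.
print(roc_auc([1, 2, 3, 4], [0, 1, 0, 1]))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, so the reported values (0.738, 0.854, and 0.817) can be compared directly on that scale.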

5 Conclusions

The pilot study confirmed the use of B-mode ultrasonography-based imaging technology to assess muscle alterations and extended it with quantitative modules. In AMD, alterations on muscle ultrasonography occur much earlier than abnormal creatine kinase levels in the blood. The use of the quantitative segmented area to predict the occurrence of AMD was more accurate than the use of quantitative intensity, and both of them work better than the qualitative module. It is recommended that ultrasonography be included not only in follow-up but also in the early detection and prediction stages for AMD candidates, as early as possible in childhood. The ultrasonography-based quantitative approach can offer new perspectives for the management of AMD patients.