Introduction

Auscultation is an essential clinical examination and has been regarded as a highly cost-effective screening method for detecting abnormal clinical signs since the 1800s [1]. In the 2020s, auscultation still plays a pivotal role in the diagnosis of cardiopulmonary diseases and has a significant impact on quality of life and health care costs [2]. Additionally, recent studies have reported that auscultation is a potential diagnostic tool for COVID-19 patients and may be applicable as a follow-up tool for noncritical COVID-19 patients [3, 4]. Although auscultation is a non-invasive, rapid, and cost-effective screening tool, its result is highly subjective, mostly because the ability to interpret the acquired sound correctly depends on the physician’s experience and knowledge, which can lead to inaccurate diagnosis and mistreatment. To address this limitation of conventional auscultation, the capability of stethoscopes has been significantly improved: digital stethoscopes can record sounds and share the recordings via wireless communication such as Bluetooth or Wi-Fi [5, 6]. Because the sound is stored as digital data, it can be analyzed with computer-assisted technologies, which helps to reduce inter-listener variability and subjectivity. For instance, several papers have reported artificial intelligence (AI)-assisted auscultation that classifies sound patterns and identifies abnormalities [7, 8].

Another issue with conventional auscultation is that the examination must be performed face-to-face, because the physician needs to place the stethoscope on the patient’s body. Many patients who live in nursing facilities or at home because of chronic illness or reduced mobility do not have health care providers nearby. There is also a demand for telemedicine because the regional maldistribution of medical facilities reduces access for patients living in rural areas [9]. Meanwhile, advances in battery technology have led to the development of wireless stethoscopes with low-power embedded processors and sensors that allow physicians to examine patients from a distance [6]. As a result, auscultation has become an applicable screening tool even in remote and home care medicine.

To perform auscultation in remote care, the patients themselves or non-medical people such as the patient’s parents, rather than physicians, need to place the stethoscope on the body surface at appropriate positions where sound of diagnosable quality can be obtained. In cardiac examinations in particular, the sounds of the four valves must be obtained, and the stethoscope must be placed precisely over each valve [10]. In remote care, the difficulty of communication between physician and patient can make it hard to give the patient precise instructions for placing the stethoscope. Although there is a standardized manual for health care providers to identify the listening areas, it is challenging for non-medical people to follow the manual because body shape differs between individuals. We therefore assume that there is a demand for auscultation navigation that takes individual differences in body shape into account.

To compensate for individual differences in body shape when localizing the auscultation area, we assume that a surface registration between the patient’s body and a reference model can be utilized. In this paper, surface registration means computing the correspondence between two surfaces. If both surfaces are registered correctly, the auscultation areas specified on the reference model can be projected onto the patient’s body. One of the best-known registration methods is the Iterative Closest Point (ICP) algorithm, which iteratively estimates a single global transformation that fits an input point cloud to the target surface, so that the point cloud is treated as rigid [11, 12]. However, since the body surface differs between individuals, rigid registration with ICP cannot find accurate correspondences between two body surfaces, and the input point cloud may need to be deformed to fit the target surface. As an extension of ICP, non-rigid ICP, which fits point clouds non-rigidly by using feature points and deformation constraints, was proposed [13, 14]. Non-rigid registration has been widely studied over the past decade and can be applied to various dynamic shape reconstruction problems such as human motion capture, which is used in VR, AR, and entertainment applications. Some studies have applied non-rigid ICP in biomedical engineering, for example to track the respiratory motion of the abdominal surface [15, 16]. However, to our knowledge, no study has applied non-rigid ICP to localize a specific region on the body surface, such as the auscultation area, while considering individual differences in body shape.

In this paper, we propose a non-rigid ICP-based registration method for localizing the auscultation area that considers individual differences in body surface. The proposed system provides the listening positions on the patient’s body by registering the body surface of the patient to a reference model on which the auscultation areas are specified. Our hypothesis is that, when several types of reference model are prepared, selecting the reference model closest to the patient’s body increases the accuracy of the localization. If the shape deviation between the patient and the reference model is large, the accuracy of the registration with non-rigid ICP may decrease.

The contribution of this paper is to investigate the feasibility of non-rigid ICP for body surface registration under varying body shapes and to establish a registration method for localizing the auscultation area that considers individual differences in body shape, evaluated through simulation and a human trial. We believe that this study is the first to introduce localization of the auscultation area that considers individual differences in body shape.

This paper is organized as follows. In the Methods section, we introduce an overview of the proposed registration system and describe the algorithms adopted in the system. The Simulation and Experiment sections present the simulation and experimental results, and the Discussion and conclusion section discusses the findings and concludes the paper.

Methods

System overview

The proposed system aims to estimate the positions of the four valves that are required for auscultation of the heart: the aortic, pulmonary, tricuspid, and mitral valves. The tricuspid, mitral, pulmonary, and aortic valves are located, respectively, on the left side of the lower sternum near the fifth intercostal space, at the apex in the left fifth intercostal space at the midclavicular line (about 10 cm from the midline), at the inner edge of the left second intercostal space, and at the right second intercostal space; it may be challenging for non-medical people to find these positions correctly [17]. We project the position of each valve from the reference body model onto the patient’s body surface by applying non-rigid registration between the two body surfaces. The overall procedure consists of two parts, as shown in Fig. 1: the selection of the reference body model closest to the patient’s body, and the body surface registration with non-rigid ICP. First, the reference model most similar to the patient’s body is selected from among several prepared reference models. With the selected reference model, the body surface registration with non-rigid ICP is then performed, and the position of each valve is projected from the reference model onto the patient’s body. The details of each part are described in the following sections.

Fig. 1 Pipeline of the non-rigid ICP registration with a reference model similar to the patient body
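The projection step of the pipeline can be illustrated with a minimal sketch. It assumes, for illustration only, that each valve is stored as a vertex index on the reference mesh and that, after registration, the deformed vertex is snapped to the nearest point of the patient point cloud; the function name `project_valves`, the data layout, and the toy coordinates are hypothetical and are not taken from the implementation described in this paper.

```python
import math
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float, float]


def project_valves(
    deformed_reference: Sequence[Point],
    valve_vertex_index: Dict[str, int],
    patient_points: Sequence[Point],
) -> Dict[str, Point]:
    """Map each labelled reference vertex to the nearest patient point.

    Assumption: `deformed_reference` holds the reference vertices after
    non-rigid registration, in the same order as the original mesh.
    """
    if not patient_points:
        raise ValueError("patient point cloud is empty")
    projected: Dict[str, Point] = {}
    for valve, index in valve_vertex_index.items():
        moved = deformed_reference[index]  # vertex position after registration
        # Brute-force nearest neighbour; a KD-tree would be used for large clouds.
        projected[valve] = min(patient_points, key=lambda p: math.dist(p, moved))
    return projected


# Toy data (hypothetical coordinates, not measurements from this study).
deformed = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0)]
valves = {"aortic": 1, "mitral": 2}
patient = [(9.0, 1.0, 0.0), (1.0, 9.0, 0.0), (0.0, 0.0, 1.0)]
result = project_valves(deformed, valves, patient)
print(result)
```

The brute-force search keeps the sketch dependency-free; it only shows how a label on the reference surface could be carried over to the patient surface once the two are registered.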

Selection of the reference body model closest to the patient body

To select the reference body model closest to the patient’s body, we measured the similarity between the two bodies by overlapping their point clouds and calculating the degree of overlap. The Chamfer Distance is a broadly adopted metric for measuring the similarity between two point sets and is defined as follows.

$$ \begin{aligned} & d_{CD} \left( {S_{1} ,S_{2} } \right) = \mathop \sum \limits_{{x \in S_{1} }} \mathop {\min }\limits_{{y \in S_{2} }} \left\| {x - y} \right\| + \mathop \sum \limits_{{y \in S_{2} }} \mathop {\min }\limits_{{x \in S_{1} }} \left\| {x - y} \right\| \\ & S_{1} ,S_{2} \subset {\mathbb{R}}^{3} \\ \end{aligned} $$
(1)

\({S}_{1}\) and \({S}_{2}\) are point cloud data sets, and x and y are points contained in \({S}_{1}\) and \({S}_{2}\), respectively. \({d}_{CD}\) represents the Chamfer Distance between \({S}_{1}\) and \({S}_{2}\), which is computed by summing, in both directions, the distances between nearest-neighbor correspondences of the two point clouds. When the shape deviation between the two point clouds is large, the nearest-neighbor distances are large and the Chamfer Distance increases. By computing the Chamfer Distance between the patient’s body and each of the prepared reference body models, the most similar reference body model is identified.
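As a minimal sketch of Eq. (1), the Chamfer Distance can be computed with a brute-force nearest-neighbor search. The function name `chamfer_distance` and the toy point sets are hypothetical, the sketch uses the unsquared Euclidean norm as written in Eq. (1), and a spatial index such as a KD-tree would be needed in practice for clouds of tens of thousands of points.

```python
import math
from typing import Sequence

Point = Sequence[float]


def chamfer_distance(s1: Sequence[Point], s2: Sequence[Point]) -> float:
    """Symmetric Chamfer Distance with unsquared Euclidean norms, as in Eq. (1)."""
    if not s1 or not s2:
        raise ValueError("both point sets must be non-empty")
    # For every x in S1: distance to its nearest neighbour in S2.
    forward = sum(min(math.dist(x, y) for y in s2) for x in s1)
    # For every y in S2: distance to its nearest neighbour in S1.
    backward = sum(min(math.dist(x, y) for x in s1) for y in s2)
    return forward + backward


# Toy data (hypothetical): s2 is s1 shifted by one unit along z.
s1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
s2 = [(x, y, z + 1.0) for x, y, z in s1]
d_same = chamfer_distance(s1, s1)
d_shifted = chamfer_distance(s1, s2)
print(d_same, d_shifted)
```

Identical clouds give a distance of zero, and the value grows with the offset between the clouds, which is the property the model selection relies on. Because Eq. (1) is a sum rather than a mean, the value also scales with the number of points, so compared clouds should be sampled to similar sizes.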

To compute the Chamfer Distance, point clouds of different shapes and poses first need to be roughly overlapped. ICP registration is therefore used to align the two point clouds roughly based on local features of the point clouds: the Fast Point Feature Histograms (FPFH) feature is calculated for each point cloud, and the feature correspondences between the point clouds are searched with RANSAC.

For every reference body model, the Chamfer Distance to the patient’s body is calculated as the similarity measure after applying the ICP registration, and the most similar reference body model is used in the non-rigid ICP process described in the following section.
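The selection rule amounts to taking the reference model with the smallest Chamfer Distance. The sketch below assumes that the distances have already been computed after rough alignment; the function name `rank_reference_models`, the model identifiers, and the numeric values are hypothetical placeholders rather than results from this study.

```python
from typing import Dict, List, Tuple


def rank_reference_models(chamfer_by_model: Dict[str, float]) -> List[Tuple[str, float]]:
    """Sort candidate reference models from most to least similar.

    A smaller Chamfer Distance means a more similar body shape.
    """
    if not chamfer_by_model:
        raise ValueError("at least one reference model is required")
    return sorted(chamfer_by_model.items(), key=lambda item: item[1])


# Hypothetical Chamfer Distances between one patient and three reference models.
distances = {"ref_A": 412.7, "ref_B": 188.3, "ref_C": 256.9}
ranking = rank_reference_models(distances)
best_model, best_distance = ranking[0]
print(best_model, best_distance)
```

Returning the full ranking rather than only the minimum makes it easy to inspect how clearly the best model is separated from the alternatives.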

Non-rigid iterative closest point for body surface registration

To deform the selected reference model so that it fits the patient’s body, non-rigid ICP is applied. The non-rigid ICP used in this study is a partially modified version of the algorithm proposed by Amberg et al. [14], and it is applied to mesh data converted from the point cloud data. The source mesh converted from the point cloud of the reference model is given as a set of n vertices \(\mathcal{V}\) and a set of m edges \(\mathcal{E}\). The registration finds parameters X that describe a set of displaced source vertices \(\mathcal{V}({\varvec{X}})\), that is, the source mesh deformed to the target patient body surface. To find X, the cost function E is defined as follows:

$$ E\left( {\varvec{X}} \right): = E_{d} \left( {\varvec{X}} \right) + \alpha E_{s} \left( {\varvec{X}} \right) + \beta E_{l} \left( {\varvec{X}} \right) $$
(2)

where Ed, Es and El represent the distance cost function, the stiffness cost function and the landmark distance cost function, respectively, and α and β are the stiffness and landmark weight parameters. Given that the corresponding source and target vertices are (\({{\varvec{v}}}_{{\varvec{i}}},{{\varvec{u}}}_{{\varvec{i}}}\)), the distance cost function is defined as follows:

$$ E_{d} \left( {\varvec{X}} \right): = \mathop \sum \limits_{{v_{i} \in {\mathcal{V}}}} w_{i} \left\| {{\varvec{X}}_{{\varvec{i}}} {\varvec{v}}_{{\varvec{i}}} - {\varvec{u}}_{{\varvec{i}}} } \right\|^{2} $$
(3)

where \({w}_{i}\) weights the reliability of the match: the weight is set to zero if no corresponding vertex is found and to one if a correspondence is found. The stiffness cost function regularizes the deformation by penalizing the weighted difference of the transformations of neighboring vertices, and is defined using the Frobenius norm \({\Vert \cdot \Vert }_{F}\) and a weighting matrix G as follows:

$$ E_{s} \left( {\varvec{X}} \right): = \mathop \sum \limits_{{\left\{ {i,j} \right\} \in {\mathcal{E}}}} \left\| {\left( {{\varvec{X}}_{{\varvec{i}}} - {\varvec{X}}_{{\varvec{j}}} } \right){\varvec{G}}} \right\|_{F}^{2} $$
(4)

where G weights the differences in the rotational and skew parts of the deformation against the translational part. Finally, the landmark distance cost function is used for initialization and guidance of the registration and is defined as follows:

$$ E_{l} \left( {\varvec{X}} \right): = \mathop \sum \limits_{{\left( {v_{i} ,l} \right) \in {\mathcal{L}}}} \left\| {{\varvec{X}}_{{\varvec{i}}} {\varvec{v}}_{{\varvec{i}}} - {\varvec{l}}} \right\|^{2} $$
(5)

where \(\mathcal{L}=\{\left({v}_{{i}_{1}}, {l}_{1}\right),\dots \left({v}_{{i}_{l}}, {l}_{l}\right)\}\) represents a set of landmarks mapping source vertices to target vertices. To match the upper body surface accurately, we assume that the nipples and navel are applicable as landmarks, and we picked these landmarks manually on both the source and target body surfaces before applying the non-rigid ICP registration. X was obtained by minimizing the total cost function with Algorithm 1, with reference to [14]. The stiffness parameter \({\alpha }^{i}\) was set to {50, 20, 5.0, 2.0, 0.8, 0.5, 0.35, 0.3, 0.2} and the landmark weight parameter \({\beta }^{j}\) was set to {5.0, 2.0, 0.5}, both determined experimentally. The acceptable mismatch \(\varepsilon \) was set to 3 mm. The algorithm was implemented in Python. The Open3D library was used for the calculation and visualization of the point cloud data, and the scikit-sparse Python library was used to compute the Cholesky decomposition of the sparse matrix when solving the cost function. In all trials, the point cloud data obtained with the RGB-D camera was uniformly downsampled to 25,000 points. The workstation used for the implementation had an Intel Xeon W-2133 CPU @ 3.60 GHz and 64 GB of RAM.
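Once the correspondences are fixed, every term of Eq. (2) is quadratic in X, so each inner iteration reduces to a sparse linear least-squares problem. The following sketch follows the formulation in [14] and is not necessarily the exact modified form used in this study. It assumes that X stacks the per-vertex transformations as \({\varvec{X}}={[{\varvec{X}}_{1}\cdots {\varvec{X}}_{n}]}^{T}\), that M is the node-arc incidence matrix of the source mesh, that D and \({\varvec{D}}_{L}\) are sparse matrices holding the homogeneous source vertices and landmark vertices, that U and \({\varvec{U}}_{L}\) stack the corresponding target and landmark positions, and that \({\varvec{W}}=\mathrm{diag}(\sqrt{{w}_{i}})\). These symbols are not defined in the text above; the square roots are introduced here so that the stacked form reproduces Eq. (2) term by term.

$$ \begin{aligned} E\left( {\varvec{X}} \right) & = \left\| {{\varvec{A}}{\varvec{X}} - {\varvec{B}}} \right\|_{F}^{2} ,\qquad {\varvec{A}} = \begin{bmatrix} \sqrt{\alpha}\,\left( {\varvec{M}} \otimes {\varvec{G}} \right) \\ {\varvec{W}}{\varvec{D}} \\ \sqrt{\beta}\,{\varvec{D}}_{L} \end{bmatrix} ,\qquad {\varvec{B}} = \begin{bmatrix} {\varvec{0}} \\ {\varvec{W}}{\varvec{U}} \\ \sqrt{\beta}\,{\varvec{U}}_{L} \end{bmatrix} \\ {\varvec{A}}^{T} {\varvec{A}}\,{\varvec{X}} & = {\varvec{A}}^{T} {\varvec{B}} \\ \end{aligned} $$

When A has full column rank, \({\varvec{A}}^{T}{\varvec{A}}\) is sparse, symmetric and positive definite, which is the setting in which a sparse Cholesky factorization can be used to solve the normal equations for each pair of α and β.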

Algorithm 1

Simulation

In this section, our hypothesis that selecting the reference model closest to the patient’s body increases the accuracy of the localization is first verified through simulation. We prepared several types of human body model with the digital human platform software “DhaibaWorks” [18], which can generate human body models of various shapes as mesh data. With this software, nine types of body shape, including light, heavy, small, and tall, were prepared as shown in Fig. 2.

Fig. 2 Human body models produced by DhaibaWorks

In this simulation, the reference body model (source mesh) is fixed to the standard model (#0 in Fig. 2), each of the other models is used in turn as the patient body (target mesh), and non-rigid ICP is applied to each source and target pair. The accuracy of the registration is evaluated by the error between each valve position projected by the registration and the ground truth picked by a clinical expert. Additionally, the Chamfer Distance is calculated for each pair of body models.
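The error metric can be sketched as the Euclidean distance between the projected and expert-picked positions of each auscultation area. The function name `registration_errors`, the dictionary layout keyed by areas I to IV, and all coordinates below are hypothetical and serve only to show the calculation.

```python
import math
from statistics import mean
from typing import Dict, Tuple

Point = Tuple[float, float, float]


def registration_errors(
    projected: Dict[str, Point], ground_truth: Dict[str, Point]
) -> Dict[str, float]:
    """Euclidean distance per auscultation area between projection and ground truth."""
    missing = set(ground_truth) - set(projected)
    if missing:
        raise KeyError(f"no projected position for: {sorted(missing)}")
    return {
        area: math.dist(projected[area], ground_truth[area])
        for area in ground_truth
    }


# Hypothetical positions in millimetres (not data from this study).
ground_truth = {
    "I": (30.0, 120.0, 0.0),
    "II": (-30.0, 120.0, 0.0),
    "III": (-10.0, 40.0, 0.0),
    "IV": (-90.0, 30.0, 0.0),
}
projected = {
    "I": (33.0, 124.0, 0.0),
    "II": (-30.0, 126.0, 8.0),
    "III": (-4.0, 32.0, 0.0),
    "IV": (-78.0, 39.0, 20.0),
}
errors = registration_errors(projected, ground_truth)
average_error = mean(errors.values())
print(errors, average_error)
```

Averaging the four per-area errors gives one value per source and target pair, which can then be set against the Chamfer Distance of the same pair.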

Table 1 shows the non-rigid ICP registration error at each valve position and the Chamfer Distance for the simulated human body models. The auscultation areas I–IV in Table 1 represent the aortic, pulmonary, tricuspid, and mitral valves, respectively. The result suggests that the registration error increases as the deviation in body shape between the target model and the reference model (#0 in Fig. 2) increases. The Chamfer Distance also reflects the degree of shape deviation, and it corresponds to the registration error. This simulation result supports our hypothesis that selecting the reference model closest to the patient’s body increases the accuracy of the localization.

Table 1 Non-rigid ICP registration error and chamfer distance in simulation
Table 2 The information of volunteers

Experiment

For the experiment, we obtained eight body surface datasets from healthy male volunteers, who were selected to cover a variety of body shapes. The average height, weight and body mass index (BMI) of the volunteers were 1.73 ± 0.06 m, 69.1 ± 5.87 kg and 23.2 ± 1.98 kg/m², respectively. Detailed information on the selected volunteers is listed in Table 2. The study protocol was approved by the Institutional Review Board of the National Institute of Advanced Industrial Science and Technology (No. 2022-1154), and informed consent was obtained from each volunteer. The point cloud data of the body surface was acquired with an RGB-D camera (L515 RealSense, Intel, USA). The volunteers lay on a bed, and the RGB-D camera was positioned about 80 cm above the bed surface. Note that we instructed the volunteers to hold their breath during the acquisition. As in the simulation, the ground truth listening position of each valve on the acquired body surface was determined by the clinical expert. In this experiment, the registration errors and Chamfer Distances were calculated and compared for all pairs of the body surfaces of the eight volunteers: with one volunteer’s body data fixed as the source model, the registration was performed between the source model and each of the other volunteers’ body data as reference models.

Table 3 shows the results of the non-rigid ICP registration between all pairs of volunteers’ bodies. The auscultation areas I–IV in Table 3 represent the aortic, pulmonary, tricuspid, and mitral valves, respectively. In Table 3, the minimum and maximum of the calculated Chamfer Distance and of the averaged registration error for each source body model are highlighted in bold. Figure 3 also summarizes the comparison of registration error and Chamfer Distance when the reference model is varied for each source model. In six of the eight source model conditions, the minimum Chamfer Distance corresponded to the minimum registration error, and the maximum Chamfer Distance corresponded to the maximum registration error. The average calculation time for the non-rigid ICP was 133.5 ± 5.8 s.

Table 3 Results of registration error between each body model
Fig. 3 Results of the comparison of registration error and Chamfer Distance when the reference model is varied for each source model

Figure 4 plots all registration errors against the corresponding Chamfer Distances. We performed a linear regression analysis to investigate the strength of the association between the accuracy of the non-rigid ICP registration and the similarity of the compared models. The coefficient of determination R² was 0.66, which indicates some association between the Chamfer Distance and the registration error.
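For reference, the coefficient of determination of an ordinary least-squares line can be computed as in the following sketch. The function name `linear_fit_r2` and the sample values are hypothetical; the data are not the measurements of this study and the sketch is not intended to reproduce the reported value.

```python
from typing import Sequence, Tuple


def linear_fit_r2(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Ordinary least-squares line y = slope * x + intercept and its R^2."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or ss_tot == 0:
        raise ValueError("x and y must each contain at least two distinct values")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical pairs of (Chamfer Distance, registration error in mm).
chamfer = [100.0, 150.0, 200.0, 250.0, 300.0]
error_mm = [6.0, 9.0, 8.0, 14.0, 15.0]
slope, intercept, r2 = linear_fit_r2(chamfer, error_mm)
print(slope, intercept, r2)
```

An R² close to one would mean that the Chamfer Distance explains most of the variance in registration error, whereas a value near zero would mean that it explains almost none.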

Fig. 4 Results of registration error depending on the Chamfer Distance

Discussion and conclusion

The non-rigid ICP registration we have described is capable of estimating the auscultation area with an average error of 5–19 mm and is a promising new method that provides an accurate auscultation area while taking individual differences in body shape into account. Our hypothesis that the registration accuracy depends on the similarity of the two body surfaces was supported by the simulation study and the human trial. The statistical results indicate some correlation between the registration accuracy and the Chamfer Distance, which represents the similarity of the models used. Since this study recruited a limited number of volunteers for the human trial, a larger-scale investigation is necessary. The acceptable registration error should be considered in terms of the quality of the acquired sound, although an error of less than 20 mm may not affect the diagnosis qualitatively. Although our survey found no scientific literature directly discussing the required positioning accuracy of the stethoscope in auscultation, a recent paper reported human trials with a teleoperated robotic auscultation system that could search for optimal positions yielding sounds of sufficient quality [19], and demonstrated that the optimal position can be found if the search area is within 30 mm. This report may support the applicability of our method to auscultation, but we need to further investigate the acceptable positioning error not only in auscultation but also in other applicable diagnoses such as ultrasonography.

The results also indicated that the registration error was not isotropic and varied between individual subjects. Although the relationship between the error distribution and body shape was not clear because of the limited number of samples, in the cases of minimal Chamfer Distance, that is, when the most similar model was selected, the registration error at the mitral valve (IV) tended to be large. The tricuspid, mitral, pulmonary, and aortic valves are located roughly at the left of the lower part of the sternum near the fifth intercostal space, over the apex of the heart in the left fifth intercostal space at the midclavicular line, over the medial end of the left second intercostal space, and over the medial end of the right second intercostal space, respectively [20]. Thus, the mitral valve is relatively far from the central axis of the body compared with the other three valves. As the RGB-D camera was set at the center of the body, the point cloud data close to the center of the body may have been captured well, and the registration error may have been amplified with the distance between the mitral valve and the central axis of the body. It may be necessary to obtain data from multiple locations and perform registration to eliminate data coarseness.

On the other hand, considering that the resolution of the point cloud data acquired with the RGB-D camera used in this study was between 5 and 14 mm, the registration performance may already have been close to the limit achievable with this data. The registration accuracy may be improved by using a higher-resolution 3D sensor such as a structured-light 3D scanner. Also, in the non-rigid ICP registration, we used the nipples and navel as landmarks. If other landmarks on the body surface, such as the boundary of the ribs, are applicable, the registration accuracy may be further improved. Additionally, the Chamfer Distance was used as the metric for evaluating the similarity of pairs of point clouds in this study, but we still need to investigate more appropriate metrics in terms of robustness. For example, Ref. [21] develops a new metric for point cloud similarity.

This registration method can be used for navigation or pre-operative planning in robotic diagnosis and treatment that require medical equipment to be placed on the patient’s body. For example, there have been several studies on autonomous robotic ultrasonography and auscultation, which require the scanning path or area to be recognized autonomously based on patient information [19, 22, 23, 24]. With the proposed registration method, the pre-determined scanning path or area can be estimated on the patient’s body, taking individual differences in body shape into account, using only an RGB-D camera.