1 Introduction to International Forum on Medical Imaging in Asia

Ten years ago, when we invited some of the most experienced authors in the field and published the Springer book “Biomedical Imaging” [1], we did not foresee such dramatic changes in theoretical research and practical development. Since then we have witnessed intense activity in both academia and industry, notably the biannual International Forum on Medical Imaging in Asia (IFMIA). Following the success of IFMIA 2019 in Singapore, we believe it is valuable to provide readers with comprehensive coverage of medical imaging through a selection of both original research and critical reviews in a Special Issue, thanks to the agreement of the JSPS Editorial Office.

This article therefore briefly introduces technical advancements in all imaging modalities for molecular, cellular, anatomical and functional imaging, with case studies reported by some leading groups in Asia. Research topics include imaging instrumentation, registration, reconstruction, multimodality methods, noise filtering and image enhancement, segmentation, classification and feature detection, model-based imaging, as well as system development and acceleration technologies.

2 Intelligent Signal Processing and Medical Instrumentation

Oral lesions are conventionally diagnosed using white light endoscopy and histopathology. At the National Cancer Centre Singapore, a clinical protocol of virtual histology using an in vivo cellular imaging and real-time processing system (see Fig. 1) is reported by Cheong et al. [2], in which photoactivation and miniaturized confocal image scanning are utilized. In their experiments, fluorescence imaging of the human and murine oral cavities was carried out using the fluorescent dye fluorescein sodium [3] and produced discriminant image signals. Embedded computational intelligence with real-time image processing, feature detection and visualization is demonstrated. Nevertheless, the main constraint appears to be the limited memory available for FPGA implementation of complex tissue image alignment and 3-dimensional reconstruction.

Fig. 1 Instrumentation of an endomicroscopic imaging system [2].

In optical image feature detection with optical interferometry, fringe projection is a commonly used technique, and it raises a key issue for researchers in the domain, namely fringe pattern analysis [4]. Extracting the phase distribution from arbitrarily phase-shifted fringe patterns is useful in phase-shifting interferometry. The advanced iterative algorithm (AIA) and the windowed Fourier ridges and least squares fitting (WFRLSF) algorithm have been developed for this purpose, but both signal processing algorithms are sensitive to noise, which limits their application to almost perfect fringe patterns. The windowed Fourier filtering (WFF) algorithm is proposed by Qian et al. from Nanyang Technological University (NTU) for both pre-filtering and post-filtering to suppress the noise [5]. Their simulation results show that, with effective noise suppression, the phase error is reduced to less than 0.1 rad.
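As a minimal sketch of the least-squares step that phase-shifting algorithms of this kind build on, the following Python example recovers a wrapped phase map from fringe patterns whose phase shifts are arbitrary but known. It assumes the usual fringe model I_k = a + b cos(φ + δ_k). The function name and array layout are illustrative assumptions and are not taken from [4, 5]. Neither the iterative estimation of unknown shifts nor WFF noise filtering is included.

```python
import numpy as np


def phase_from_known_shifts(frames, shifts):
    """Least-squares wrapped phase from K phase-shifted fringe patterns.

    frames: array of shape (K, H, W), one fringe pattern per phase shift.
    shifts: array of shape (K,), the known phase shifts in radians.

    Assumed model (not from the cited papers):
        I_k = a + b*cos(phi + delta_k)
            = a + c*cos(delta_k) - s*sin(delta_k),
    with c = b*cos(phi) and s = b*sin(phi).
    """
    frames = np.asarray(frames, dtype=float)
    shifts = np.asarray(shifts, dtype=float)
    k = frames.shape[0]

    # Design matrix shared by every pixel: unknowns are [a, c, s].
    design = np.column_stack([np.ones(k), np.cos(shifts), -np.sin(shifts)])

    # Solve all pixels at once; each column of the right-hand side is one pixel.
    coeffs, *_ = np.linalg.lstsq(design, frames.reshape(k, -1), rcond=None)
    _, c, s = coeffs

    # Wrapped phase in (-pi, pi].
    return np.arctan2(s, c).reshape(frames.shape[1:])
```

At least three distinct shifts are needed for the three unknowns per pixel. With noisy patterns the estimate degrades, which is the situation that pre-filtering and post-filtering are meant to address.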

For optical medical instrumentation, the design and fabrication of fiber-axicons for in vivo and in vitro cellular imaging and real-time processing systems is often the key to high resolution. A direct-laser writing fabrication process for micro-axicons is reported by Huang et al. from Zhejiang University (ZJU) [6]. A fiber-axicon-generated Bessel beam is used to write on UV-curable optical epoxy to form new axicons and axicon arrays, with satisfactory apex angle and proximity of the written axicons. The fabricated axicons are capable of generating a quality Bessel beam with excellent focusing performance.

In quantitative image analysis and diagnostic applications of radiologic imaging, radiomics has emerged as a novel framework that allows a large number of quantitative features to be extracted from radiologic images and promises to improve the characterization of lesions, providing potentially valuable information in the context of personalized medicine. However, radiomic features are known to be affected by technical parameters and feature extraction methodology. As illustrated in Fig. 2, a study by Jin et al. evaluates the robustness of CT radiomic features against the technical parameters involved in CT acquisition and feature extraction procedures using a standardized phantom, and verifies the feature robustness using patient cases [28]. A total of 47 radiomic features of textures and first-order statistics are extracted from the homogeneous region of all scans. Intrinsic variability is measured to identify unstable features that are vulnerable to inherent CT noise and texture. A susceptibility index is defined to represent the susceptibility to the variation of a given technical parameter. Eighteen radiomic features are shown to be intrinsically unstable under the reference condition. The features are more susceptible to variation of the reconstruction kernel than to other sources of variation. The feature robustness evaluated on the phantom CT correlates with that evaluated on clinical CT scans. This study reveals that a number of scan parameters could significantly affect the radiomic features. These characteristics should be considered in a radiomic study when different scan parameters are used in a clinical dataset.
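To make the kind of texture feature involved more concrete, the following Python sketch computes a gray level run length matrix (GLRLM) and its gray level non-uniformity, the feature named in the caption of Fig. 2. It uses the commonly cited definition GLN = Σ_i (Σ_j P(i, j))² / Σ_i Σ_j P(i, j). The restriction to horizontal runs, the pre-quantized integer input and the function names are simplifying assumptions, not the extraction settings used in [28].

```python
import numpy as np


def glrlm_horizontal(img, levels):
    """Run length matrix P[i, j-1] = number of horizontal runs of gray
    level i with length j. `img` must hold integers in [0, levels)."""
    img = np.asarray(img, dtype=int)
    _, cols = img.shape
    p = np.zeros((levels, cols), dtype=int)
    for row in img:
        run_val, run_len = row[0], 1
        for v in row[1:]:
            if v == run_val:
                run_len += 1
            else:
                p[run_val, run_len - 1] += 1   # close the finished run
                run_val, run_len = v, 1
        p[run_val, run_len - 1] += 1           # close the last run in the row
    return p


def gray_level_non_uniformity(p):
    """GLN: squared number of runs per gray level, summed over levels and
    divided by the total number of runs. Lower means more uniform."""
    runs_per_level = p.sum(axis=1)
    return float((runs_per_level ** 2).sum() / p.sum())
```

Because the run counts depend on how the image is quantized and on the noise texture, a feature like this can plausibly shift with dose or reconstruction kernel, which is the sort of sensitivity the phantom study quantifies.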

Fig. 2 Example color map of the GLRLM gray level non-uniformity features, comparing the susceptibility to the low dose, sharp kernel, large RFOV, and large sub-ROI size conditions against the reference condition. The feature value range is scaled by dividing by the median value for easy comparison [28].

3 Multi-Modal Image Reconstruction and Registration

Another important research topic in medical imaging is the mathematical methodology of CT reconstruction. Intensive studies on this topic have been conducted at the University of Tsukuba and other institutions [29, 30]. A fast iterative image reconstruction algorithm for short-scan fan-beam computed tomography is developed by minimizing a data-fidelity term regularized with a total variation penalty, in collaboration with ZJU [29]. Prior information obtained from a probabilistic atlas constructed from earlier scans of different patients is effectively utilized in low-dose CT imaging, in collaboration with Suez Canal University, Egypt [30].
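The general form of such an objective, a data-fidelity term plus a total variation penalty, can be illustrated on a toy one-dimensional problem. The Python sketch below minimizes ½‖Ax − b‖² + λ Σ √((x_{i+1} − x_i)² + ε²) by plain gradient descent. The smoothed penalty, the solver, the parameter values and all function names are assumptions made for illustration; this is not the fast algorithm of [29], and no fan-beam geometry is modeled.

```python
import numpy as np


def smoothed_tv(x, eps):
    """Smoothed total variation: sum of sqrt(d^2 + eps^2) over differences."""
    d = np.diff(x)
    return np.sum(np.sqrt(d * d + eps * eps))


def objective(A, b, x, lam, eps):
    """Data fidelity plus TV penalty."""
    r = A @ x - b
    return 0.5 * (r @ r) + lam * smoothed_tv(x, eps)


def tv_regularized_lsq(A, b, lam=0.05, eps=1e-2, iters=500):
    """Gradient descent on the smoothed TV-regularized least-squares objective."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    n = A.shape[1]
    x = np.zeros(n)

    # Upper bound on the gradient's Lipschitz constant:
    # ||A||_2^2 for the data term, 4*lam/eps for the smoothed TV term.
    step = 1.0 / (np.linalg.norm(A, 2) ** 2 + 4.0 * lam / eps)

    for _ in range(iters):
        d = np.diff(x)
        w = d / np.sqrt(d * d + eps * eps)   # derivative of the smoothed |d|
        grad_tv = np.zeros(n)
        grad_tv[:-1] -= w                    # d_i = x[i+1] - x[i]
        grad_tv[1:] += w
        x = x - step * (A.T @ (A @ x - b) + lam * grad_tv)
    return x
```

In an actual CT setting A would be the system (projection) matrix and b the measured sinogram. The small ε keeps the penalty differentiable at the cost of slower convergence, which is one reason dedicated fast solvers are of interest.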

In addition, multiple modalities of medical imaging are often used in clinical diagnoses and interventional imaging processes. The most challenging task is registration of the acquired anatomical structure images with the surgical tools or designed implants. This is achieved through voxelization of freeform models and synthesis of complex objects (see Fig. 3). A NURBS volume representation and its voxelization algorithm [7] are developed by Lin et al. in a collaboration between NTU and Beijing Normal University (BNU). The key is the estimation of a forward difference bound to maximize the parameter step, thus speeding up the voxelization [8]. Although a mathematical proof of the bound is given in the work, a statistical model with an adaptive bound may produce a more effective forward difference algorithm, and such a model could be trained via 3D convolutional neural networks.

Fig. 3 Hybrid femur with registered implants (left) vs CT images (right) [7].

To verify the volumetric image registration, a sub-voxel digital volume correlation (DVC) method combining the 3D inverse compositional Gauss-Newton (ICGN) algorithm with the 3D fast Fourier transform-based cross correlation (FFT-CC) algorithm is developed by Wang et al. from NTU [9]. The new algorithm can eliminate the path dependence that the initial guess transfer scheme causes in conventional iterative DVC methods.
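The FFT-CC component can be sketched in a few lines of Python. The example below estimates the integer-voxel translation between two volumes from the peak of their circular cross correlation, computed with 3D FFTs. It is a simplified illustration under the assumption of a pure, periodic translation; the function name is hypothetical, and the sub-voxel ICGN refinement described in [9] is not shown.

```python
import numpy as np


def fft_cc_shift(reference, deformed):
    """Integer-voxel shift (dz, dy, dx) that best maps `reference` onto
    `deformed`, found at the peak of the circular cross correlation."""
    f = np.fft.fftn(reference)
    g = np.fft.fftn(deformed)

    # Correlation theorem: cc[s] = sum_x reference[x] * deformed[x + s].
    cc = np.fft.ifftn(np.conj(f) * g).real

    peak = np.unravel_index(np.argmax(cc), cc.shape)

    # Indices past the midpoint correspond to negative shifts (wrap-around).
    return tuple(int(p) if p <= n // 2 else int(p) - n
                 for p, n in zip(peak, cc.shape))
```

Because each subvolume gets its own correlation-based estimate, an initial guess obtained this way does not have to be propagated from neighbouring calculation points, which is presumably how the path dependence mentioned above is avoided.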

4 Machine Learning for Medical Segmentation and Classification

Medical segmentation and classification are largely based on feature extraction and detection. On cellular image feature extraction, pattern recognition and classification, a Springer monograph is published by Xu et al. in a collaboration between NTU and ZJU [10]. Using the antinuclear antibodies (ANAs) in patient serum as the subjects and the Indirect Immunofluorescence (IIF) technique as the imaging protocol, the Bag-of-Words (BoW) framework and a Linear Local Distance Coding (LLDC) method are introduced. A rotation-invariant textural feature, Pairwise Local Ternary Patterns with Spatial Rotation Invariant (PLTP-SRI), is also defined, which is robust to noise and weak illumination. While the proposed PLTP-SRI feature captures local information, the BoW framework builds a global image representation; aggregating these two complementary kinds of features achieves excellent classification.

Also based on feature detection, a statistical model for segmentation and identification of hormone response elements (HREs) in genomic sequences is reported by Stepanova et al. [11]. Based on verified HREs that show di-nucleotide preservation in comparison with uniform nucleotide distributions, both mono- and di-nucleotide Position Weight Matrices are computed to extract the statistical pattern of the positions.
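For readers unfamiliar with Position Weight Matrices, the following Python sketch builds a mono-nucleotide log-odds PWM from a set of aligned binding sites and scores a candidate sequence against it. The pseudocount, the uniform background of 0.25 and the function names are illustrative assumptions, and the di-nucleotide variant used in [11] is not covered.

```python
import numpy as np

ALPHABET = "ACGT"


def position_weight_matrix(sites, pseudocount=1.0, background=0.25):
    """Log-odds PWM of shape (4, L) from equal-length aligned sites.

    Rows follow ALPHABET order; entry [b, p] is
    log2(frequency of base b at position p / background).
    """
    length = len(sites[0])
    counts = np.full((4, length), pseudocount, dtype=float)
    for site in sites:
        if len(site) != length:
            raise ValueError("all sites must have the same length")
        for pos, base in enumerate(site.upper()):
            counts[ALPHABET.index(base), pos] += 1.0
    freqs = counts / counts.sum(axis=0, keepdims=True)
    return np.log2(freqs / background)


def score(pwm, sequence):
    """Sum of per-position log-odds for a sequence of length L."""
    return float(sum(pwm[ALPHABET.index(base), pos]
                     for pos, base in enumerate(sequence.upper())))
```

Sliding `score` along a genomic sequence and thresholding the result is the usual way such a matrix is used to flag candidate elements. A di-nucleotide matrix follows the same pattern with 16 rows indexed by adjacent base pairs.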

To handle the temporal autocorrelation present in functional magnetic resonance images (fMRI), a mixed spectrum analysis (MSA) of the brain voxel time-series is proposed by Arun et al. [12]. It can separate the discrete component corresponding to the input stimuli from the continuous component carrying the temporal autocorrelation. In their experiments, a varying correlation structure among the brain regions does not affect the efficiency of the method. Brain activation is detected in two steps: the likelihood of activation is predicted by comparing the amplitude of the discrete component at the stimulus frequency across the brain voxels using a normal distribution, and the spatial correlations among the likelihoods are then modelled with a conditional random field.
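The quantity compared across voxels, the amplitude of the periodic component at the stimulus frequency, can be estimated with a single-frequency Fourier projection. The Python sketch below shows only this step; the function name and sampling parameters are assumptions, and the separation of the continuous spectrum and the conditional random field of [12] are not modelled.

```python
import numpy as np


def amplitude_at_frequency(series, freq_hz, sampling_hz):
    """Amplitude of the sinusoidal component of `series` at `freq_hz`.

    The mean is removed first so that the baseline signal level does not
    leak into the estimate.
    """
    series = np.asarray(series, dtype=float)
    series = series - series.mean()
    n = series.size
    t = np.arange(n) / sampling_hz
    coeff = np.sum(series * np.exp(-2j * np.pi * freq_hz * t))
    return 2.0 * np.abs(coeff) / n
```

The estimate is exact only when the record contains a whole number of stimulus periods. Otherwise spectral leakage biases it, and autocorrelated noise adds a continuous background at the same frequency, which is what the mixed spectrum model is designed to account for.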

Aimed at segmenting and tracing filamentary structures in both neuronal and retinal images (see Fig. 4), a two-step graph-theoretical approach is proposed by Jedeep et al. in a collaboration between NTU and the A*STAR Bioinformatics Institute [13, 14]. The key idea is that the problem can be reformulated as label propagation over directed graphs, such that the graph is partitioned into disjoint sub-graphs or, equivalently, each of the neurons (vessel trees) is separated from the rest of the neuronal (vessel) network.

Fig. 4 Segmentation results from various experimented algorithms (Disc in the Gold Standard labels the crossover of the filamentary structures) [13].

Recently, a generic and robust low-rank nonlinear kernelization in the framework of statistical shape models (SSM) is presented in [15] by Ma et al. in a collaboration between NTU and Fraunhofer Institute Singapore. It effectively handles data contamination and arbitrary corruptions in 3D medical image segmentation. The SSM and Deep Neural Networks are combined via Bayesian inference, with the shape prior from the SSM and the initial structure localization from deep learning. The model shows great potential for use cases where the training datasets are large enough.

Accurate segmentation of the brain in MRI has been an important task in neuroimaging analysis, yet it remains a challenging issue due to the presence of equipment noise and the complexity of the brain structure. A new method based on the back propagation (BP) neural network and the AdaBoost algorithm is presented by Chao et al. in [27]. The system is trained using a gravitational search algorithm to establish 10 groups of back propagation neural networks (BPNNs) by applying 10 groups of different data. Subsequently, the AdaBoost algorithm is adopted to obtain the weight of each BPNN. In a comparison experiment using a group of brain MRI datasets, the proposed method outperforms four state-of-the-art segmentation methods in both subjective observation and objective evaluation indexes.
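How AdaBoost assigns a weight to each member of an ensemble can be sketched as follows. The Python example takes the ±1 predictions of already trained classifiers, computes the standard discrete AdaBoost weight α = ½ ln((1 − ε)/ε) from each weighted error ε, and re-weights the samples after each classifier. The binary labels, the fixed pre-trained classifiers and the function names are simplifying assumptions; the exact scheme used with the BPNNs in [27] may differ.

```python
import numpy as np


def adaboost_weights(predictions, labels):
    """Discrete AdaBoost weights for T fixed classifiers.

    predictions: array (T, N) of +1/-1 outputs, one row per classifier.
    labels:      array (N,) of +1/-1 ground-truth labels.
    Returns an array (T,) of classifier weights (alphas).
    """
    predictions = np.asarray(predictions, dtype=float)
    labels = np.asarray(labels, dtype=float)
    n = labels.size
    w = np.full(n, 1.0 / n)                 # uniform initial sample weights
    alphas = []
    for h in predictions:
        # Weighted error, clipped to keep the logarithm finite.
        err = np.clip(np.sum(w[h != labels]), 1e-10, 1.0 - 1e-10)
        alpha = 0.5 * np.log((1.0 - err) / err)
        alphas.append(alpha)
        # Misclassified samples gain weight, correct ones lose weight.
        w = w * np.exp(-alpha * labels * h)
        w = w / w.sum()
    return np.array(alphas)


def ensemble_predict(predictions, alphas):
    """Sign of the alpha-weighted vote of the classifiers."""
    return np.sign(alphas @ np.asarray(predictions, dtype=float))
```

Classifiers with lower weighted error receive larger weights, so the final vote is dominated by the networks that handle the samples earlier networks got wrong.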

Endoscopic image analysis is of increasing importance owing to the spread of minimally invasive surgery. Deep learning-based real-time pathology classification of endoscopic images has been pioneered by Nagoya University [31]. Musculoskeletal applications are becoming important in the super-aging society of Japan. CNN-based segmentation of vertebrae in X-ray video during swallowing is addressed by the University of Tsukuba [32], while segmentation of individual muscles, bones and implants, as well as metal artifact reduction in CT, is addressed by Nara Institute of Science and Technology (NAIST) and Osaka University [33, 34]; a particular advantage of this work is the prediction of segmentation accuracy using uncertainty estimated from a Bayesian U-net [34]. Statistical shape models are used for mandibular segmentation at NAIST in collaboration with the University of Tehran [35]. Lung imaging has been addressed by Yamaguchi University and Osaka University, where machine-learning approaches are utilized for lung disease classification [36, 37], partly in collaboration with Dalian University of Technology [37].

5 Intelligent Computer-Aided Diagnosis and Interventional Imaging

Maximum Likelihood (ML) is a popular optimization criterion in phylogenetics and basic medicine. However, inference of phylogenies with ML is NP-hard. Recursive-Iterative-DCM3 (Rec-I-DCM3) is a divide-and-conquer framework that divides a dataset into smaller subsets (subproblems), applies an external base method to infer subtrees, merges the subtrees into a comprehensive tree, and then refines the global tree with an external global method. In [16], Du et al. present a novel parallel implementation of Rec-I-DCM3 for inference of large trees with ML. In diagnostic processes, they use RAxML as both the external base method and the global search method. Six large real-data alignments containing 500 to 7769 sequences are tested with satisfactory diagnostic accuracy. In basic medicine, probabilistic and statistical models are also proposed by Stepanova et al. for specialized transcription factors to recognize specific DNA sequences [17]. A Hopfield neural classifier is developed whose internal structure can be adapted recurrently to the target motif structure.

For telemedicine with mobile devices, a single-pass volume rendering algorithm is developed for the popular WebGL platform by Movania et al. [18]. A notable advantage is that it can run directly on most embedded and mobile devices. Because it is built on the OpenGL ES 2.0 shading API, it can be implemented efficiently on the embedded GPU of the mobile device, as shown in Fig. 5.
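The core of such a renderer is the compositing loop that each ray runs in a single pass through the volume. The Python sketch below illustrates front-to-back alpha compositing along axis-aligned rays on the CPU. It is only an illustration of the principle: the linear transfer function, the `opacity_scale` parameter and the function name are assumptions, and the renderer in [18] would run an equivalent loop in a GLSL fragment shader with arbitrary view directions.

```python
import numpy as np


def composite_front_to_back(volume, opacity_scale=0.1):
    """Front-to-back compositing along the first axis of `volume`.

    volume: array (depth, H, W) with scalar values in [0, 1].
    Returns (color, alpha), each of shape (H, W).
    """
    vol = np.asarray(volume, dtype=float)
    color = np.zeros(vol.shape[1:])
    alpha = np.zeros(vol.shape[1:])
    for slab in vol:                         # one step of every ray
        # Toy transfer function: opacity proportional to the sample value,
        # gray-level color equal to the sample value.
        a = np.clip(slab * opacity_scale, 0.0, 1.0)
        color += (1.0 - alpha) * a * slab
        alpha += (1.0 - alpha) * a
        if np.all(alpha > 0.99):             # early ray termination
            break
    return color, alpha
```

Accumulating front to back lets the loop stop as soon as a ray is nearly opaque, which saves texture fetches and matters on the limited GPUs of mobile devices.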

Fig. 5 Single-pass volume rendering for WebGL compliant mobile device [18].

Considering multiple features of medical images, Yu et al., in a collaboration between Xiamen University and NTU, define a set of visual features to represent color, texture and shape information [19]. Within the patch alignment framework, a new subspace learning method, termed Semi-Supervised Multimodal Subspace Learning (SS-MMSL), is developed to encode different features from different modalities into the subspace. It uses the discriminative information from the labeled data to construct local patches and aligns these patches to obtain the optimal low-dimensional subspace for each modality, achieving improved medical diagnostic accuracy. This approach is also taken by Jadeep et al. [20] in the filamentary tracing problem, in which the matrix-forest theorem is applied.

In cardiovascular disease diagnosis and prognosis, arrhythmia heartbeat classification in electrocardiograms (ECG) is crucial to help prevent stroke or sudden cardiac death. A novel ECG arrhythmia classification method is reported by Yang et al. from Hebei University [21], using stacked sparse auto-encoders (SSAEs) and a softmax regression model. Via deep learning, the algorithm can hierarchically extract high-level features from large amounts of ECG data.
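The final classification stage of such a pipeline, softmax regression on top of learned features, can be sketched as follows. The Python example maps a feature matrix to class probabilities and predicted classes. The weight matrix `W`, bias `b` and function names are hypothetical placeholders; in [21] the features would come from the stacked sparse auto-encoders, which are not reproduced here.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def softmax_regression_predict(features, W, b):
    """Class probabilities and predicted class index for each heartbeat.

    features: array (N, D) of extracted features, one row per heartbeat.
    W:        array (D, C) of weights; b: array (C,) of biases.
    """
    probs = softmax(np.asarray(features, dtype=float) @ W + b)
    return probs, probs.argmax(axis=1)
```

Training would fit `W` and `b` by minimizing the cross-entropy between these probabilities and the annotated heartbeat classes.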

Interventional imaging is a crucial technique for computer-aided therapy and surgery. For real-time synthesis of medical objects, dynamics simulation of deformable models with anisotropic materials is introduced in a monograph by Cai et al. from NTU [22]. Their fibre-field incorporated corotational finite element model (CLFEM) can work directly with a constitutive model of transversely isotropic materials, displaying adaptive dynamic features of the anatomical structures, as illustrated by the video clips in Fig. 6.

Fig. 6 Fibre-field incorporated corotational finite element model (green dots are constraints).

Confocal laser endomicroscopy (CLE) is a minimally invasive optical technique that enables in vivo imaging of tissue structures and also holds potential for guided biopsy procedures. Application of CLE to imaging oral cavity lesions is reported by Thong et al. from Singapore General Hospital [23]. Also in the direction of accurate interventional imaging is the development of a fiber-optic bending sensor based on the propagation of the LP21 mode by Fan et al. from ZJU [24]. In the experiments, the new sensor achieves a sensitivity of 4.13 rad/m⁻¹ and is temperature-immune, and it can thus detect both bending direction and bending angle with a large dynamic range.

Upon acquisition of images and sensing of surgical tools, an interactive process relies on real-time feedback from the augmented imaging systems. GPU-based volume rendering algorithms have attracted researchers, especially in cloud environments that offer ubiquitous data processing and visualization capability. A pervasive computing solution is presented by Movania et al. [25, 26] for highly accurate, real-time volume rendering.

6 Concluding Remarks

Research topics addressed in the above sections include imaging instrumentation, registration, reconstruction, multimodality methods, noise filtering and image enhancement, segmentation, classification and feature detection, model-based imaging, as well as system development and acceleration technologies. While we are positive about recent reports on machine intelligence, we also present to readers both the achievements and the challenging issues in the much-discussed areas of data mining and deep learning. We thus leave open for readers the question of whether machine intelligence can work effectively in medical imaging and clinical diagnosis.