
1 Introduction

Medical imaging makes use of emerging technology to improve people’s health and quality of life. Computer-assisted diagnostic (CAD) systems in medicine are a good example. Scientists are increasingly using X-rays, magnetic resonance imaging (MRI), cardiac magnetic resonance imaging (CMRI), computed tomography (CT), mammography, and histopathology images (HIs).

Despite major breakthroughs in diagnosis and medical treatment, cardiovascular diseases (CVDs) remain the leading cause of death worldwide. According to a World Health Organization report, there were 17.9 million deaths attributed to CVDs in 2016. Cancer is another disease with a high mortality rate, with 9 million deaths. Both developed and developing countries are affected by cancer. Because of the increase in risk factors and the late detection of diseases, death rates in low- and middle-income nations are high. The early and precise detection of tumors and CVDs is the key to treatment and diagnostic decision making [1, 2].

Reviewing prior diagnostic data makes it possible to extract valuable information from it. Artificial intelligence (AI) applications in medical imaging have advanced rapidly in recent years as a result of technological advancements and increased computing capacity. Machine learning (ML) is applied in the image-based diagnosis procedure. Rather than relying on explicit programming, it learns from previous clinical examples to identify complex patterns in imaging data. As ML techniques ingest more training data, they can produce more precise models based on those training patterns. Existing reviews report the incremental value of image-based diagnosis using ML methods [3, 4].

1.1 Medical Imaging

Rapid tumor detection and diagnosis using image processing and machine learning techniques can now be an important tool in increasing cancer diagnostic accuracy. Medical imaging is used for clinical diagnosis, therapy, and identifying problems in various body parts.

The goal of medical imaging is to establish the location, size, and features of the tissue or organ in question. It is regarded as a good way to extract useful information from a vast volume of data. Consequently, some scientists have focused their efforts on creating and interpreting medical images in order to diagnose the vast majority of diseases. Medical images thus aid in illness identification, the detection of pathological abnormalities, and the treatment of patients in a clinical setting.

The techniques and methods used to acquire images of various parts of the human body for diagnostic purposes are referred to as medical imaging. Different radiological imaging techniques are included in medical imaging such as:

X-Ray.

The brighter areas on the X-ray are solid tissues, while the darker areas contain air or normal tissues. On an X-ray film of the chest, for example, many structures are readily visible, such as the heart, ribs, thoracic spine, and the diaphragm, which separates the chest cavity from the abdominal cavity. This can be used in lung infection detection [5].

CT/CMRI.

Significant aspects of the bodily organ, such as shape and size, must be understood in order to categorize the various disorders. Imaging modalities such as CT and CMRI are used to support the diagnosis of cardiac disease, and can therefore be used in CVD diagnosis [6,7,8].

Mammography.

Mammography is regarded as the simplest approach for early breast cancer diagnosis, using only a small amount of radiation. This radiographic breast examination helps detect any growth or lump in the early stages, even before it becomes obvious to the doctor or the woman herself, and the rays are not dangerous if used at yearly intervals, as recommended by the National Guidelines for early breast cancer diagnosis. Mammography is the only method that has been proved to be effective in reducing breast cancer mortality by detecting the disease early on. Although it cannot prevent cancer, it is the most successful approach for early detection of breast cancer [2, 9].

Histopathological Images (HI).

Despite fast developments in medical research, histology remains the gold standard for tumor identification. HI is a type of medical imaging that shows tissue from biopsies under a microscope. Pathologists can use these images to study tissue characteristics at the cellular level. Because HIs contain complicated geometric shapes and textures, they can be utilized to identify, monitor, and treat cancer in various organs such as the breast, lung, liver, and lymph nodes [10, 11].

1.2 Motivation

The purpose of this study is to show radiologists how to use machine learning techniques to enhance the rate of rapid and accurate cancer detection and CVD diagnosis and categorization. This research seeks to provide a review of novel applications of machine learning for the analysis of medical images, as well as an overview of progress in this field. This paper focuses on segmentation and feature extraction in multi-modal medical images of various areas of the human body that have lately been employed.

1.3 Paper Structure

The paper is structured as follows. Section 2 presents a taxonomy for categorizing ML algorithms for medical image analysis. Section 3 presents several supervised segmentation methodologies together with the supervised ML methods they use. Section 4 introduces unsupervised ML for segmentation and then presents various unsupervised segmentation algorithms that aim to find essential structures in medical images, which may aid diagnosis. The feature extraction methods used to describe HIs for further categorization using ML are presented in Sect. 5. Finally, in Sect. 6, the conclusions are stated.

2 Machine Learning

Machine learning (ML) is a type of data analysis that automates the generation of analytical system models. It’s a subset of AI that governs how a machine learns from data, recognizes patterns, and makes decisions with little or no human assistance. ML is used to provide a pathological diagnosis of malignancy in a variety of tissues and organs (breast, prostate, skin, brain, bones, liver, and others). Machine learning methods have been widely used in segmentation, feature extraction, and classification [12].

Machine learning methods fall into two types: unsupervised and supervised. Unsupervised learning (clustering) organizes and interprets data based solely on input data, whereas supervised learning (classification and regression) creates prediction models based on both input and output data.

ML methods for medical image analysis can be classified as illustrated in Fig. 1. Typically, pathology specialists are interested in tissue regions that are related to the condition being identified. The goal of medical segmentation is to label pixels with the structure that they could represent.

Fig. 1. ML for medical image analysis

Nucleus structure identification, for example, can be used to extract morphological information such as the number of nuclei per region and their size and shape, which can be particularly useful in evaluating a tumor’s diagnosis. Several segmentation methods are based on supervised or unsupervised machine learning techniques.
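As an illustration of how morphological information can be read off a segmentation result, the following minimal sketch counts connected regions in a binary mask and reports their sizes in pixels. The mask, the function name `label_regions`, and the choice of 4-connectivity are assumptions made for this example and are not taken from any of the reviewed methods.

```python
from collections import deque

def label_regions(mask):
    """Return the pixel count of each 4-connected foreground region."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                # Breadth-first flood fill over one connected region.
                queue = deque([(r, c)])
                seen[r][c] = True
                size = 0
                while queue:
                    y, x = queue.popleft()
                    size += 1
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                sizes.append(size)
    return sizes

# Hypothetical mask: 1 marks pixels already segmented as nucleus.
mask = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 1, 0, 0, 0],
]
sizes = label_regions(mask)
print("nuclei:", len(sizes), "sizes:", sizes)
```

The number of regions stands in for the number of nuclei, and each region size is a simple morphological feature; shape descriptors could be derived from the same regions.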

3 Supervised ML for Segmentation

Support vector machine (SVM), genetic algorithm (GA), decision trees (DT), regression trees (RT), and the k-nearest neighbors algorithm (k-NN) are some of the supervised machine learning algorithms used for segmentation.

SVMs are a type of learning machine that can recognize patterns and predict time series, among other things. Samples are mapped into a feature space, where the maximum-margin hyperplane separates the feature vectors; the support vectors are the selected samples that lie closest to this hyperplane and define it [12, 13].
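The relationship between the hyperplane, the margin, and the support vectors can be made concrete with a small sketch. It does not train an SVM: the hyperplane `w`, `b` is chosen by hand for a toy two-class data set, and the code only measures each sample's signed distance to it and picks out the closest samples. All names and values are assumptions for illustration.

```python
import math

def decision_value(w, b, x):
    """Linear decision function f(x) = w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def geometric_margins(w, b, samples, labels):
    """Signed distance of each sample to the hyperplane (positive if correctly classified)."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return [y * decision_value(w, b, x) / norm for x, y in zip(samples, labels)]

def support_vectors(w, b, samples, labels, tol=1e-9):
    """Samples whose distance to the hyperplane equals the smallest margin."""
    margins = geometric_margins(w, b, samples, labels)
    smallest = min(margins)
    return [x for x, m in zip(samples, margins) if m - smallest <= tol]

# Toy feature vectors with labels -1 / +1 (hypothetical data).
samples = [(0.0, 0.0), (2.0, 0.0), (3.0, 3.0), (5.0, 4.0)]
labels = [-1, -1, 1, 1]
# Hand-picked hyperplane x1 + 3*x2 = 7, the perpendicular bisector of the
# two closest samples from opposite classes; it is not learned here.
w, b = (1.0, 3.0), -7.0

margin = min(geometric_margins(w, b, samples, labels))
print("margin:", margin)
print("support vectors:", support_vectors(w, b, samples, labels))
```

In practice the hyperplane is found by an optimization routine that maximizes this smallest distance; the sketch only shows what is being maximized.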

GA is a search-based optimization technique based on the ideas of genetics and natural selection. It finds optimal or near-optimal solutions to complicated problems that would otherwise take a very long time to solve. GA searches a space of potential solutions to find one that solves the problem [14].
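A minimal sketch of such a search is given below, assuming candidate solutions encoded as bit strings, selection of the fitter half of the population, single-point crossover, bit-flip mutation, and elitism. The fitness function (agreement with a hypothetical target pattern) and all parameter values are placeholders chosen for illustration.

```python
import random

def genetic_search(fitness, n_bits, pop_size=20, generations=40,
                   mutation_rate=0.05, seed=0):
    """Evolve bit strings (n_bits >= 2) towards higher fitness."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]   # selection: keep the fitter half
        children = [population[0][:]]           # elitism: best survives unchanged
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)      # single-point crossover
            child = mother[:cut] + father[cut:]
            # Mutation: flip each bit with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = children
    best = max(population, key=fitness)
    return best, history + [fitness(best)]

# Hypothetical problem: recover a hidden bit pattern.
target = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1]

def fitness(candidate):
    return sum(a == b for a, b in zip(candidate, target))

best, history = genetic_search(fitness, n_bits=len(target))
print("best fitness:", history[-1], "of", len(target))
```

Because the best individual is always carried into the next generation, the best fitness never decreases from one generation to the next.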

DT is one of the predictive modelling methods. A DT moves from observations, which are represented by the tree’s branches, to inferences about the target value, which the tree’s leaves reflect. When the target variable is a discrete set of values, the DT is a classification tree. Class labels are the leaves in these tree structures, and feature combinations that lead to those class labels are the branches. When the target variable is a set of continuous values, such as real numbers, the DT is a regression tree (RT) [15].
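The branch-and-leaf structure can be sketched with the smallest possible classification tree: a single split on one feature with two leaves. The split criterion (weighted Gini impurity), the feature (a hypothetical nucleus area), and the class names are assumptions for this example; a full tree would apply the same split search recursively.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def majority(labels):
    # Sort first so that ties are broken deterministically.
    return max(sorted(set(labels)), key=labels.count)

def fit_stump(values, labels):
    """One-split tree; needs at least two distinct feature values."""
    best = None
    ordered = sorted(set(values))
    for low, high in zip(ordered, ordered[1:]):
        threshold = (low + high) / 2  # candidate branch condition
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or impurity < best[0]:
            # Each leaf predicts the majority class of the samples it receives.
            best = (impurity, threshold, majority(left), majority(right))
    return best[1:]

def predict(stump, value):
    threshold, left_label, right_label = stump
    return left_label if value <= threshold else right_label

# Hypothetical training data: nucleus area in pixels and its class.
areas = [12, 15, 18, 40, 45, 52]
classes = ["benign", "benign", "benign", "malignant", "malignant", "malignant"]

stump = fit_stump(areas, classes)
print(stump, predict(stump, 20), predict(stump, 60))
```

For a regression tree, each leaf would store the mean of the target values that reach it instead of a majority class.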

The k-NN method is non-parametric and is used for both classification and regression. In both cases, the input consists of the k closest training samples in the data set. The output depends on the mode (classification or regression). In k-NN classification, the output is a class membership: an object is assigned to the most common class among its k nearest neighbours by majority vote. In k-NN regression, the output is the value of a property for the object, computed as the average of the values of its k closest neighbours [12, 16].
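Both modes can be sketched in a few lines. The example assumes Euclidean distance, and the training points, class names, and property values are invented for illustration.

```python
import math
from collections import Counter

def nearest(train_x, query, k):
    """Indices of the k training samples closest to the query."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    return order[:k]

def knn_classify(train_x, train_y, query, k=3):
    # Majority vote among the k nearest neighbours.
    votes = Counter(train_y[i] for i in nearest(train_x, query, k))
    return votes.most_common(1)[0][0]

def knn_regress(train_x, train_y, query, k=3):
    # Average of the neighbours' property values.
    idx = nearest(train_x, query, k)
    return sum(train_y[i] for i in idx) / k

# Hypothetical 2-D feature vectors.
train_x = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
train_class = ["normal", "normal", "normal", "tumor", "tumor", "tumor"]
train_value = [1.0, 1.2, 0.8, 5.0, 5.5, 4.5]

query = (2, 2)
print(knn_classify(train_x, train_class, query))
print(knn_regress(train_x, train_value, query))
```

The same neighbour search serves both modes; only the way the neighbours' outputs are combined differs.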

The supervised segmentation algorithms are shown in Table 1 along with the ML methods they employ.

Table 1. Supervised segmentation approaches using different machine learning methods

4 Unsupervised ML Segmentation

Unsupervised machine learning (ML) segmentation discovers patterns in unlabeled data and can be divided into several types, such as k-means, general vector machine (GVM), mean shift, and thresholding. The k-means technique is an unsupervised clustering approach that has been used to segment pixel regions, for example to separate an object from the background. It divides the input data into K clusters, or groupings, based on K centroids. The method is employed when the data are unlabeled, i.e. have no established categories or groupings. The purpose is to locate specific groups based on some form of data similarity, with K being the number of groups [12].
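A minimal sketch of k-means used to separate an object from the background is shown below. It clusters grey-level intensities only, uses K = 2, and starts from centroids spread evenly over the intensity range; the tiny image and these choices are assumptions for illustration.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Cluster scalar values into k groups (k >= 2)."""
    lo, hi = min(values), max(values)
    # Deterministic start: spread the initial centroids over the value range.
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]

    def assign():
        # Assignment step: each value goes to its nearest centroid.
        return [min(range(k), key=lambda j: abs(v - centroids[j]))
                for v in values]

    for _ in range(iterations):
        labels = assign()
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:
                centroids[j] = sum(members) / len(members)
    return assign(), centroids

# Hypothetical grey-level image: bright object on a dark background.
image = [
    [10, 12, 11, 200],
    [9, 210, 205, 198],
    [13, 11, 202, 12],
]
width = len(image[0])
flat = [pixel for row in image for pixel in row]
labels, centroids = kmeans_1d(flat, k=2)
segmentation = [labels[r * width:(r + 1) * width] for r in range(len(image))]
for row in segmentation:
    print(row)
```

Each pixel receives the index of its cluster, so the result is a label image in which one cluster corresponds to the object and the other to the background.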

The GVM is used as a replacement for the SVM, in which support vectors of selected samples are separated by the greatest margin hyperplane. In the GVM, the support vectors are substituted by general project vectors chosen from the normal vector space, and the general vectors are found using a Monte Carlo (MC) process. GVM improves the capacity to extract features [26].

Given a set of data points, the mean shift approach iteratively shifts each data point towards the nearest cluster centroid, with the direction to the closest cluster centroid defined by where the majority of the neighbouring points are. Each iteration brings each data point closer to the cluster centre, which is where the most data points lie. Each point is assigned to a cluster when the algorithm finishes [27, 28].
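The iterative shift can be sketched for one-dimensional data with a flat (uniform) kernel: each point is repeatedly replaced by the mean of the original points that lie within a fixed bandwidth of it. The bandwidth, the data, and the rule for merging nearby modes into clusters are assumptions for this example.

```python
def mean_shift_1d(points, bandwidth, iterations=50, tol=1e-6):
    """Flat-kernel mean shift on scalar data; returns labels and cluster centres."""
    modes = []
    for p in points:
        x = float(p)
        for _ in range(iterations):
            # Neighbours of the current position within the bandwidth.
            window = [q for q in points if abs(q - x) <= bandwidth]
            new_x = sum(window) / len(window)  # shift to the neighbours' mean
            if abs(new_x - x) < tol:
                break
            x = new_x
        modes.append(x)

    # Points that converged to (nearly) the same mode share a cluster.
    centers, labels = [], []
    for m in modes:
        for i, c in enumerate(centers):
            if abs(m - c) <= bandwidth / 2:
                labels.append(i)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return labels, centers

# Hypothetical intensity samples forming two groups.
points = [1.0, 1.2, 0.8, 5.0, 5.1, 4.9]
labels, centers = mean_shift_1d(points, bandwidth=1.0)
print(labels, centers)
```

Unlike k-means, the number of clusters is not given in advance; it follows from the bandwidth and the data.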

The thresholding approach employs decision scores, which are the output of the decision function that is used to produce the prediction. The best score from the output of the decision function can be chosen as the value of the decision threshold. All decision score values less than this decision threshold are considered negative, and all decision score values greater than it are considered positive [29].
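The following sketch applies a decision threshold to a list of scores and, as one possible reading of choosing the "best" score, tries every observed score as a candidate threshold and keeps the one with the highest accuracy on a few reference labels. The accuracy criterion, the treatment of scores equal to the threshold as negative, and the data are assumptions made for this example.

```python
def apply_threshold(scores, threshold):
    """1 (positive) for scores above the threshold, 0 (negative) otherwise."""
    # Assumption: a score exactly equal to the threshold counts as negative.
    return [1 if s > threshold else 0 for s in scores]

def best_threshold(scores, truth):
    """Try each observed score as the threshold; keep the most accurate one."""
    best_t, best_acc = None, -1.0
    for t in sorted(set(scores)):
        predicted = apply_threshold(scores, t)
        accuracy = sum(p == y for p, y in zip(predicted, truth)) / len(truth)
        if accuracy > best_acc:
            best_t, best_acc = t, accuracy
    return best_t, best_acc

# Hypothetical decision scores and reference labels.
scores = [0.1, 0.35, 0.4, 0.8, 0.9]
truth = [0, 0, 0, 1, 1]

threshold, accuracy = best_threshold(scores, truth)
print(threshold, accuracy, apply_threshold(scores, threshold))
```

When no reference labels are available, the threshold has to be chosen from the score distribution itself, which this sketch does not attempt.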

Table 2 depicts the unsupervised segmentation methodologies as well as the machine learning methods employed.

Table 2. Unsupervised segmentation approaches using different machine learning methods

5 Feature Extraction ML

Before doing classification, some methods rely on feature extraction from raw data. Feature extraction methods aim to reduce the granularity of the input and highlight relevant information related to the problem, such as the presence or absence of a specific element, the amount of that element, texture, shape, histogram, and so on, while providing a form that is unaffected by changes like translation, scaling, and rotation.

Prior to classification, these problems require the translation of image pixels into meaningful features. Feature extraction methods take images and extract from them a reasonable number of characteristics that summarize the information they contain. Several different types of characteristics, such as shape, size, texture, fractal, and even a combination of these, have been used.
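As a small illustration of turning pixels into a compact feature vector, the sketch below computes a normalized intensity histogram and checks that it does not change when the image is rotated by 90 degrees. The number of bins, the 8-bit intensity range, and the sample image are assumptions for this example.

```python
def intensity_histogram(image, bins=4, max_value=256):
    """Normalized histogram of pixel intensities in [0, max_value)."""
    counts = [0] * bins
    total = 0
    for row in image:
        for pixel in row:
            counts[pixel * bins // max_value] += 1
            total += 1
    return [c / total for c in counts]

def rotate90(image):
    """Rotate an image (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

# Hypothetical 8-bit grey-level image.
image = [
    [0, 64, 128],
    [255, 70, 10],
]

features = intensity_histogram(image)
rotated_features = intensity_histogram(rotate90(image))
print(features)
print(features == rotated_features)
```

The histogram summarizes the whole image in a handful of numbers and ignores where each pixel is located, which is why rotating the image leaves it unchanged.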

  1. Feature Extraction Approaches

  2. Deep Learning Feature Extraction

In conclusion, due to the nature of medical images, particularly HIs, which contain complex geometric structures and textures, multiple types of characteristics need to be merged in many cases for further description. As shown in Table 3, different approaches extract several types of characteristics. Morphometric characteristics are useful for identifying geometric structures, but they are more difficult to obtain due to the extensive pre-processing required. Texture, on the other hand, is one of the most significant features for identifying items or regions of interest in an image.

Finally, the most recent techniques rely on deep feature extraction. They’re similar to a set of filters that extract geometric and textural features. As a result, deep features and deep approaches for medical image analysis appear to be quite promising.
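The filter analogy can be illustrated with a single hand-set convolution filter followed by a rectifier. In a deep network the filter weights are learned from data and many filters are stacked in layers; the vertical-edge kernel and the small image used here are assumptions chosen so that the response is easy to read.

```python
def convolve2d(image, kernel):
    """Valid-mode 2-D cross-correlation, the core operation of a convolutional layer."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def relu(feature_map):
    """Keep positive responses and set negative ones to zero."""
    return [[max(0, v) for v in row] for row in feature_map]

# Hypothetical image with a dark-to-bright vertical boundary.
image = [
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
    [0, 0, 0, 9, 9],
]
# Hand-set filter that responds to dark-to-bright vertical edges.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = relu(convolve2d(image, vertical_edge))
print(feature_map)
```

The response is zero over the uniform region and positive where the filter overlaps the boundary, which is the sense in which a filter extracts a geometric feature.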

6 Conclusion

Radiologists use different imaging modalities to study organ or tissue structure. The significance of each imaging modality varies depending on the medical field. This review provides a brief description of the significance of medical images from multiple modalities covering different parts of the human body: X-ray, CT, MRI, CMRI, mammography, and HI.

This review divides ML applications into supervised segmentation, unsupervised segmentation, and feature extraction approaches, and describes the various ML methods used in order to offer a summary of development in this area.

Several supervised ML methods are used for segmentation, such as SVM, GA, DT, RT, and k-NN. On the other hand, unsupervised ML segmentation methods can be divided into methods such as k-means, GVM, mean shift, and thresholding.

Textural characteristics are crucial in segmentation. Morphometric characteristics, on the other hand, are crucial for identifying geometric structures, but they are more difficult to collect due to the need for extensive pre-processing. Finally, the most current feature extraction techniques use deep features to describe organ or tissue details. They act like a series of filters for detecting geometric structures and textures. This review also shows that some deep feature extraction algorithms for medical image analysis appear to be extremely promising.

Table 3. Feature extraction approaches including deep learning approaches applied on histological images of different parts in human body to extract different features