Developing and Preliminary Validating an Automatic Cell Classification System for Bone Marrow Smears: a Pilot Study

Bone marrow smear examination is an indispensable diagnostic tool in the evaluation of hematological diseases, but the manual differential count is labor-intensive. In this study, we developed an automatic system with integrated scanning hardware and machine learning-based software to perform differential cell counts on bone marrow smears to assist diagnosis. The initial development of the artificial neural network was based on 3000 marrow smear samples retrospectively archived from Sir Run Run Shaw Hospital, affiliated to Zhejiang University School of Medicine, between June 2016 and December 2018. The preliminary field validation test of the system was based on 124 marrow smears newly collected from the Second Affiliated Hospital of Harbin Medical University between April 2019 and November 2019. Automatic machine recognition was performed in parallel with the conventional manual differential count by pathologists using a microscope. We selected 600,000 representative marrow cell images as the training set for the algorithm, followed by 30,867 randomly captured cell images for validation. In validation, the overall accuracy of automatic cell classification was 90.1% (95% CI, 89.8–90.5%). In the preliminary field validation test, the reliability coefficient (ICC) of cell series proportions between the two analysis methods was high (ICC ≥ 0.883, P < 0.0001), and the results of the two analysis methods were consistent for granulocytes and erythrocytes. The system was effective in cell classification and differential cell count on marrow smears. It provides a useful digital tool for the screening and evaluation of various hematological disorders. Electronic supplementary material: The online version of this article (10.1007/s10916-020-01654-y) contains supplementary material, which is available to authorized users.


Preparation
To prepare for image acquisition with the acquisition terminal, the user enters patient information, such as smear number, medical record number, name, age, gender and date, then prints a QR code carrying the smear information and sticks it on the BM smear (see Figure 1a). The smear is then inserted into a global view box to set the number of nucleated cells to analyze and to select the analysis area of interest for scanning (see Figure 1a). The selection of the analysis area is critical because it directly affects the quality of the captured cell images, and it requires technical skill and experience. The system can select an appropriate analysis area (40× and 100×) automatically; alternatively, the user can manually select an appropriate analysis area (40×) at random and three regions of interest (100×) within that area on the slide (see Figures 7 and 8). Areas around the particles and the feathered edge of the film must be examined carefully (see Figures 9 and 10). In addition to the areas mentioned above, the head, the tail and the sides of the smear should also be scanned for abnormal cells, clumped immature cells, and cancer cells, including myeloma, lymphoma and metastatic cells (see Figure 9). The number of cells counted per smear was generally set to 500, except for smears with a very low or very high degree of myelodysplasia. For a myelodysplastic smear, up to 1000 nucleated cells could be counted to obtain more detailed and accurate clinical information. The system supports counts of 5000 cells or more (see Figures 3 and 10).

Figure 7 Automatic selection of the appropriate analysis area (40× plus 100×) on a smear with lymphoma.
The automatic area selection algorithm imitates a pathologist's behavior by choosing an area where nucleated cells are likely to be non-overlapping and evenly distributed. The area in green is the appropriate analysis area (40×) selected automatically by the algorithm, and the areas in cyan are the regions of interest (100×) selected automatically by the algorithm. [...] were selected manually, the algorithm automatically finds an appropriate area for a 100× field on the slide. Red boxes on the digital slide (40×) are the image fields actually captured with the 100× objective lens. These fields are near the fat drops or close to the tail of the slide.
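The selection criterion described in this caption (nucleated cells that are non-overlapping and evenly distributed) can be illustrated with a minimal sketch. This is not the authors' algorithm, which the text does not specify. It assumes a hypothetical per-tile map of nucleated cell counts from the low-magnification image, hypothetical thresholds `min_cells` and `max_cells` for tiles that are too sparse or too crowded, and the coefficient of variation as a stand-in measure of evenness.

```python
from statistics import mean, pstdev


def score_window(counts, min_cells, max_cells):
    """Return the coefficient of variation of the tile counts in a window,
    or None if any tile is too sparse or too crowded (likely overlapping)."""
    if any(c < min_cells or c > max_cells for c in counts):
        return None
    m = mean(counts)
    if m == 0:
        return None
    return pstdev(counts) / m


def select_analysis_area(density, window=2, min_cells=5, max_cells=40):
    """Pick the top-left (row, col) of the window x window block of tiles
    whose cell counts are all acceptable and most evenly distributed.

    `density` is a hypothetical 2D list of nucleated cell counts per tile.
    Returns None if no window qualifies.
    """
    best = None
    rows, cols = len(density), len(density[0])
    for r in range(rows - window + 1):
        for c in range(cols - window + 1):
            counts = [density[r + i][c + j]
                      for i in range(window) for j in range(window)]
            s = score_window(counts, min_cells, max_cells)
            # Lower coefficient of variation means a more even distribution.
            if s is not None and (best is None or s < best[0]):
                best = (s, r, c)
    return None if best is None else (best[1], best[2])


# Illustrative map: sparse tiles on the left, crowded tiles on the right.
density = [
    [0, 2, 60, 80],
    [3, 20, 22, 70],
    [4, 21, 20, 55],
    [1, 10, 30, 45],
]
area = select_analysis_area(density)
```

In this made-up map only two windows pass the count thresholds, and the one with the more uniform counts is chosen. A production system would presumably also weigh staining quality and proximity to particles, which this sketch ignores.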

Acquisition of digital whole slide images (40×)
Before whole slide imaging (WSI) scanning, the system takes 24 focusing reference points (4×6 grid, 6 mm spacing) on the analysis area of the smear to calculate the focal plane of the smear. The software then uses the autofocusing algorithm to focus precisely on the first cell located in the image and starts scanning across the analysis area. Finally, the software assembles all captured images into a seamless WSI.
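One common way to derive a focal plane from a grid of reference points is a least-squares fit of z = a·x + b·y + c to the measured focus heights. The text does not say which method the system uses, so the following is only a sketch under that assumption. The function names, the orientation of the 4×6 grid, and the example focus heights are all hypothetical.

```python
def reference_grid(rows=4, cols=6, spacing_mm=6.0):
    """Return (x, y) stage positions for a rows x cols grid of focus points.
    Which slide axis carries 4 points and which carries 6 is an assumption."""
    return [(c * spacing_mm, r * spacing_mm)
            for r in range(rows) for c in range(cols)]


def fit_focal_plane(points):
    """Least-squares fit of z = a*x + b*y + c to (x, y, z) samples.

    Coordinates are centered first, which reduces the problem to a
    2x2 linear system for the slopes a and b.
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    mz = sum(p[2] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points)
    syy = sum((p[1] - my) ** 2 for p in points)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    sxz = sum((p[0] - mx) * (p[2] - mz) for p in points)
    syz = sum((p[1] - my) * (p[2] - mz) for p in points)
    det = sxx * syy - sxy ** 2
    if det == 0:
        raise ValueError("reference points are collinear; plane is undefined")
    a = (sxz * syy - syz * sxy) / det
    b = (syz * sxx - sxz * sxy) / det
    c = mz - a * mx - b * my
    return a, b, c


def focus_at(plane, x, y):
    """Predicted focus height at stage position (x, y)."""
    a, b, c = plane
    return a * x + b * y + c


# Synthetic example: focus heights sampled from a slightly tilted slide.
grid = reference_grid()
samples = [(x, y, 0.002 * x - 0.001 * y + 1.5) for x, y in grid]
plane = fit_focal_plane(samples)
```

With a fitted plane, the expected focus height at any scan position can be looked up with `focus_at`, leaving the autofocus step to correct only local deviations.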

Acquisition of high-magnification cell images (100×)
When WSI scanning is complete, the system automatically switches from the 40× objective to the 100× oil-immersion objective. The amount of oil dropped on the slide is strictly controlled and monitored by an oil-dropping sensor. High-magnification cell images are then acquired automatically. When the number of nucleated cells in the images reaches the predetermined target, image acquisition and cell classification stop automatically.
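The stopping rule can be sketched as a simple accumulation loop. This is an illustration only: it assumes each 100× field yields a list of class labels from a hypothetical classifier, and the function names and the default target of 500 cells (the general setting mentioned under Preparation) are choices made for this example.

```python
from collections import Counter


def acquire_until_target(fields, target_cells=500):
    """Accumulate classified cells field by field and stop once the
    running total reaches the target.

    `fields` is an iterable in which each item is the list of class
    labels found in one field (hypothetical classifier output).
    Returns the per-class counts and the number of fields consumed.
    """
    counts = Counter()
    fields_used = 0
    for labels in fields:
        counts.update(labels)
        fields_used += 1
        if sum(counts.values()) >= target_cells:
            break  # target reached: stop acquisition and classification
    return counts, fields_used


def differential(counts):
    """Convert per-class counts into percentages of all counted cells."""
    total = sum(counts.values())
    return {label: 100 * n / total for label, n in counts.items()}


# Illustrative run: 10 identical fields available, target of 12 cells.
one_field = ["neutrophil"] * 3 + ["erythroblast"] * 2
counts, fields_used = acquire_until_target([one_field] * 10, target_cells=12)
percentages = differential(counts)
```

In this sketch the final field is counted in full, so the total can slightly exceed the target. Whether the real system truncates at exactly the requested number is not stated in the text.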

Review of cell count results and issuing a report
The user can modify the cell classification results in the acquisition terminal, or upload the cell images and analysis results to the review terminal, where experienced users review them and issue a BM report.