Abstract
The field of forensic science is experiencing significant growth, largely driven by the increasing integration of holographic and immersive technologies, along with their associated head-mounted displays. These immersive systems have become increasingly vital in resolving critical crimes, as they facilitate communication, interaction, and collaboration. Given the sensitive nature of their work, crime investigators require substantial technical support. There is a pressing need for accurate documentation and archiving of crime scenes, which can be addressed by leveraging 3D scanned scenes to accurately represent evidence and expected scenarios. This study aims to develop an enhanced AR system that can be deployed on hologram facilities such as the Microsoft HoloLens. The proposed system encompasses two main approaches, namely image classification and image segmentation. Image classification utilizes various deep learning models, including lightweight convolutional neural networks (CNNs) and convolutional Long Short-Term Memory (ConvLSTM) networks. The image segmentation approach is based on the fuzzy active contour model (FACM). The effectiveness of the proposed system was evaluated for both classification and segmentation tasks, utilizing metrics such as accuracy, sensitivity, precision, and F1 score. The simulation results indicate that the proposed system achieved a 99% accuracy rate in classification and segmentation tasks, positioning it as an effective solution for detecting bloodstain patterns in AR applications.
1 Introduction
Crime is a pervasive global social issue. It impacts a nation's quality of life, including its economic prosperity and reputation. In addition, crime rates have increased dramatically over the past few years [1, 2]. Law enforcement must take preventive steps to minimize the crime rate, and advanced systems and innovative techniques are needed to enhance crime analytics and protect communities.
Augmented Reality (AR) and Virtual Reality (VR) are advanced technologies that have been used for crime scene investigation and analysis. AR is a technology similar to VR that displays holograms [3]. Unlike VR, which completely replaces the user's visual surroundings, AR blends the real world and the digital environment [4]. AR is defined in [5] as a technology that enables real-time viewing of, and interaction with, virtual visuals superimposed on the real world. AR systems bring virtual objects to the user, who remains physically present in the real world, because an AR device requires a positioning solution relative to some real-world reference [6]. AR technology enables the creation of novel collaborative experiences in which co-located people can interact with and view 3D virtual objects. Annotating a live video also enables a remote user to work with a local user, enhancing the face-to-face collaborative experience [7].
According to Locard's exchange principle, it is hard to interact with a scene without exchanging some material substance. Forensic investigations at crime sites are based on this principle. Among the common material substances, blood is central: bloodstain pattern analysis examines the lasting traces of transient blood loss. To assist in reconstructing a murder scene, bloodstain patterns may be examined to establish whether witness and victim testimony is believable. According to the International Association of Bloodstain Pattern Analysts (IABPA), a bloodstain pattern is defined as "a grouping or distribution of bloodstains that reveals the method in which the pattern was deposited by regular or repeating form, order, or arrangement." If the victim is found to have suffered blunt force trauma, a variety of common bloodstain patterns can be detected at the crime scene.
Bloodstain pattern recognition is relevant to crime scene investigation. A large loss of blood almost always accompanies violent criminal activity. When large amounts of blood gather and then dry on a surface, a saturation stain may form on a rug or linen. Blunt-ended objects like golf clubs, candelabra, and similar instruments can cause severe blood loss from head injuries. What kind of materials may be uncovered in a given case depends on who the victim, offender, or witness (if any) is. The impact mechanism, or more accurately the impact force, stains the surface; the term "Impact Pattern" refers to the fact that the pattern is the result of something impacting the blood [2]. The IABPA uses the term "Transfer Stains" to describe the patterns of bloodstaining that develop when a bloodied object touches another surface. Bloody fingers, weapons, partially bloodied shoes, and other bloody materials at the crime scene can all leave transfer stains. A crime scene can be partially or virtually rebuilt by using transfer stains, together with voids, saturation stains, and cast-off patterns.
Object segmentation aims to partition an image into meaningful regions corresponding to different objects. Wang et al. [8] provide a comprehensive review of modern object segmentation approaches. Their work covers a wide range of techniques, including region-based, contour-based, and deep learning-based methods. They discuss the strengths and limitations of each approach, emphasizing the importance of context-aware segmentation for accurate object delineation.
Csurka et al. [9] present a survey spanning two decades of research in semantic image segmentation. They trace the evolution of segmentation methods, from early handcrafted features to the recent surge in deep neural networks. The authors highlight key milestones, such as the introduction of fully convolutional networks (FCNs) and the development of large-scale annotated datasets. Understanding this historical context provides valuable insights into the challenges faced by researchers and the progress made over time.
Referring image segmentation involves localizing objects based on natural language descriptions. Although not as extensively studied as other segmentation tasks, it has gained attention due to its practical applications. The survey in [10] explores referring image segmentation methods, which leverage multimodal information by combining visual cues with textual descriptions. Investigating this area can lead to innovative solutions that bridge the gap between language and vision.
Accurate evaluation of segmentation results is crucial for assessing algorithm performance. Huang et al. [11] propose deep neural networks for target segmentation evaluation. They discuss various evaluation metrics, including intersection over union (IoU), Dice coefficient, and pixel-wise accuracy. Researchers must choose appropriate metrics based on their specific segmentation task and dataset. Understanding these evaluation techniques ensures rigorous assessment of segmentation models.
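As a concrete illustration, the three metrics mentioned above can be computed for binary masks in a few lines. The following is a minimal numpy sketch (the function name and edge-case conventions are our own, not taken from [11]):

```python
import numpy as np

def segmentation_metrics(pred, gt):
    """Compute IoU, Dice coefficient, and pixel-wise accuracy
    for two binary segmentation masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    inter = np.logical_and(pred, gt).sum()      # overlapping foreground pixels
    union = np.logical_or(pred, gt).sum()       # pixels foreground in either mask
    iou = inter / union if union else 1.0       # convention: empty masks match
    denom = pred.sum() + gt.sum()
    dice = 2 * inter / denom if denom else 1.0
    acc = (pred == gt).mean()                   # fraction of agreeing pixels
    return float(iou), float(dice), float(acc)
```

For example, a prediction overlapping the ground truth in one of three foreground pixels yields IoU = 1/3 and Dice = 0.5, illustrating that Dice rewards overlap more generously than IoU.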
Financing is an issue that may confront the HoloLens, a Microsoft tool, as a real-life application of augmented reality devices [12]. Equipping each investigator with a headset is quite burdensome, so relying on the HoloLens for all investigations might be difficult for crime scene units. This may be overcome by creating a phone-based AR application, or by enabling other officers to use remote assistance on their own devices, such as Dynamics 365, to collaborate instantly without using a HoloLens. However, all related personnel must undergo training in using both the lens and the remote-assistance software to ensure the viability of collected evidence [13]. The study in [14] addressed this by using remote spatial interaction with the physical scene offered by mediated reality.
Recent years have seen a surge in the development of deep learning architectures for hyperspectral image classification [15, 16]. These works provide an up-to-date description of current hyperspectral classification architectures and a discussion of outcomes on common datasets. Neural network designs for well-known Hyperspectral Image (HSI) datasets are examined in [17]. Another intriguing aspect of that research is its focus on the effects of data augmentation, transfer learning, and residual learning on classification accuracy. Learning from a restricted training set is critical in HSI classification, since training labels are typically scarce [18]. It is encouraging that new designs, such as [19], are emerging that can reduce the need for many labeled samples. Hybrid dilated residual networks, described in [20], are another innovative and fascinating technique.
Considering this discussion, it can be observed that there is a need to investigate automatic solutions based on image classification and segmentation to support the crime investigators in their work. Therefore, this paper aims to provide an efficient integrated system for bloodstain pattern recognition in crime scenes based on machine learning and deep learning algorithms. Additionally, we propose an image segmentation approach based on FACM to provide an integrated system.
The rest of the paper is organized as follows: Sect. 2 critically reviews the related works. Section 3 presents the proposed AR crime investigation system. Then, the experimental analysis, including hardware and software specifications, is given in Sect. 4. Section 5 is devoted to the discussion of this study and its findings. The conclusion and future perspectives are drawn in Sect. 6.
2 Related work
2.1 Technical review
As this study focuses on the application of AR in crime scene investigation, it is prudent to evaluate the relevant literature. Mixed Reality (MR) is a practical method that enables remote collaboration through nonverbal communication, according to [20]. Their research centered on integrating multiple types of MR remote collaboration approaches, allowing for the expansion of MR's capabilities and user experience with a new variety of remote collaboration. In [20], the authors presented an MR system that incorporated 360-degree panorama photos into 3D reconstructed scenes. Using a novel technique that interacts with several 360-degree panoramic fields within these reconstructed scenes, a remote user can transition between many 360-degree scenes, such as live and previously captured views, thereby fostering a better understanding of space and interactivity.
The authors, in [21], provided an innovative MR analysis system that gives 3D representations of several users in a collaborative setting. This strategy emphasized the importance of data on persons' movements and behavior and their interactions with digital objects. The authors recognized the inadequacy of other methods of analysis for this objective, and as a result, a novel device representing individuals using head-mounted gadgets was created. According to this work, the qualities of an MR device, such as the procedures for analysis, cannot be found in other equipment.
In contrast, [22] characterized the fundamental purpose of human factors research as the development and deployment of Virtual Environments (VE) to improve human lives in various settings. In [7], the authors demonstrated the availability and presence of the AR devices explored in [22]. Typically, 2D peripherals are worn by professionals who access video feeds from 3D head-mounted devices and enhance them with spoken or digital data. A relevant concern is whether these devices can also be utilized for remote consultations; therefore, the authors in [7] re-evaluated these devices and found that participants preferred particular settings despite comparable usability scores.
2.2 Crime scene investigation review
Numerous studies, such as [23, 24], have emphasized using AR technology to investigate crime scenes. Recent research, like [25], demonstrated that AR technology could enable dispersed teams to conduct crime scene investigations. The authors in [24] described crime scene investigation as a strategy for comprehending and enhancing locations.
It is essential to record photographs and videos of the crime scene in order to thoroughly examine the digital evidence for possible clues. An approach was developed in [26] that uses the collected footage to create a 3D representation of the crime scene. The results indicated that a realistic reconstruction can be obtained with advanced computer vision algorithms. This objective was reflected in [27], which investigated the usage of an AR annotation tool motivated by the need for forensic specialists to collect crime evidence promptly and contamination-free. This application enables forensic specialists to tag and share evidence at crime scenes. Using a qualitative methodology, [27] discovered that annotation could result in improved crime scene orientation, a streamlined collection procedure, and reduced administrative pressure. Existing annotation prototypes are technically limited due to time-consuming feature tracking, but AR annotation is more promising, usable, and valuable for analyzing crime scenes. This is supported by [23], which emphasized the enhanced utility and effectiveness of forensic simulations and crime scene investigation in virtual environments utilizing augmented reality techniques. With AR technology, useful tools, and quick access to key databases, law enforcement and investigation personnel can mark and highlight evidence and conduct real-time examinations.
Alternatively, [28] illustrated how 3D documentation and data integration resolved reconstructive issues regarding the progression of pattern injuries. Moreover, [21] exhibited an MR system employing 360 panorama photographs in 3D reconstructed landscapes, allowing a remote user to switch between various 360 sceneries. Their study centered on integrating several forms of MR remote collaboration technologies, enabling the growth of MR's capabilities and user experience with a new sort of remote collaboration. They described MR as a plausible method of facilitating distant collaboration through nonverbal communication. AR and MR as new tools for combating crime and terrorism were discussed in [29].
2.3 Bloodstain pattern age recognition review
Detecting the age of a bloodstain is an important issue in crime investigation. The detection can be performed by several non-automatic experimental techniques, such as white blood cell and blood plasma tests [30]. Another method, proposed in [31] for Bloodstain Pattern Analysis (BPA), is based on Raman spectroscopy. This method takes a long time, since the blood sample must be extracted from the crime scene and transported to the experimentation platform before a result is obtained. In addition, it is not cost-effective due to the high cost of the required instruments. Therefore, there is a need to investigate automatic and cost-effective methods for bloodstain recognition and detection.
There are a few works in the literature that investigated automatic BPA [32, 33]. One notable system, presented by Gee et al. in [34], utilizes AR to enable in-situ 3D annotation of physical objects and environments. This system integrates GPS and UWB positioning technology with real-time computer vision to create a virtual incident map. Investigators can collaboratively create a scene map with the help of a centralized control component.
Another system, known as IC-CRIME [35], supports investigators in creating a detailed and interactive re-creation of a crime scene's physical space. This is achieved through a combination of laser telemetry scans, digital photographs, and user-generated annotations [35].
While systems for presenting crime scene data enhance understanding of crime events, they do not actively assist in the investigative process. To address this gap, researchers have focused on developing software tools to support the processing and analysis of crime scenes. Two widely used tools in this regard are HemoSpat and BackTrack, both of which offer a graphical user interface and automate certain calculations related to BPA. However, these existing tools still have drawbacks. On-scene actions, such as manually measuring stain coordinates, drawing reference lines, and capturing images without perspective distortion, remain tedious. Additionally, substantial user input is required for tasks like indicating reference lines, delineating scales, entering stain coordinates, and guiding the semi-automatic ellipse fitting process.
In recent years, attempts have been made to address these drawbacks and make BPA fully automatic. One approach proposed in [36] employs computer vision techniques to analyze individual stains in a crime scene and calibrate multiple spatter images into an overhead picture with a unified coordinate frame. Although the authors claim to obtain the region of origin, no error evaluation is provided for the fully reconstructed result.
The study in [37] aims to simplify BPA calculations, particularly for inexperienced users, using a simple image processing algorithm. This approach involves a four-step analysis process: blood color identification, marker identification, major axis angle calculation, and impact angle calculation. The proposed approach achieves approximately 10% error, compared to the 2% error obtained by the manual process.
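For context, the impact-angle step of such a pipeline typically relies on the classical BPA relation between the elliptical stain's minor-axis width \(w\) and major-axis length \(l\), namely \(\alpha =\mathrm{arcsin}\left(w/l\right)\). A minimal sketch (the function name and input validation are our own, not from [37]):

```python
import math

def impact_angle_deg(width, length):
    """Estimate the impact angle of an elliptical bloodstain (in degrees)
    from its minor-axis width and major-axis length, using the classical
    BPA relation: alpha = arcsin(width / length)."""
    if not 0 < width <= length:
        raise ValueError("width must be positive and no larger than length")
    return math.degrees(math.asin(width / length))
```

A circular stain (width equal to length) thus corresponds to a perpendicular, 90-degree impact, while a highly elongated stain indicates a shallow glancing angle.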
Fiducial markers and digital images in an automated and virtual framework are utilized in [38]. Fiducial markers are placed within a crime scene to establish a global coordinate frame, and individual stains are analyzed using an active bloodstain shape model (ABSM). The estimated impact and glancing angles are used to approximate the stains' flight paths linearly. Experiments involving synthetic crime scenes demonstrate the potential of this approach in analyzing bloodstain spatter patterns. However, no quantitative evaluation of error is performed against ground truth data. Like the previous approaches, this work automates only certain steps of BPA and does not address the capturing of digital images.
An examination of existing literature reveals that a limited number of prior studies have explored the subject of bloodstain segmentation in color images. The methods proposed in these studies primarily relied on skin color detection, specifically face segmentation [39]. Notably, these methods focused on determining the appropriate color space and thresholds for detecting pixels similar to red [40,41,42]. Additionally, the fast 8-connected component labeling method was employed to identify the suspected bloodstain region.
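To make this family of methods concrete, the sketch below thresholds "red-like" pixels in an RGB image and then labels 8-connected regions, mimicking the two steps described above. The specific threshold values and function names are illustrative assumptions, not the parameters used in [40,41,42]:

```python
import numpy as np
from collections import deque

def bloodstain_mask(rgb, r_min=90, rg_gap=40, rb_gap=40):
    """Threshold 'red-like' pixels in an RGB image of shape (H, W, 3).
    The thresholds are illustrative placeholders, not values from the cited works."""
    r = rgb[..., 0].astype(int)
    g = rgb[..., 1].astype(int)
    b = rgb[..., 2].astype(int)
    return (r >= r_min) & (r - g >= rg_gap) & (r - b >= rb_gap)

def label_8_connected(mask):
    """Simple BFS-based 8-connected component labeling of a boolean mask.
    Returns a label map (0 = background) and the number of components."""
    labels = np.zeros(mask.shape, dtype=int)
    current = 0
    h, w = mask.shape
    for i in range(h):
        for j in range(w):
            if mask[i, j] and labels[i, j] == 0:
                current += 1
                queue = deque([(i, j)])
                labels[i, j] = current
                while queue:
                    y, x = queue.popleft()
                    for dy in (-1, 0, 1):       # visit all 8 neighbors
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if 0 <= ny < h and 0 <= nx < w \
                                    and mask[ny, nx] and labels[ny, nx] == 0:
                                labels[ny, nx] = current
                                queue.append((ny, nx))
    return labels, current
```

Each labeled component is then a candidate bloodstain region whose shape can be analyzed further (e.g., by ellipse fitting).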
Deep learning has been applied in several research fields, such as medical applications [43,44,45], human biometric recognition [46,47,48], behavior detection [49, 50], and cybersecurity [51, 52]. Due to its efficiency and accuracy, this paper deploys several deep learning methods for BPA to detect the age of the bloodstain and recognize it among the objects in the video frames.
After a thorough analysis of all studies employing augmented reality technology with 3D scanning techniques, only a few studies have provided evidence of achieving a high level of investigation-process efficiency by assisting investigators with communication, interaction, and collaboration among local and remote officers. This study introduces a new type of communicative and collaborative investigation to explore new ways of enhancing investigative efficiency and speed in determining the actual crime scenario. Specifically, we propose an enhanced AR system for effective communication and interaction in visual-spatial crime scenes. The contributions of this work are as follows:
1. To design a classification model with low processing time based on lightweight networks.
2. To compare the designed model with traditional deep learning models such as CNNs and LSTMs.
3. To design an image segmentation method based on machine and deep learning methods.
4. To propose a combined bloodstain pattern detection system suitable for embedded systems.
3 The proposed crime investigation system
This paper proposes a system for automatic bloodstain pattern detection based on artificial intelligence. The system comprises two main approaches, image classification and image segmentation, as shown in Fig. 1. The classification task identifies the age of a bloodstain on a given surface among seven categories, while the segmentation task produces a contour surrounding the bloodstain.
The classification approach consists of three stages. The first stage fragments the input videos into frames. The second stage is data pre-processing, which shuffles and splits the fragmented frames into train, validation, and test subsets. The strategy is to split the dataset into 80% for training and 20% for testing. The validation process comprises k-fold cross-validation and hold-out validation techniques. The k-fold validation is performed with k = 10, while the hold-out validation is performed on the train and test subsets, taking the average performance over the executed simulation runs. The third stage employs machine learning models to classify the video frames. In this task, we deployed several machine learning models, namely SVM, KNN, RF, DT, MLP, QDA, and LR. In addition, we designed several deep learning models comprising lightweight spatial models such as shallow networks, ConvLSTM, CNN, and CNN-ConvLSTM, along with some depthwise CNNs.
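The shuffling, 80/20 hold-out split, and 10-fold cross-validation described above can be sketched as follows. This is a minimal numpy illustration; the seed and function names are our own, not part of the proposed system:

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed for reproducibility (illustrative)

def split_frames(frames, train_ratio=0.8):
    """Shuffle the fragmented frames and hold out 20% for testing,
    mirroring the 80/20 strategy described in the text."""
    idx = rng.permutation(len(frames))
    cut = int(train_ratio * len(frames))
    return idx[:cut], idx[cut:]          # train indices, test indices

def kfold_indices(n, k=10):
    """Yield (train, validation) index arrays for k-fold cross-validation."""
    idx = rng.permutation(n)
    folds = np.array_split(idx, k)
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, val
```

In practice, library routines (e.g., scikit-learn's `train_test_split` and `KFold`) provide the same functionality with stratification options.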
The second approach is image segmentation, which is based on the FACM and consists of five stages. The first stage preprocesses the input images, while the second stage applies the fuzzy membership function. The third and fourth stages initialize and evaluate the generated contour, respectively. The fifth stage post-processes the segmented image.
3.1 Proposed DLMs
3.1.1 Lightweight CNNs
This section discusses shallow networks based on lightweight architectures. The main objective of this method is to design a lightweight model with low processing time. To accomplish this objective, we designed depthwise networks based on CNNs.
Depthwise convolutional neural networks (DCNNs) are a variant of CNNs that have gained popularity in recent years due to their improved efficiency and reduced computational cost. Unlike traditional CNNs, which perform convolution operations on the entire input feature map with a fixed number of filters, DCNNs perform convolutions separately for each input channel with a much smaller number of filters. This results in a significant reduction in the number of parameters and computations required, making DCNNs ideal for mobile and embedded devices.
The depthwise convolution operation can be represented mathematically as follows:
$$ {y}_{i,j,k}=\sum_{r=1}^{R}\sum_{s=1}^{S}{h}_{r,s,k}\,{x}_{i+r,j+s,k} $$
where \({x}_{i,j,k}\) is the value of the input feature map at spatial location \(\left(i,j\right)\) and channel \(k\), \({h}_{r,s,k}\) is the value of the depthwise convolution filter at spatial offset \(\left(r,s\right)\) and channel \(k\), and \({y}_{i,j,k}\) is the output value at the spatial location \(\left(i,j\right)\) and channel \(k\). \(R\) and \(S\) are the spatial dimensions of the filter, which are typically much smaller than the spatial dimensions of the input feature map.
In a depthwise convolutional layer, the input feature map has \({C}_{in}\) channels, and the depthwise convolution is applied separately to each channel using its own filter. The resulting feature map has the same spatial dimensions and the same number of channels as the input, but the operation involves far fewer parameters and computations than a standard convolution. The output feature map is then fed to a pointwise convolutional layer, which applies a \(1\times 1\) convolution with \({C}_{out}\) filters to combine the channels into a new feature map.
The pointwise convolution operation can be represented mathematically as follows:
$$ {z}_{i,j,l}=\sum_{k=1}^{{C}_{in}}{w}_{k,l}\,{y}_{i,j,k} $$
where \({w}_{k,l}\) is the value of the pointwise convolution filter at input channel \(k\) and output channel \(l\), and \({z}_{i,j,l}\) is the output value at spatial location \(\left(i,j\right)\) and output channel \(l\). The pointwise convolution effectively performs a linear combination of the reduced channels to generate the final output feature map.
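The two operations can be illustrated with a direct (unoptimized) numpy sketch, assuming "valid" padding and the symbol conventions used above; real implementations vectorize these loops:

```python
import numpy as np

def depthwise_conv2d(x, h):
    """Valid depthwise convolution.
    x: input feature map (H, W, C_in); h: filters (R, S, C_in).
    Each channel k is convolved only with its own filter h[:, :, k]."""
    H, W, C = x.shape
    R, S, _ = h.shape
    out = np.zeros((H - R + 1, W - S + 1, C))
    for i in range(H - R + 1):
        for j in range(W - S + 1):
            for k in range(C):
                out[i, j, k] = np.sum(x[i:i+R, j:j+S, k] * h[:, :, k])
    return out

def pointwise_conv2d(y, w):
    """1x1 convolution mixing channels: y has shape (H, W, C_in),
    w has shape (C_in, C_out); channels are linearly combined per pixel."""
    return np.tensordot(y, w, axes=([2], [0]))
```

Together, `depthwise_conv2d` followed by `pointwise_conv2d` forms the depthwise-separable block that underlies the efficiency gain discussed above.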
In this research, a DCNN architecture is proposed, comprising several layers, including a depthwise 2D convolutional layer. The depth of this network is set to 192, and the input size is 224 × 224. Additionally, a max pooling layer with a kernel size of 2, a ReLU activation layer, a flatten layer, and a dense layer are included in the model. The proposed DCNN has a total of 16,860,871 parameters, all of which are trainable (Fig. 2).
3.1.2 Deep learning models
This paper provides a deep learning-based technique for augmented reality. CNN and convolutional LSTM (ConvLSTM) networks are the core of the suggested deep learning strategies. The current state of the art is to design a deep learning model that extracts feature maps from input images and enrolls these feature maps into a classification network to distinguish between normal and abnormal states. The performance of this design is determined by its ability to differentiate between normal and abnormal states with minimal false-positive rates. The key contribution is consequently the development of an efficient deep learning framework. Hierarchically organized convolutional, pooling, and ConvLSTM layers constitute this architecture. In addition, a classification network manages the architecture's feature maps and assesses whether the input images are normal.
-
CNN Models
CNNs have become a powerful tool for image processing and recognition tasks. A 2D CNN is a type of CNN that is specifically designed to process two-dimensional data, such as images. In this section, we provide an overview of the mathematical concepts behind 2D CNNs. A 2D CNN consists of several layers, including convolutional layers, pooling layers, and fully connected layers. The convolutional layers are responsible for detecting features in the input image, while the pooling layers are used to reduce the dimensionality of the output of the convolutional layers. The fully connected layers are used to classify the image based on the features detected in the convolutional and pooling layers. The main operation in a 2D CNN is the convolution operation, which is defined as follows:
$$\left(f*g\right)\left(i,j\right)=\sum_{m}\sum_{n}f\left(i-m,j-n\right)\,g\left(m,n\right)$$
where \(f\) and \(g\) are two matrices, and \(*\) represents the convolution operation. The convolution operation involves sliding the filter \(g\) over the input matrix \(f\) and computing the dot product between the filter and the corresponding portion of the input matrix. The convolutional layer in a 2D CNN consists of a set of filters, each of which is responsible for detecting a particular feature in the input image. Let \(F\) be the set of filters, and let \(f\) be the input image. The output of the convolutional layer can be computed as follows:
$$ {H}_{i,j,k}=\sigma \left(\sum_{m}\sum_{n}{F}_{m,n,k}\,{f}_{i-m,j-n}+{b}_{k}\right) $$
where \({H}_{i,j,k}\) is the output feature map, \({F}_{m,n,k}\) is the weight of the \(k\)-th filter at position \(\left(m,n\right)\), \({f}_{i-m,j-n}\) is the input pixel value at position \(\left(i-m,j-n\right)\), \({b}_{k}\) is the bias term for the \(k\)-th filter, and \(\sigma\) is the activation function. Common activation functions include ReLU, sigmoid, and tanh. After the convolutional layer, the output feature map is typically passed through a pooling layer, which is used to reduce the spatial dimensionality of the feature map. The most common type of pooling is max pooling, which involves taking the maximum value in each sub-region of the feature map. Let \(H\) be the output of the convolutional layer, and let \(P\) be the output of the pooling layer. The output of the pooling layer can be computed as follows:
$$ {P}_{i,j,k}=\underset{\left(m,n\right)}{\mathrm{max}}\,{H}_{i\cdot s+m,\,j\cdot s+n,\,k} $$
where s is the stride of the pooling operation, which determines the amount of overlap between adjacent sub-regions. In conclusion, 2D CNNs are a powerful tool for image processing and recognition tasks, and are based on the mathematical concepts of convolution and pooling. These operations are used to detect features in the input image and reduce its dimensionality, leading to better accuracy and efficiency in image classification tasks.
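A minimal numpy sketch of a single-filter convolutional layer followed by max pooling, assuming "valid" padding and a ReLU activation (the helper names and defaults are illustrative; like most deep learning libraries, the sliding dot product below is technically cross-correlation):

```python
import numpy as np

def conv2d_single(f, g, bias=0.0, act=lambda v: np.maximum(v, 0.0)):
    """Slide one filter g over image f (valid padding), computing the dot
    product at each position, then add a bias and apply a ReLU activation."""
    H, W = f.shape
    m, n = g.shape
    out = np.zeros((H - m + 1, W - n + 1))
    for i in range(H - m + 1):
        for j in range(W - n + 1):
            out[i, j] = np.sum(f[i:i+m, j:j+n] * g)
    return act(out + bias)

def max_pool2d(H, size=2, stride=2):
    """Max pooling: take the largest value in each size x size window,
    moving the window by `stride` pixels."""
    h, w = H.shape
    oh = (h - size) // stride + 1
    ow = (w - size) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = H[i*stride:i*stride+size, j*stride:j*stride+size].max()
    return out
```

Stacking such conv-pool pairs, as in the models below, progressively shrinks the spatial dimensions while building richer feature maps.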
CNN is the foundation of the first proposed deep learning model. Five convolutional layers (CNV) are followed by five pooling layers (PL) in this model. This hierarchy is implemented to extract features from the input images and build a feature map that is then enrolled in the classification network. Each CNV layer of the deep learning architecture generates a feature map with as many channels as the layer has digital filters. In addition, the pooling layers are used to reduce the number of features. A pooling layer can be implemented using two distinct methods: max pooling and mean pooling. The feature map is divided into rectangular windows of a specific size. The max-pooling technique extracts the largest value from each window, whereas the mean-pooling technique extracts the window's mean value.
The classification network comprises two layers: a fully connected layer and a classification layer. The fully connected layer manages the feature map generated by the hierarchy of convolutional and pooling layers, turning the 3D feature map into a feature vector. This vector is then enrolled in the classification layer, which determines whether the input image belongs to the normal or abnormal category (Fig. 3).
-
ConvLSTM Model
Convolutional LSTM is a type of recurrent neural network (RNN) that can process sequential data with both spatial and temporal dependencies. It is commonly used in various fields such as computer vision, natural language processing, and speech recognition. The convolutional LSTM architecture combines the concepts of CNNs and LSTMs, allowing for the efficient processing of spatially and temporally correlated data.
The convolutional LSTM can be expressed mathematically as:
$$\begin{aligned}{i}_{t}&=\sigma \left({W}_{xi}*{x}_{t}+{W}_{hi}*{h}_{t-1}+{b}_{i}\right)\\ {f}_{t}&=\sigma \left({W}_{xf}*{x}_{t}+{W}_{hf}*{h}_{t-1}+{b}_{f}\right)\\ {o}_{t}&=\sigma \left({W}_{xo}*{x}_{t}+{W}_{ho}*{h}_{t-1}+{b}_{o}\right)\\ {c}_{t}&={f}_{t}\circ {c}_{t-1}+{i}_{t}\circ \mathrm{tanh}\left({W}_{xc}*{x}_{t}+{W}_{hc}*{h}_{t-1}+{b}_{c}\right)\\ {h}_{t}&={o}_{t}\circ \mathrm{tanh}\left({c}_{t}\right)\end{aligned}$$
where \({i}_{t}\), \({f}_{t}\), \({o}_{t}\), \({c}_{t}\), and \({h}_{t}\) are the input gate, forget gate, output gate, cell state, and hidden state at time step \(t\), respectively. \({x}_{t}\) is the input at time step \(t\), \({h}_{t-1}\) is the hidden state at the previous time step, \(W\) and \(b\) are the weights and biases of the network, and \(\sigma\) and \(tanh\) are the sigmoid and hyperbolic tangent activation functions, respectively. The symbol \(*\) denotes the convolution operation, and \(\circ\) denotes the element-wise multiplication.
The input gate \({i}_{t}\) controls the amount of information from the input and the previous hidden state that is used to update the cell state \({c}_{t}\). The forget gate \({f}_{t}\) controls the amount of information from the previous cell state that is retained or discarded. The output gate \({o}_{t}\) controls the amount of information from the current cell state that is used to compute the hidden state \({h}_{t}\).
The convolutional LSTM uses the convolution operation to capture the spatial dependencies of the input, and the LSTM structure to capture the temporal dependencies. This combination makes it particularly effective for tasks such as video analysis, where the data has both spatial and temporal correlations.
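For illustration, one ConvLSTM time step for single-channel maps can be sketched as follows, assuming 3×3 kernels, zero ("same") padding, and scalar biases; these simplifications of the general gate formulation above, and the dictionary-based parameter layout, are our own:

```python
import numpy as np

def _conv_same(x, w):
    """'Same' 2D convolution of a single-channel map x with a 3x3 kernel w,
    using zero padding (helper for the gate computations below)."""
    xp = np.pad(x, 1)
    H, W = x.shape
    out = np.zeros_like(x)
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(xp[i:i+3, j:j+3] * w)
    return out

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def convlstm_step(x_t, h_prev, c_prev, W, b):
    """One ConvLSTM time step. W is a dict of 3x3 kernels
    W['xi'], W['hi'], ..., and b a dict of scalar biases."""
    i_t = sigmoid(_conv_same(x_t, W['xi']) + _conv_same(h_prev, W['hi']) + b['i'])
    f_t = sigmoid(_conv_same(x_t, W['xf']) + _conv_same(h_prev, W['hf']) + b['f'])
    o_t = sigmoid(_conv_same(x_t, W['xo']) + _conv_same(h_prev, W['ho']) + b['o'])
    g_t = np.tanh(_conv_same(x_t, W['xc']) + _conv_same(h_prev, W['hc']) + b['c'])
    c_t = f_t * c_prev + i_t * g_t       # element-wise gating of the cell state
    h_t = o_t * np.tanh(c_t)             # hidden state from the gated cell state
    return h_t, c_t
```

Iterating `convlstm_step` over the frames of a video propagates the cell state through time while the convolutions preserve spatial structure.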
In this study, we propose another hybrid deep learning model incorporating both ConvLSTM and CNN modalities. ConvLSTM is the 2D version of the LSTM algorithm. LSTM is designed to remember prior states and construct the present state. This modality is a double-edged sword, because the current state depends entirely on prior states; a degradation in one state will therefore negatively impact subsequent states. Deep learning methods must thus be treated with care and monitored during training to identify potential anomalies. This deep learning model consists of ten layers. A ConvLSTM layer is followed by a pooling layer. Then, three CNV layers are applied, followed by three pooling layers. The classification network is identical to that of the first deep learning model. Unlike the original CNN-based deep learning model, this model consists of a smaller number of layers and is intended to simplify the design of the deep learning model (Fig. 4).
3.2 Fuzzy active contour model (FACM)
The FACM is a segmentation method that uses fuzzy sets and active contours to delineate object boundaries in an image. The FACM combines the traditional active contour model with a fuzzy set representation of the image to provide more robust and accurate segmentation results.
The FACM energy function is defined as:
where \({E}_{int}\left(x\right)\) is the internal energy of the contour, \(\lambda\) is a weighting parameter that balances the influence of the internal energy and the fuzzy energy, \(\sigma\) is a parameter that controls the spatial extent of the fuzzy membership function, and \(d\left(x\right)\) is the distance between a point \(x\) and the contour \(C\). The fuzzy membership function is defined as:
where \(\mu \left(x\right)\) represents the degree of membership of a point \(x\) in the region inside the contour \(C\).
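The closed form of \(\mu(x)\) is not printed here, but a common sigmoidal choice based on the signed distance \(d(x)\) to the contour reproduces the behavior described: membership near 1 deep inside the contour, near 0 far outside, with \(\sigma\) controlling the spatial extent. The exact functional form and the sign convention (negative distance inside) are illustrative assumptions, not the paper's definition.

```python
import numpy as np

def fuzzy_membership(d, sigma=1.0):
    """Degree of membership in the region inside contour C.
    d is the signed distance to C (assumed negative inside, positive outside);
    sigma controls how quickly membership decays across the boundary."""
    return 1.0 / (1.0 + np.exp(d / sigma))

# membership falls smoothly from ~1 (inside) to ~0 (outside)
d = np.linspace(-5.0, 5.0, 11)
mu = fuzzy_membership(d)
```

Points exactly on the contour (d = 0) get membership 0.5, which is what makes the representation "fuzzy" rather than a hard inside/outside partition.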
The internal energy \({E}_{int}\left(F\right)\) of the contour is given by:
where \(F\left(x\right)\) is the level set function that represents the contour, \(\nabla F\left(x\right)\) and \({\nabla }^{2}F\left(x\right)\) are the gradient and Laplacian of \(F\left(x\right)\), and \(\alpha\) and \(\beta\) are weighting parameters that control the influence of the first and second-order derivatives of \(F\left(x\right)\), respectively.
The FACM approach involves minimizing the energy function \(E\left(F(x)\right)\) with respect to F(x). The optimization problem is solved using the level set method.
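The internal energy term defined above can be approximated on a discrete grid with finite differences. The sketch below penalizes the squared gradient and squared Laplacian of the level set function, as in the definition of \(E_{int}(F)\); the default values of \(\alpha\) and \(\beta\) are placeholders, not the weights used in the experiments.

```python
import numpy as np

def internal_energy(F, alpha=1.0, beta=0.5):
    """Discrete approximation of E_int(F) = sum of alpha*|grad F|^2 + beta*(lap F)^2."""
    gy, gx = np.gradient(F)                                     # first-order derivatives
    lap = np.gradient(gy, axis=0) + np.gradient(gx, axis=1)     # Laplacian of F
    return float(np.sum(alpha * (gx ** 2 + gy ** 2) + beta * lap ** 2))
```

A constant level set has zero internal energy, while a linear ramp is penalized only through its gradient term, so \(\alpha\) controls contour stretching and \(\beta\) controls bending, as stated above.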
Level Set Optimization
Level set optimization is a numerical method for solving optimization problems that involve constraints. The method involves representing the feasible region of the optimization problem as the zero-level set of a function, known as the level set function. The level set function is defined to be positive inside the feasible region and negative outside the feasible region.
The optimization problem can be formulated as finding the minimum or maximum of an objective function subject to the constraint that the level set function is zero, i.e., minimizing \(E\left(F\left(x\right)\right)\) subject to \(g\left(x\right)=0\).
where \(g\left(x\right)\) is the level set function. To solve this problem, the level set function is evolved using a partial differential equation known as the Hamilton–Jacobi equation (HJE), which takes the form:
where \(q\) represents the position of the system, \(\nabla g(x)\) is the gradient of the level set function with respect to the position, and \(H\) is the Hamiltonian, a function of the position and momentum of the system. The evolution of the level set function is guided by the gradient of the objective function, which moves the level set function towards the minimum of \(E\left(F(x)\right)\).
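A toy level set evolution makes the mechanics concrete: starting from the signed distance function of a circle, repeatedly adding a speed times \(|\nabla F|\) moves the zero level set inward. This simplified constant-speed Hamiltonian is an illustrative stand-in; the actual FACM evolution is driven by the fuzzy and internal energy terms.

```python
import numpy as np

# signed distance to a circle of radius 20 (negative inside, positive outside)
x, y = np.meshgrid(np.arange(-30, 31), np.arange(-30, 31))
F = np.hypot(x, y) - 20.0

def evolve(F, speed=1.0, dt=0.5, steps=10):
    """F_t = speed * |grad F| moves the zero level set inward,
    so the enclosed region (F < 0) shrinks at each step."""
    for _ in range(steps):
        gy, gx = np.gradient(F)
        F = F + dt * speed * np.hypot(gx, gy)
    return F

area_before = int(np.sum(F < 0))   # pixels inside the initial contour
F_new = evolve(F)
area_after = int(np.sum(F_new < 0))
```

Because the contour is carried implicitly by \(F\), topology changes (a region splitting into several bloodstain fragments, for instance) are handled without any explicit re-parameterization, which is the main practical advantage of the level set method.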
4 The experimental analysis
4.1 Samples of dataset
The proposed models are based on the Bloodstain Pattern Data Set (BSPDS) [53], a dataset including seven categories of bloodstain patterns. They are named "Plexiglas with fingers," "Plexiglas with a finger," "Plexiglas with a finger after 30 s", "Plexiglas with a finger after 90 s", "Plexiglas with a finger after 240 s", "Plexiglas with a paper towel" and "Plexiglas with a paper towel after 90 s", as shown in Fig. 5. In addition, a data augmentation technique (Convolutional Generative Adversarial Network (CGAN)) is applied to the data to increase its amount. Table 1 shows the numbers of frames in the training, validation, and test phases.
Sample of bloodstain dataset [53]
4.2 Evaluation metrics
To rank the quality of the proposed solutions, various metrics are used: Accuracy, Recall, Precision, and the F1-Score, which is computed from Precision and Recall. The measurements are defined by Eqs. (15) to (18).
where:
1) The False Negative (FN) value is the number of bloodstain images of a given category that are incorrectly labeled as belonging to another category.
2) The True Positive (TP) metric indicates the number of images of a category that are correctly identified as belonging to that category.
3) The True Negative (TN) value indicates the number of images correctly identified as not belonging to the category in question.
4) The False Positive (FP) value is the number of images mistakenly labeled as belonging to the category.
$$Accuracy=\frac{No.\;of\;correctly\;detected\;images}{Total\;No.\;of\;images}\times 100$$ (15)
$$Recall=TPR={T}_{P}/({T}_{P}+{F}_{N})=(1-FNR)$$ (16)
$$Precision={T}_{P}/({T}_{P}+{F}_{P})$$ (17)
$$F1=2\times \frac{Precision\times Recall}{Precision+Recall}$$ (18)
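The four metrics can be computed directly from the confusion counts defined above; the counts in the usage example are hypothetical, chosen only to exercise Eqs. (15)–(18).

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy (Eq. 15, as a percentage), Recall (16), Precision (17), F1 (18)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn) * 100   # correct / total
    recall = tp / (tp + fn)                            # true positive rate
    precision = tp / (tp + fp)
    f1 = 2 * (precision * recall) / (precision + recall)
    return accuracy, recall, precision, f1

# hypothetical confusion counts for one bloodstain category
acc, rec, prec, f1 = classification_metrics(tp=90, tn=95, fp=5, fn=10)
```

The F1-Score is the harmonic mean of Precision and Recall, so it only approaches 1 when both are simultaneously high, which is why it complements raw Accuracy on imbalanced categories.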
4.3 Hyperparameters Selection
The hyperparameters of the proposed models have been selected after several trials to achieve their optimal values. The proposed deep learning models are designed to have as many trainable parameters as possible. Table 2 illustrates the total, trainable, and untrainable parameters of each deep learning model. It can be observed that the proposed DCNN and shallow networks have fully trainable parameters without any untrainable ones. On the other hand, the ConvLSTM and CNN-ConvLSTM models have 486 and 902 untrainable parameters, respectively. Therefore, in terms of training, the DCNN and shallow networks outperform these deep learning models.
Moreover, the classification task is also carried out using several machine learning models. Their hyperparameters are optimized using the grid search technique. Table 3 illustrates the selected hyperparameters of these models.
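Grid search, as used to produce Table 3, exhaustively evaluates every combination of candidate hyperparameter values and keeps the best-scoring one. The sketch below is generic; the toy objective and the `C`/`gamma` grid are placeholders, not the actual hyperparameter ranges from the paper.

```python
from itertools import product

def grid_search(score_fn, grid):
    """Exhaustively score every parameter combination and keep the best."""
    best_score, best_params = float("-inf"), None
    for combo in product(*grid.values()):
        params = dict(zip(grid.keys(), combo))
        score = score_fn(**params)
        if score > best_score:
            best_score, best_params = score, params
    return best_score, best_params

# toy objective with a known optimum at C=1.0, gamma=0.1
score, params = grid_search(
    lambda C, gamma: -((C - 1.0) ** 2 + (gamma - 0.1) ** 2),
    {"C": [0.1, 1.0, 10.0], "gamma": [0.01, 0.1, 1.0]},
)
```

The cost grows multiplicatively with the number of values per hyperparameter, which is why the search here is run over a small number of candidate values per model.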
5 Results and discussions
5.1 Result of image classification approach
This paper proposes an automatic system for bloodstain pattern detection in crime scenes. The objective is to determine the age of the bloodstain, which is important for determining the time of the crime, as well as to detect the bloodstain in a given captured frame. To achieve this objective, the proposed system comprises both image classification and image segmentation approaches. The classification is performed using several deep learning and machine learning algorithms. The simulations have been performed on a local machine with the following specifications: Intel Core i9 CPU, 128 GB RAM, and an NVIDIA GPU with 32 GB of memory.
The deep learning methodology includes pixel-wise and depth-wise analysis. The pixel-wise method consists of shallow and deep CNNs, while the depth-wise method contains a structure of convolutional neural networks. Figure 6a shows the learning curve of the proposed depth-wise CNN, which represents the detection accuracy during training. It can be observed that the accuracy value is stable around 100% because of its complex analysis based on the channels and the depth of the input image. Figure 6b shows the performance of the proposed shallow network, which is a 1-layer network. Fluctuation can be observed during training until epoch three; the performance then becomes stable until the end of the process. This behavior occurs due to the gradual learning from the input images. On the other hand, the 2-layer shallow network learns smoothly during the training process. This smooth performance is due to the additional convolutional layer, which adds value to learning and feature extraction from the input images.
Furthermore, this paper proposes other deep learning methods based on ConvLSTM and CNN-ConvLSTM models. Figure 7 shows the learning curves of these models. It can be observed that the proposed ConvLSTM model achieved optimal performance after seven epochs, while the proposed CNN-ConvLSTM model achieved optimal performance at epoch two. This difference in learning capability occurs due to the addition of the CNN, which aids the feature extraction from the input images.
Moreover, this paper employs machine learning algorithms for bloodstain image classification, including DT, SVM, RF, LR, QDA, MLP, and KNN. Figure 8 shows the learning curves and performance of the proposed machine learning models. It can be observed that these models achieve optimal performance on bloodstain image classification. The experimental results of the proposed methods, including recall, precision, F1 score, and accuracy, are illustrated in Table 4 and demonstrate their effectiveness in efficiently classifying bloodstains. However, the processing time of these models is a crucial factor to consider. Notably, the shallow CNN model with one layer outperforms the other deep learning models in terms of real-time applicability. Furthermore, the simulation results are validated by hold-out validation and k-fold cross-validation with k equal to 10, and the average accuracy is calculated to prove the reliability of the system. Moreover, machine learning models such as DT, MLP, and LR demonstrate efficient processing times within milliseconds, averaging 21 ms. These results suggest that the proposed machine learning models can be considered efficient real-time solutions for bloodstain pattern recognition.
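The k-fold validation used above (k = 10) can be outlined without external libraries: the sample indices are partitioned into k folds, and each fold serves once as the test set while the others form the training set. The round-robin fold construction below is a generic sketch, not the exact split used in the experiments.

```python
def kfold_splits(n, k=10):
    """Partition indices 0..n-1 into k folds; yield (train, test) index lists."""
    folds = [list(range(i, n, k)) for i in range(k)]   # round-robin assignment
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

splits = list(kfold_splits(100, k=10))
```

Averaging the accuracy over the k test folds, as done for Table 4, reduces the variance of the estimate compared with a single hold-out split.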
5.2 Results of image segmentation
This section discusses the image segmentation approach based on the FACM technique. The proposed method is applied to one sample image from each category. The segmented images are evaluated against ground-truth references. The evaluation metrics include accuracy, precision, recall, F1-score, and MCC, as illustrated in Table 5. The segmented images are obtained after 40 iterations, as shown in Fig. 9. The simulation results reveal that the proposed method achieves high segmentation performance, with an average accuracy of 99%. Therefore, it can be considered an efficient solution for bloodstain pattern segmentation.
This work proposes an AI-based solution for bloodstain pattern recognition. The proposed system comprises two approaches. The first approach classifies the input images into seven categories describing the captured bloodstain image. In this approach, we deployed several deep learning and machine learning models. As a real-time application, the objective is to achieve an accurate and fast method. Figure 10 shows a comparison of testing times among the proposed models. It can be observed that the proposed machine learning models have lower testing times than the deep learning models. In addition, the optimal model among the deep learning models is the shallow network with a 1-layer architecture, while the optimal models among the proposed machine learning models are the MLP and LR models. Therefore, the proposed solutions can be deployed in real-time applications.
6 Conclusion and future perspectives
The study of bloodstain pattern recognition is a significant focus in forensic analysis. In this paper, we propose an automatic bloodstain pattern recognition system based on deep learning and machine learning modalities for image classification and segmentation tasks. The classification method incorporates machine learning techniques such as SVM, KNN, RF, MLP, LR, and DT. Additionally, we have utilized deep learning models including CNN, ConvLSTM, DCNN, and CNN-ConvLSTM. Furthermore, we have proposed an image segmentation method based on FACM. The proposed system is evaluated in terms of performance and processing time across the different models. It has demonstrated high performance in image recognition and segmentation, surpassing previous works in the literature. Therefore, these methods can be considered efficient solutions for bloodstain pattern recognition and its application in crime scene investigation. In the next phase, we plan to deploy this model in real time using a set of sensors and micro-controllers to create a prototype. Our ultimate aim is to achieve the highest time and cost efficiency.
Data availability
Data is available on demand.
References
Rezey ML, Lauritsen JL (2023) Crime reporting in Chicago: A comparison of police and victim survey data, 1999–2018. J Res Crime Delinq 60:664–699
Siegel M, Ross CS, King C III (2013) The relationship between gun ownership and firearm homicide rates in the United States, 1981–2010. Am J Public Health 103:2098–2105
Radianti J, Majchrzak TA, Fromm J, Wohlgenannt I (2020) A systematic review of immersive virtual reality applications for higher education: Design elements, lessons learned, and research agenda. Comput Educ 147:103778
Lorenzo G, Gilabert A, Lledó A, Lorenzo-Lledó A (2023) Analysis of trends in the application of augmented reality in students with ASD: Intellectual, social and conceptual structure of scientific production through WOS and scopus. Technol Knowl Learn 1–22
Aukstakalnis S (2016) Practical augmented reality: A guide to the technologies, applications, and human factors for AR and VR. Addison-Wesley Professional
Brown G, Prilla M (2019) Evaluating pointing modes and frames of reference for remotely supporting an augmented reality user in a collaborative (virtual) environment: evaluation within the scope of a remote consultation session. Proc Mensch und Comput 2019:713–717
Cirulis A (2019) Ultra wideband tracking potential for augmented reality environments. In Augmented Reality, Virtual Reality, and Computer Graphics: 6th International Conference, AVR 2019, Santa Maria al Bagno, Italy. Proceedings, Part II 6. Springer International Publishing, pp 126–136
Wang Y, Ahsan U, Li H, Hagen M (2023) A comprehensive review of modern object segmentation approaches. Found Trends Comput Graph Vision 13(2–3):111–283. https://doi.org/10.1561/0600000097
Csurka G, Volpi R, Chidlovskii B (2022) Semantic image segmentation: Two decades of research. Found Trends® Comput Graph Vision 14(1–2):1–162
Varkarakis V, Bazrafkan S, Corcoran P (2018) A deep learning approach to segmentation of distorted iris regions in head-mounted displays. In 2018 IEEE Games, Entertainment, Media Conference (GEM). IEEE, pp 1–9
Huang J, Cheng Y, Zhang Y (2020) Image segmentation evaluation: a survey of methods. Artif Intell Rev 53(4):2551–2566. https://doi.org/10.1007/s10462-020-09830-9
Paoletti ME, Haut JM, Plaza J, Plaza A (2019) Deep learning classifiers for hyperspectral imaging: A review. ISPRS J Photogramm Remote Sens 158:279–317
Poelman R, Akman O, Lukosch S, Jonker P (2012) As if being there: mediated reality for crime scene investigation. In Proceedings of the ACM 2012 conference on computer supported cooperative work, pp 1267–1276
Paoletti ME, Haut JM, Plaza J, Plaza A (2019) Deep learning classifiers for hyperspectral imaging: A review. ISPRS J Photogramm Remote Sens 158:279–317
Audebert N, Le Saux B, Lefèvre S (2019) Deep learning for classification of hyperspectral data: A comparative review. IEEE Geosci Remote Sens Mag 7:159–173
Li S, Song W, Fang L et al (2019) Deep learning for hyperspectral image classification: An overview. IEEE Trans Geosci Remote Sens 57:6690–6709
Romaszewski M, Głomb P, Cholewa M (2016) Semi-supervised hyperspectral classification from a small number of training samples using a co-training approach. ISPRS J Photogramm Remote Sens 121:60–76
Pan B, Shi Z, Xu X (2018) MugNet: Deep learning for hyperspectral image classification using limited samples. ISPRS J Photogramm Remote Sens 145:108–119
Cao F, Guo W (2020) Deep hybrid dilated residual networks for hyperspectral image classification. Neurocomputing 384:170–181
Okwuashi O, Ndehedehe CE (2020) Deep support vector machine for hyperspectral image classification. Pattern Recognit 103:107298
Cao X, Ge Y, Li R et al (2019) Hyperspectral imagery classification with deep metric learning. Neurocomputing 356:217–227
Teo TA, Lee G, Billinghurst M, Adcock M (2019) 360Drops: Mixed reality remote collaboration using 360 panoramas within the 3D scene. In SIGGRAPH Asia 2019 Emerging Technologies, pp 1–2
Prilla M, Rühmann LM (2018) An analysis tool for cooperative mixed reality scenarios. In 2018 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct). IEEE, pp 31–35
Spain R, Goldberg B, Hansberger J, Griffith T, Flynn J, Garneau C, ... & Finseth T (2018) Me and my VE, part 5: Applications in human factors research and practice. In Proceedings of the Human Factors and Ergonomics Society Annual Meeting, vol 62, No 1. Sage CA: SAGE Publications
Rice R (2018) Augmented reality tools for enhanced forensics simulations and crime scene analysis. In Working Through Synthetic Worlds. CRC Press, pp 201–213
Sandvik K, Waade AM (2008) Crime scene as augmented reality: On Screen, online and offline. Work Pap from Crime Fict Crime J Scand 5:1–17
Rühmann LM, Prilla M, Brown G (2018) Cooperative mixed reality: an analysis tool. In Proceedings of the 2018 ACM International Conference on Supporting Group Work, pp 107–111
Bostanci E (2015) 3D reconstruction of crime scenes and design considerations for an interactive investigation tool. Int J Inf Secur Sci 4:50–58
Streefkerk JW, Houben M, van Amerongen P, ter Haar F, Dijk J (2013) The art of csi: An augmented reality tool (art) to annotate crime scenes in forensic investigation. In Virtual, Augmented and Mixed Reality. Systems and Applications: 5th International Conference, VAMR 2013, Held as Part of HCI International 2013, Las Vegas, NV, USA. Proceedings, Part II 5. Springer Berlin Heidelberg, pp 330–339
Buck U, Naether S, Räss B et al (2013) Accident or homicide–virtual crime scene reconstruction using 3D methods. Forensic Sci Int 225:75–84
Lehr P, Lehr P (2019) Surveillance and observation: the all-seeing eyes of big brother. Counter-Terrorism Technologies: A Critical Assessment 115–129
Bremmer RH, De Bruin KG, Van Gemert MJC et al (2012) Forensic quest for age determination of bloodstains. Forensic Sci Int 216:1–11
Boyd S, Bertino MF, Seashols SJ (2011) Raman spectroscopy of blood samples for forensic applications. Forensic Sci Int 208:124–128
Liu Y, Attinger D, De Brabanter K (2020) Automatic classification of bloodstain patterns caused by gunshot and blunt impact at various distances. J Forensic Sci 65:729–743
Acampora G, Vitiello A, Di Nunzio C, Saliva M, Garofano L (2015) Towards automatic bloodstain pattern analysis through cognitive robots. In 2015 IEEE International Conference on Systems, Man, and Cybernetics. IEEE, pp 2447–2452
Gee AP, Escamilla-Ambrosio PJ, Webb M, Mayol-Cuevas W, Calway A (2010) Augmented crime scenes: virtual annotation of physical environments for forensic investigation. In Proceedings of the 2nd ACM Workshop on Multimedia in Forensics, Security and Intelligence, pp 105–110
Bahamón JC, Cassell BA, Young RM, Cardona-Rivera RE, Thomas JM, Hinks D (2011) Toward collaborative, web-based 3d environments for the investigation, analysis, annotation and display of virtual crime scenes
Shen AR, Brostow GJ, Cipolla R (2006) Toward automatic blood spatter analysis in crime scenes
Boonkhong K, Karnjanadecha M, Aiyarak P (2010) Impact angle analysis of bloodstains using a simple image processing technique. Sonklanakarin J Sci Technol 32:169
Joris P, Develter W, Jenar E et al (2015) HemoVision: An automated and virtual approach to bloodstain pattern analysis. Forensic Sci Int 251:116–123
Chauhan AS, Silakari S, Dixit M (2014) Image segmentation methods: A survey approach. In 2014 Fourth International Conference on Communication Systems and Network Technologies. IEEE, pp 929–933
Wang YW, Huang DY, Hu WC, Ho CY (2011) Bloodstain segmentation in color images. In 2011 First International Conference on Robot, Vision and Signal Processing. IEEE, pp 52–55
Arthur RM, Humburg PJ, Hoogenboom J et al (2017) An image-processing methodology for extracting bloodstain pattern features. Forensic Sci Int 277:122–132
Kulwa F, Li C, Zhao X et al (2019) A state-of-the-art survey for microorganism image segmentation methods and future potential. IEEE Access 7:100243–100269
El‐Hag NA, Sedik A, El‐Banby GM, El‐Shafai W, Khalaf AA, Al‐Nuaimy W, ... El‐Hoseny HM (2021) Utilization of image interpolation and fusion in brain tumor segmentation. Int J Numer Method Biomed Eng 37(8):e3449
Sedik A, Iliyasu AM, Abd El-Rahiem B, Abdel Samea ME, Abdel-Raheem A, Hammad M, Peng J, Abd El-Samie FE, Abd El-Latif AA (2020) Deploying machine and deep learning models for efficient data-augmented detection of COVID-19 Infections. Viruses 12:769
Elaskily MA, Elnemr HA, Sedik A et al (2020) A novel deep learning framework for copy-moveforgery detection in images. Multimed Tools Appl 79:19167–19192. https://doi.org/10.1007/s11042-020-08751-7
Abd El-Rahiem B, Sedik A, El Banby GM et al (2020) An efficient deep learning model for classification of thermal face images. J Enterp Inf Manag. https://doi.org/10.1108/JEIM-07-2019-0201
El-Moneim SA, Sedik A, Nassar MA et al (2021) Text-dependent and text-independent speaker recognition of reverberant speech based on CNN. Int J Speech Technol 24:993–1006. https://doi.org/10.1007/s10772-021-09805-3
Shafik A, Sedik A, Abd El-Rahiem B, El-Rabaie ESM, El Banby GM, Abd El-Samie FE, Iliyasu AM (2021) Speaker identification based on Radon transform and CNNs in the presence of different types of interference for Robotic Applications. Appl Acoust 177:107665
Sedik A, Marey M, Mostafa H (2023) WFT-Fati-Dec: enhanced fatigue detection AI system based on wavelet denoising and fourier transform. Appl Sci 13:2785
Siam AI, Soliman NF, Algarni AD, Abd El-Samie FE, Sedik A (2022) Deploying machine learning techniques for human emotion detection. Comput Intel Neurosc 2022(1):8032673
SeragEldin SM, Sedik A, Alshamrani SS, Ayoup AM (2023) Cancellable multi-biometric feature veins template generation based on SHA-3 hashing. Comput Mater Contin 74:733–749. https://doi.org/10.32604/cmc.2023.030789
Sedik A, El-Latif AAA, El-Affendi M, Mostafa H (2023) A cancelable biometric system based on deep style transfer and symmetry check for double-phase user authentication. Symmetry (Basel) 15:1426
Sedik A, Albeedan M, Kolivan K, Abd El-Latif AA (2022) Blood stain pattern data set (BSPDS). Mendeley Data V1. https://doi.org/10.17632/234yvmkcys.1
Acknowledgements
The authors would like to acknowledge Prince Sultan University and Smart Systems Engineering lab for their valuable support. This work was supported by the research grant [grant number SEED-2023-CE-142]; Prince Sultan University; Saudi Arabia.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflicts of interest
The authors declare no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Sedik, A., Kolivand, H. & Albeedan, M. An efficient image classification and segmentation method for crime investigation applications. Multimed Tools Appl (2024). https://doi.org/10.1007/s11042-024-19773-w