
1 Introduction

Baggage inspection using X-ray screening is a priority task that reduces the risk of crime, terrorist attacks and propagation of pests and diseases [1]. Security and safety screening with X-ray scanners has become an important process in public spaces and at border checkpoints [2]. However, inspection is a complex task because threat items are very difficult to detect when placed in closely packed bags, occluded by other objects, or rotated, thus presenting an unrecognizable view [3]. Manual detection of threat items by human inspectors is extremely demanding [4]. It is tedious because very few bags actually contain threat items, and it is stressful because the work of identifying a wide range of objects, shapes and substances (metals, organic and inorganic substances) takes a great deal of concentration. In addition, human inspectors receive only minimal technological support. Furthermore, during rush hours, they have only a few seconds to decide whether or not a bag contains a threat item [5]. Since each operator must screen many bags, the likelihood of human error becomes considerable over a long period of time even with intensive training. The literature suggests that detection performance is only about 80–90 % [6]. In baggage inspection, automated X-ray testing remains an open question due to: (i) loss of generality, which means that approaches developed for one task may not transfer well to another; (ii) deficient detection accuracy, which means that there is a fundamental tradeoff between false alarms and missed detections; (iii) limited robustness given that requirements for the use of a method are often met for simple structures only; and (iv) low adaptiveness in that it may be very difficult to accommodate an automated system to design modifications or different specimens.

There have been several contributions in computer vision for X-ray testing, including applications in the inspection of castings, welds, food, cargo and baggage screening [7]. For this work, it is worth reviewing the advances in baggage screening that have taken place over the course of this decade. They can be summarized as follows: Some approaches attempt to recognize objects using a single view of mono-energy X-ray images (e.g., the adapted implicit shape model based on visual codebooks [8]) or dual-energy X-ray images (e.g., Gabor texture features [9], bag of words based on SURF features [10] and pseudo-color, texture, edge and shape features [11]). More complex approaches that deal with multiple X-ray images have been developed as well. In the case of mono-energy imaging, see for example the recognition of regular objects using data association in [12] and active vision [13], where a second-best view is estimated. In the case of dual-energy imaging, see the use of visual vocabularies and SVM classifiers in [14]. Progress has also been made in the area of computed tomography. For example, in order to improve the quality of CT images, metal artifact reduction and de-noising techniques [15] were suggested. Many methods based on 3D features for 3D object recognition have been developed (see, for example, RIFT and SIFT descriptors [16], 3D Visual Cortex Modeling, 3D Zernike descriptors and the histogram of shape index [17]). There are also contributions using well-known recognition techniques (see, for example, bag of words [18] and random forest [19]). As we can see, progress in automated baggage inspection is modest and still very limited compared to what is needed, since X-ray screening systems are still operated by human inspectors. Automated recognition in baggage inspection is far from being perfected, given that recognizing the object of interest can become extremely difficult due to problems of (self-)occlusion, noise, acquisition artifacts, clutter, etc.

We believe that algorithms based on sparse representations can be used for this general task because, in many computer vision applications, state-of-the-art results have been significantly improved under the assumption that natural images admit a sparse decomposition [20]. Thus, it is possible to cast the recognition problem into a supervised form with X-ray images and class labels (e.g., objects to be recognized), using features learned in an unsupervised way. In the sparse representation approach, a dictionary is built from the training X-ray images, and matching is done by reconstructing the query image using a sparse linear combination of the dictionary atoms. Usually, the query image is assigned to the class with the minimal reconstruction error.

Reflecting on the problems confronting object recognition, we believe there are some key ideas that should be present in newly proposed solutions. First, certain parts of the objects provide no information about the class to be recognized (for example, occluded parts). Such parts should be detected and excluded from the recognition algorithm. Second, in recognizing any class, some parts of the object are more relevant than others (for example, the sharp parts when recognizing sharp objects like knives). Relevant parts are therefore class-dependent and can be found using unsupervised learning. Third, in a real-world environment, where X-ray images are not perfectly aligned and the distance between detector and objects can vary from capture to capture, analysis of fixed parts can lead to misclassification. Feature extraction should therefore not be restricted to fixed positions; it can instead be performed at several random positions, combined with a selection criterion that keeps only the best regions. Fourth, an object present in a query image can be subdivided into ‘sub-objects’ corresponding to its different parts (e.g., in the case of a handgun: trigger, muzzle, grip, etc.). When searching for images of the same class, it is therefore helpful to search for image parts across all training images instead of for whole similar training images.

Inspired by these key ideas, we propose a method for the recognition of objects using X-ray images. The three main contributions of our approach are: (1) A new general algorithm that is able to recognize regular objects: it has been evaluated in the recognition of four different objects. (2) A new representation of the classes to be recognized using random patches: this is based on representative dictionaries learned for each class of the training images, which correspond to a rich collection of representations of selected relevant parts that are particular to a specific class. (3) A new representation of the query X-ray image: this is based on (i) a discriminative criterion that selects the ‘best’ test patches extracted randomly from the query image and (ii) an ‘adaptive’ sparse representation of the selected patches computed from the ‘best’ representative dictionary of each class. Using these new representations, the proposed method (XASR+) can achieve high recognition performance under many complex conditions, as shown in our experiments.

2 Proposed Method

The proposed XASR+ method consists of two stages: learning and testing (see Fig. 1). In the learning stage, several random patches are extracted from the training images of each object class and described in order to build representative dictionaries. In the testing stage, random test patches of the query image are extracted and described, and for each test patch a dictionary is built by concatenating the ‘best’ representative dictionary of each object. Using this adapted dictionary, each test patch is classified in accordance with the Sparse Representation Classification (SRC) methodology [22]. Afterwards, patches are selected according to a discriminative criterion. Finally, the query image is classified by a majority vote over the selected patches. Both stages are explained in further detail in this section.

Fig. 1.

Overview of the proposed method. The figure illustrates the recognition of three different classes: clips, razor blades and springs. There are two stages: learning and testing. The stop-list is used to filter out patches that are not discriminative for these classes. The stopped patches are considered neither in the dictionaries of each class nor in the testing stage.

2.1 Model Learning

In the training stage, a set of n images for each of k objects is available, where \(\mathbf{I}_j^i\) denotes X-ray image j of object i (for \(i=1 \dots k\) and \(j=1 \dots n\)) as illustrated in Fig. 2. In each image \(\mathbf{I}_j^i\), m patches \({\mathcal P}_{jp}^i\) of size \(w \times w\) pixels (for \(p=1 \dots m\)) are randomly extracted. They are centered at \((x_{jp}^i,y_{jp}^i)\). In this work, a patch \({\mathcal P}\) is described by the vector:

$$\begin{aligned} \mathbf{p} = [ \ \mathbf{z} \ ; \ \alpha r ] \in \mathcal {R}^{d+1} \end{aligned}$$
(1)

where \(\mathbf{z} = g({\mathcal P}) \in \mathcal {R}^d\) is a descriptor of patch \({\mathcal P}\) (i.e., a local descriptor of d elements extracted from the patch); r is the distance from the center of the patch \((x_{jp}^i,y_{jp}^i)\) to the center of the image; and \(\alpha \) is a weighting factor between descriptor and location. The descriptor \(\mathbf{z}\) must be rotation invariant because the object can appear in any orientation. Patch \({\mathcal P}\) is described using a vector that has been normalized to unit length:

$$\begin{aligned} \mathbf{y} = f(\mathcal{P}) = \frac{\mathbf{p}}{|| \mathbf{p} ||} \in \mathcal{R}^{d+1} \end{aligned}$$
(2)
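As a concrete illustration, the following minimal sketch (not from the paper) computes the description of Eqs. (1)-(2); the local descriptor function g is a placeholder standing in for, e.g., the LBP descriptor used later, and the coordinate conventions are our assumptions.

```python
# Minimal sketch of Eqs. (1)-(2), assuming a descriptor function `g`
# (a placeholder) and the paper's weighting factor alpha.
import numpy as np

def describe_patch(patch, patch_center, image_center, g, alpha=1.0):
    z = g(patch)                                   # local descriptor z, in R^d
    r = np.linalg.norm(np.subtract(patch_center, image_center))  # distance to image center
    p = np.concatenate([z, [alpha * r]])           # p = [z ; alpha*r], Eq. (1)
    return p / np.linalg.norm(p)                   # y = p/||p||, Eq. (2)
```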
Fig. 2.

Extraction and description of m patches of training image j of object i.

In order to eliminate non-discriminative patches, a stop-list is computed from a visual vocabulary. The visual vocabulary is built using all descriptors \(\mathbf{Z} = \{ \mathbf{z}_{jp}^i \} \in \mathcal {R}^{d \times knm}\), for \(i=1 \dots k\), \(j=1 \dots n\) and \(p=1 \dots m\). Array \(\mathbf{Z}\) is clustered using a k-means algorithm into \(N_v\) clusters. Thus, a visual vocabulary containing \(N_v\) visual words is obtained. In order to construct the stop-list, the term frequency \(t_f\) is computed: \(t_f(d,v)\) is defined as the number of occurrences of word v in document d, for \(d = 1 \dots K\), \(v=1 \dots N_v\). In our case, a document corresponds to an X-ray image, and \(K=kn\) is the number of images in the training dataset. Afterwards, the document frequency \(d_f\) is computed: \(d_f(v) = \sum _d \{ t_f(d,v)>0 \}\), i.e., the number of images in the training dataset that contain word v, for \(v=1 \dots N_v\). The stop-list is built using the words with the highest and lowest \(d_f\) values: on one hand, visual words with the highest \(d_f\) values are not discriminative because they occur in almost all images; on the other hand, visual words with the lowest \(d_f\) values are so unusual that in most cases they correspond to noise. Usually, the top 5 % and bottom 10 % are stopped [23]. Those patches of \(\mathbf{Z}\) that belong to the stopped clusters are not considered in the following steps of our algorithm.
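A minimal sketch of this stop-list construction follows; the vocabulary size \(N_v\) and the use of scikit-learn's KMeans are our assumptions, while the 5 %/10 % stop fractions are those quoted above.

```python
# Sketch of the stop-list: cluster all descriptors into a visual
# vocabulary, compute document frequencies, and stop the most common
# and the rarest words. `Z` is (n_patches x d); `doc_id` is an integer
# array giving the index of the training image each patch comes from.
import numpy as np
from sklearn.cluster import KMeans

def build_stop_list(Z, doc_id, Nv=500, top=0.05, bottom=0.10):
    words = KMeans(n_clusters=Nv, n_init=10).fit_predict(Z)
    df = np.array([len(np.unique(doc_id[words == v])) for v in range(Nv)])
    order = np.argsort(df)                        # ascending document frequency
    stopped = set(order[:int(bottom * Nv)])       # too rare: mostly noise
    stopped |= set(order[Nv - int(top * Nv):])    # too common: not discriminative
    keep = ~np.isin(words, list(stopped))         # mask of surviving patches
    return keep, stopped
```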

Using (2), all extracted patches are described as \(\mathbf{y}_{jp}^i = f({\mathcal P}_{jp}^i)\). Thus, for object i, an array with the descriptions of all patches is defined as \(\mathbf{Y}^i = \{ \mathbf{y}_{jp}^i \} \in \mathcal {R}^{(d+1) \times nm}\) (for \(j=1 \dots n\) and \(p=1 \dots m\)).

Fig. 3.

Representative dictionaries of object i for \(Q=32\) (only \(q=1 \dots 7\) are shown) and \(R=20\). The left column shows the centroids \(\mathbf{c}^i_q\) of the parent clusters. The right columns (orange rectangle, called \(\mathbf{D}^i\)) show the centroids \(\mathbf{c}^i_{qr}\) of the child clusters. \(\mathbf{\bar{A}}^i_q\) is row q of \(\mathbf{D}^i\), i.e., the centroids of the child clusters of parent cluster q (Color figure online).

The description \(\mathbf{Y}^i\) of object i is clustered using the k-means algorithm into Q clusters that will be referred to as parent clusters:

$$\begin{aligned} \mathbf{c}_q^i = \text {kmeans} (\mathbf{Y}^i, Q) \end{aligned}$$
(3)

for \(q = 1 \dots Q\), where \(\mathbf{c}_q^i \in \mathcal {R}^{(d+1)}\) is the centroid of parent cluster q of object i. We define \(\mathbf{Y}_q^i\) as the array with all samples \(\mathbf{y}_{jp}^i\) that belong to the parent cluster with centroid \(\mathbf{c}_q^i\). In order to select a reduced number of samples, each parent cluster is clustered again into R child clusters:

$$\begin{aligned} \mathbf{c}_{qr}^i = \text {kmeans} (\mathbf{Y}_q^i, R) \end{aligned}$$
(4)

for \(r = 1 \dots R\), where \(\mathbf{c}_{qr}^i \in \mathcal {R}^{(d+1)}\) is the centroid of child cluster r of parent cluster q of object i. All centroids of child clusters of object i are arranged in an array \(\mathbf{D}^i\), and specifically for parent cluster q are arranged in a matrix:

$$\begin{aligned} \mathbf{\bar{A}}_q^i = [ \ \mathbf{c}_{q1}^i \ \dots \ \mathbf{c}_{qr}^i \ \dots \ \mathbf{c}_{qR}^i \ ] \in \mathcal {R}^{(d+1) \times R} \end{aligned}$$
(5)

Thus, this array contains R representative samples of parent cluster q of object i, as illustrated in Fig. 3. The set of all centroids of child clusters of object i, \(\mathbf{D}^i\), represents Q representative dictionaries with R descriptions \(\{ \mathbf{c}^i_{qr} \}\) each, for \(q=1 \dots Q, r = 1 \dots R\).
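The two-level clustering of Eqs. (3)-(5) could be implemented as in the following sketch; scikit-learn's KMeans and the guard for small parent clusters are our assumptions.

```python
# Sketch of Eqs. (3)-(5): Q parent clusters, each refined into R child
# clusters; the child centroids of parent q form the matrix Abar_q^i.
# `Yi` is (n*m x (d+1)), one row per patch description of object i.
import numpy as np
from sklearn.cluster import KMeans

def learn_dictionary(Yi, Q=32, R=20):
    labels = KMeans(n_clusters=Q, n_init=10).fit_predict(Yi)  # parents, Eq. (3)
    D = []                                    # D^i: one centroid matrix per parent
    for q in range(Q):
        Yq = Yi[labels == q]
        r = min(R, len(Yq))                   # guard against small clusters
        children = KMeans(n_clusters=r, n_init=10).fit(Yq)    # Eq. (4)
        D.append(children.cluster_centers_)   # Abar_q^i, stored row-wise, Eq. (5)
    return D
```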

Fig. 4.

Adaptive dictionary \(\mathbf{A}\) of patch \(\mathbf{y}\). In this example there are \(k=4\) objects in the training dataset. For this patch, only \(k'=3\) objects are selected. Dictionary \(\mathbf{A}\) is built from those objects by selecting, for each object, all child clusters of the parent cluster (see blue rectangles) that contains the child cluster with the smallest distance to the patch (see green squares). In this example, object 2 has no child clusters that are similar enough, i.e., \(h^2(\mathbf{y},{\hat{q}}^2)>\theta \) (Color figure online).

2.2 Testing

In the testing stage, the task is to determine the identity of the query image \(\mathbf{I}^t\) given the model learned in the previous section. From the test image, s selected test patches \({\mathcal P}_{p}^t\) of size \(w \times w\) pixels are extracted and described using (2) as \(\mathbf{y}_{p}^t = f({\mathcal P}_{p}^t)\) (for \(p=1 \dots s\)). The selection criterion of a test patch will be explained later in this section.

For each selected test patch with description \(\mathbf{y} = \mathbf{y}_{p}^t\), a distance to each parent cluster q of each object i of the training dataset is measured:

$$\begin{aligned} h^i(\mathbf{y},q) = \text {distance} (\mathbf{y} , \mathbf{\bar{A}}_q^i ). \end{aligned}$$
(6)

We tested several distance metrics. The best performance, however, was obtained by:

$$\begin{aligned} h^i(\mathbf{y},q) = \text {min}_r ||\mathbf{y} - \mathbf{c}_{qr}^i|| \ \ \text {for } r=1 \dots R, \end{aligned}$$
(7)

which is the smallest Euclidean distance to centroids of child clusters of parent cluster q as illustrated in Fig. 4. For \(\mathbf{y}\) and \(\mathbf{c}_{qr}^i\) normalized to unit \(\ell _2\) norm, the following distance can be used based on (7):

$$\begin{aligned} h^i(\mathbf{y},q) = \text {min}_r ( 1 - \langle \mathbf{y} , \mathbf{c}_{qr}^i \rangle ) \ \ \text {for } r=1 \dots R, \end{aligned}$$
(8)

where \(\langle \bullet \rangle \) denotes the scalar product, which provides a similarity (cosine of the angle) between vectors \(\mathbf{y}\) and \(\mathbf{c}_{qr}^i\). The parent cluster with the minimal distance is sought:

$$\begin{aligned} {\hat{q}}^i = \mathop {\text {argmin}}\limits _{q} \ h^i(\mathbf{y},q), \end{aligned}$$
(9)

whose minimal distance is \(h^i(\mathbf{y},{{\hat{q}}^i})\).
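A sketch of Eqs. (7) and (9) under the data layout returned by the dictionary-learning sketch above (each \(\mathbf{\bar{A}}_q^i\) stored row-wise):

```python
# Distance of a test patch y to each parent cluster (Eq. (7)) and the
# best parent cluster (Eq. (9)); `D` is the per-object list of child
# centroid matrices from the learning sketch.
import numpy as np

def best_parent(y, D):
    h = [np.min(np.linalg.norm(A_q - y, axis=1)) for A_q in D]  # h^i(y,q), Eq. (7)
    q_hat = int(np.argmin(h))                                   # Eq. (9)
    return q_hat, h[q_hat]
```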

For patch \(\mathbf{y}\), we select those training objects whose minimal distance is less than a threshold \(\theta \) in order to ensure similarity between the test patch and representative object patches. If \(k'\) objects fulfill the condition \(h^i(\mathbf{y},{{\hat{q}}^i}) < \theta \) for \(i=1 \dots k\), with \(k' \le k\), we can build a new index \(v_{i'}\) that indicates the index of the \(i'\)-th selected object, for \(i'=1 \dots k'\). For instance, in a training dataset with \(k=4\) objects, if \(k'=3\) objects are selected (e.g., objects 1, 3 and 4), then the indices are \(v_1=1\), \(v_2=3\) and \(v_3=4\), as illustrated in Fig. 4. The selected object \(i'\) for patch \(\mathbf{y}\) has its dictionary \(\mathbf{D}^{v_{i'}}\), and the corresponding parent cluster is \(u_{i'} = {{\hat{q}}^{v_{i'}}}\), whose child clusters are stored in row \(u_{i'}\) of \(\mathbf{D}^{v_{i'}}\), i.e., in \(\mathbf{A}^{i'} := \mathbf{\bar{A}}_{u_{i'}}^{v_{i'}}\).

Therefore, a dictionary for patch \(\mathbf{y}\) is built using the best representative patches as follows (see Fig. 4):

$$\begin{aligned} \mathbf{A}(\mathbf{y}) = [ \ \mathbf{A}^1 \dots \mathbf{A}^{i'} \dots \mathbf{A}^{k'} \ ] \in \mathcal {R}^{(d+1) \times Rk'} \end{aligned}$$
(10)
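The adaptive dictionary of Eq. (10) can then be assembled as in the following sketch; the threshold value shown for \(\theta\) is a placeholder, and the per-object structures are those of the learning sketch above.

```python
# Sketch of Eq. (10): concatenate the 'best' child-centroid blocks of
# all objects whose minimal distance to the patch is below theta.
# `dictionaries` is the list D^1 ... D^k from the learning sketch.
import numpy as np

def adaptive_dictionary(y, dictionaries, theta=0.5):
    blocks, selected = [], []
    for i, D in enumerate(dictionaries):
        h = [np.min(np.linalg.norm(A_q - y, axis=1)) for A_q in D]  # Eq. (7)
        q_hat = int(np.argmin(h))                                   # Eq. (9)
        if h[q_hat] < theta:                    # similarity condition
            blocks.append(D[q_hat].T)           # A^{i'}: child centroids as columns
            selected.append(i)                  # v_{i'}: original object index
    A = np.hstack(blocks) if blocks else None   # (d+1) x (R k'), Eq. (10)
    return A, selected
```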

With this adaptive dictionary \(\mathbf{A}\), built for patch \(\mathbf{y}\), we can use the Sparse Representation Classification (SRC) methodology [22]. That is, we look for a sparse representation of \(\mathbf{y}\) using the \(\ell _1\)-minimization approach:

$$\begin{aligned} \mathbf{\hat{x}} = \mathop {\text {argmin}}\limits _{\mathbf{x}} ||\mathbf{x}||_1 \ \ \ \text {subject to} \ \ \ \mathbf{A}{} \mathbf{x} = \mathbf{y} \end{aligned}$$
(11)

The residual of the reconstruction is calculated for each selected object \({i'}=1 \dots k'\):

$$\begin{aligned} r_{i'}(\mathbf{y}) = || \mathbf{y} - \mathbf{A} \delta _{i'} (\mathbf{\hat{x}}) || \end{aligned}$$
(12)

where \(\delta _{i'} (\mathbf{\hat{x}})\) is a vector of the same size as \(\mathbf{\hat{x}}\) whose only nonzero entries are the entries of \(\mathbf{\hat{x}}\) corresponding to class \(v(i')=v_{i'}\). Thus, the class of selected test patch \(\mathbf{y}\) is the class with the minimal residual, that is,

$$\begin{aligned} {\hat{i}} (\mathbf{y}) = v({\hat{i'}}) \end{aligned}$$
(13)

where \(\hat{i'} = \text {argmin}_{i'} r_{i'}(\mathbf{y})\).

Finally, the identity of the query object will be the majority vote of the classes assigned to the s selected test patches \(\mathbf{y}^t_p\), for \(p=1 \dots s\):

$$\begin{aligned} \text {identity} ( \mathbf{I}^t ) = \text {mode} ( \hat{i}(\mathbf{y}^t_1), \dots \hat{i}(\mathbf{y}^t_p), \dots \hat{i}(\mathbf{y}^t_s)) \end{aligned}$$
(14)
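The following sketch illustrates Eqs. (11)-(14); since the exact \(\ell _1\) solver is not specified here, it uses scikit-learn's Lasso as a common relaxation of (11), which is our substitution, and it assumes R atoms per selected class, as produced by the adaptive-dictionary sketch.

```python
# SRC-style classification of one test patch (Eqs. (11)-(13)) and the
# majority vote over patches (Eq. (14)). The Lasso relaxation replaces
# the exact l1 program of Eq. (11).
import numpy as np
from sklearn.linear_model import Lasso

def classify_patch(y, A, selected, R, lam=1e-3):
    x_hat = Lasso(alpha=lam, fit_intercept=False,
                  max_iter=10000).fit(A, y).coef_       # sparse code, Eq. (11)
    residuals = []
    for ip in range(len(selected)):                     # one block of R atoms per class
        delta = np.zeros_like(x_hat)
        delta[ip * R:(ip + 1) * R] = x_hat[ip * R:(ip + 1) * R]
        residuals.append(np.linalg.norm(y - A @ delta)) # Eq. (12)
    return selected[int(np.argmin(residuals))]          # Eq. (13)

def vote(patch_classes):
    return np.bincount(patch_classes).argmax()          # mode, Eq. (14)
```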

The selection of the s patches of the query image proceeds as follows:

  • (i) From query image \(\mathbf{I}^t\), m patches are randomly extracted and described using (2): \(\mathbf{y}^t_j\), for \(j=1 \dots m\), with \(m \ge s\).

  • (ii) Each patch \(\mathbf{y}^t_j\) is represented by \(\mathbf{\hat{x}}^t_j\) using the adaptive sparse representation of (11).

  • (iii) The sparsity concentration index (SCI) of each patch is computed in order to evaluate how spread out its sparse coefficients are [22]. SCI is defined by

    $$\begin{aligned} S_j := \text {SCI}(\mathbf{y}^t_j) = \frac{k \cdot \max _{i'} || \delta _{i'} (\hat{\mathbf{x}}^t_j) ||_1 / || \hat{\mathbf{x}}^t_j ||_1 - 1}{k-1} \end{aligned}$$
    (15)

    If a patch is discriminative enough, its SCI is expected to be large. Note that we use k instead of \(k'\) because the concentration of the coefficients must be measured with respect to all k classes.

  • (iv) The array \(\{ S_j \}_{j=1}^m\) is sorted in descending order.

  • (v) The first s patches in this sorted list whose SCI values are greater than a threshold \(\tau \) are selected (see the sketch after this list). If only \(s'\) patches are selected, with \(s' < s\), then the majority vote decision in (14) is taken with these \(s'\) patches.
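A minimal sketch of the SCI computation and of selection steps (iii)-(v); the values shown for s and \(\tau\) are placeholders, not the paper's settings.

```python
# Eq. (15) and steps (iii)-(v): compute the SCI of each patch's sparse
# code and keep the s best patches above the threshold tau. `codes` is
# the list of sparse codes x_hat (R atoms per selected class); k is the
# total number of classes.
import numpy as np

def sci(x_hat, R, k):
    l1 = np.abs(x_hat).sum()
    if l1 == 0:
        return 0.0
    per_class = [np.abs(x_hat[i * R:(i + 1) * R]).sum()   # ||delta_i'(x)||_1
                 for i in range(len(x_hat) // R)]
    return (k * max(per_class) / l1 - 1) / (k - 1)        # Eq. (15)

def select_patches(codes, R, k, s=20, tau=0.1):
    scores = np.array([sci(x, R, k) for x in codes])
    order = np.argsort(-scores)                           # step (iv): descending
    return [j for j in order if scores[j] > tau][:s]      # step (v)
```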

3 Experiments

Our method was tested on the recognition of five classes in baggage screening: handguns, shuriken (ninja stars), clips, razor blades and background (see some samples in Fig. 5). In our experiments, there are 100 X-ray images per class. All images were resized to 128 \(\times \) 128 pixels. We defined the following protocol: from each class, 50 images were randomly chosen for training and one for testing. In order to obtain a better confidence level in the estimation of recognition accuracy, the test was repeated 100 times by randomly selecting 51 new images per class each time (50 for training and 1 for testing). The accuracy reported in all of our experiments is the average over the 100 tests.
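This protocol can be summarized by the following sketch, where `images`, `train_xasr` and `classify_xasr` are hypothetical stand-ins for the dataset and for the learning and testing stages of XASR+.

```python
# Sketch of the evaluation protocol: 100 random trials, 50 training and
# 1 test image per class each time. `images` maps class -> list of 100
# images; train_xasr/classify_xasr are hypothetical stand-ins.
import numpy as np

rng = np.random.default_rng(0)
accuracies = []
for trial in range(100):
    train_set, test_set = {}, {}
    for cls, imgs in images.items():
        idx = rng.permutation(len(imgs))
        train_set[cls] = [imgs[i] for i in idx[:50]]   # 50 training images
        test_set[cls] = imgs[idx[50]]                  # 1 test image
    model = train_xasr(train_set)
    hits = sum(classify_xasr(model, img) == cls for cls, img in test_set.items())
    accuracies.append(hits / len(test_set))
print(np.mean(accuracies))                             # average over 100 tests
```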

The descriptor used by our method was LBP\(_{8,1}^{ri}\), i.e., the rotation-invariant Local Binary Pattern with 8 samples and radius 1 [25]. This yields a 36-bin descriptor (\(d=36\)). The size of the patch was 24 \(\times \) 24 pixels (\(w=24\)).
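A sketch of this descriptor using scikit-image (our choice of library) is shown below; with method='ror' there are exactly 36 distinct rotation-invariant codes for 8 samples, which gives the 36-bin histogram.

```python
# 36-bin rotation-invariant LBP_{8,1} histogram of a patch, using
# scikit-image's 'ror' (rotation-invariant) mapping.
import numpy as np
from skimage.feature import local_binary_pattern

def _min_rotation(v, P=8):
    return min(((v << i) | (v >> (P - i))) & (2**P - 1) for i in range(P))

RI_CODES = sorted({_min_rotation(v) for v in range(256)})  # the 36 codes

def lbp_descriptor(patch):
    codes = local_binary_pattern(patch, P=8, R=1, method='ror').ravel()
    hist = np.array([(codes == c).sum() for c in RI_CODES], dtype=float)
    return hist / hist.sum()                               # d = 36 bins
```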

Table 1. Accuracy [%] of each experiment

In order to evaluate robustness against occlusion, we corrupted the test images with a square of random gray value and size \(a \times a\) pixels, located randomly, for \(a=15, 30, 50, 70\) (see example in Table 1). The obtained results are given in the first row of Table 1 (see XASR+’s row). We observe that the accuracy was more than 95 % in each class when there was no occlusion, and more than 80 % when the object was occluded by less than 30 %.
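The corruption used in this experiment can be reproduced with a sketch like the following; the random square position and uniform gray value reflect our reading of the setup.

```python
# Paste an a x a square of random uniform gray value at a random
# location of a test image (robustness-to-occlusion experiment).
import numpy as np

def occlude(img, a, rng=np.random.default_rng()):
    out = img.copy()
    y = rng.integers(0, img.shape[0] - a + 1)
    x = rng.integers(0, img.shape[1] - a + 1)
    out[y:y + a, x:x + a] = rng.integers(0, 256)   # random gray value
    return out
```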

In order to evaluate the effectiveness of the stop-list, we repeated the same experiment without considering this step. The results are given in the second row of Table 1 (see XASR’s row). We observe that the use of a stop-list can increase the accuracy significantly.

In addition, we compared our method with four well-known methods that can be used in object recognition: (i) SIFT [26], (ii) sparse representation classification (SRC) [22] with SIFT descriptors, (iii) efficient visual search based on an information retrieval approach (Vgoogle) [23], and (iv) bag of words [27] using KNN (BoW-KNN) and random forest (BoW-RF) [28] with SIFT descriptors. We implemented these methods according to the specifications given by the authors in their papers, and set the parameters so as to obtain the best performance. The results are summarized in the corresponding rows of Table 1. They show that XASR+ deals well with unconstrained conditions in every experiment, achieving high recognition performance and obtaining similar or better results than the other representative methods in the literature.

The computing time depends on the size of the dictionary, which is proportional to the number of classes to be detected. In our experiments with 5 classes, the testing stage takes about 0.2 s per test image on a Mac Mini Server (OS X 10.10.1, 2.6 GHz Intel Core i7 with 4 cores, 16 GB 1600 MHz DDR3 RAM).

Fig. 5.

Images used in our experiments. The five classes are: handguns, shuriken, razor blades, clips and background.

4 Conclusions

In this paper, we have presented XASR+, an algorithm that is able to recognize objects automatically under less constrained conditions, including variations in contrast, pose, intra-class appearance, image size and focal distance. We tested the effectiveness of our method on the detection of four different objects: razor blades, shuriken (ninja stars), handguns and clips. In our experiments, the recognition rate was more than 95 % in every class. The robustness of our algorithm is due to three reasons: (i) the dictionaries learned for each class in the learning stage correspond to a rich collection of representations of relevant parts, which were selected and clustered; (ii) the testing stage is based on adaptive sparse representations of several random patches, using the dictionaries estimated in the previous stage that provide the best match with the patches; and (iii) a visual vocabulary and a stop-list are used to reject non-discriminative patches in both the learning and testing stages.