Two-stage approach to extracting visual objects from paper documents
In this paper we present an approach to the automatic detection and identification of important elements in paper documents, including stamps, logos, printed text blocks, signatures and tables. The presented approach consists of two stages. The first one performs object detection by means of an AdaBoost cascade of weak classifiers employing Haar-like features. The resulting image blocks are, at the second stage, subjected to verification based on selected features calculated from recently proposed low-level descriptors, combined with classifiers representing current machine-learning approaches. The training phase of both stages uses bootstrapping, i.e., an iterative process aimed at increasing accuracy. Experiments performed on a large set of digitized paper documents showed that the adopted strategy is useful and efficient.
Keywords: Document segmentation, Object detection, Classifier cascade, Visual descriptors, Low-level features
1 Introduction
From a historical point of view, the paper document has been one of the basic means of human communication across the ages. Although the information in such documents is represented in different languages, structures and forms, they often contain common elements such as stamps, signatures, tables, logos, blocks of text and background. In order to prevent the accumulation of paper, the most valuable documents are scanned and kept as digital copies. Storing data this way makes organizing, accessing and exchanging documents easier but, even then, without a managing system it is difficult to keep things in order. In this paper we present an approach to extracting characteristic visual objects from paper documents. According to , an approach that is able to recognize a digitized paper document may be used to transform it into a hierarchical representation in terms of structure and content, which would allow for easier exchange, editing, browsing, indexing, filing and retrieval.
Our algorithm can be a part of a document management system whose main purpose is to determine the parts of a document that should be processed further (e.g., text ) or be subjected to enhancement and denoising (e.g., graphics, pictures, charts ). It could be an integral part of a content-based image retrieval system, or simply a filter that selects only documents containing specific elements , segregates them in terms of importance (colored documents containing stamps and signatures are more valuable than monochromatic ones, which suggest a copy [5, 10]), etc. The presented approach is document-type independent; hence, it can be applied to any formal documents: diplomas, newspapers, postcards, envelopes, bank checks, etc.
The paper is organized as follows: first we review related works and point out their characteristic features; then, we demonstrate both stages of the algorithm and finally, we present selected experimental results. We conclude the paper with an in-depth discussion.
2 Previous works
A literature survey indicates that the problem examined in this paper has been a subject of study for about three decades (a Google Scholar search reveals that the first paper containing the phrase “page segmentation” dates back to 1985). The first extensive survey of page segmentation and zone classification methods, the problems closest to ours, was done by Okun et al.  and covers papers from 1990 to 1999. In recent years, many more ideas have been developed. Hence, in the following sections the two most popular families are discussed: global (multi-class element detection and classification) and individual (class-specific detection and classification) approaches. We also provide a short review of other two-stage approaches, as a general concept in computer vision.
2.1 Global approach
According to Okun et al. , so-called global approaches can be divided into three categories of methods: bottom-up, top-down and heuristic. Top-down methods can be useful for documents with an initially known structure. The whole document constitutes the input to a top-down algorithm; it is then decomposed into smaller elements such as blocks and lines of text, single words and characters. The bottom-up strategy starts with a pixel-level analysis, and then pixels with common properties are grouped into bigger structures. Bottom-up techniques show their advantages when dealing with documents of varying structure but, due to their complexity, they are often slower. Heuristic procedures attempt to combine the robustness of top-down approaches with the accuracy of bottom-up methods.
Connected component analysis is the most popular approach among bottom-up methods. Small groups of pixels are aggregated into bigger regions based on their proximity, localization and size. This process is accompanied by smearing, nearest neighbor search and Voronoi diagram techniques for component grouping. The described algorithms are quite robust to skew, but the processing cost may vary depending on the selected measures .
A bottom-up strategy is shown in , where documents are segmented into three classes (background, graphics and text). A sliding window technique is used to segment the input image into blocks, and each block is subjected to a feature extraction stage. After an extensive analysis, Sauvola et al.  formulated a number of rules that act as a classifier (extending the rule set increases the number of classes). Blocks with the same label are grouped, and the final bounding box is defined in an iterative masking procedure. The reported accuracy of text detection stands at a high 99 %; unfortunately, results for the other classes were not provided. A very similar approach is presented in , but it uses a different set of features, calculated from the gray-level co-occurrence matrix (GLCM), as well as the k-means algorithm for grouping. The mean accuracy equals 94 %.
In the same survey, a list of top-down strategies was provided. Most of them rely on run-length analysis performed on binarized, skew-corrected documents. As an example, vertical and horizontal run-length histogram profiles are examined in terms of valley occurrence, where valleys represent the white space between blocks. Other solutions include the use of a Gaussian pyramid in combination with low-level features, or clustering of Gabor-filtered pixels.
Heuristic methods combine bottom-up and top-down strategies. The use of the XY-cut algorithm for joining components with the same label, obtained through classification performed on run-length matrix statistics, is a perfect example of such a combination. Another approach makes use of quad-tree adaptive split-and-merge operations  to group or divide regions of high and low homogeneity, respectively. An analysis of the fractal signature value, which is lower for background than for other elements, proves useful when processing documents of high complexity.
Considering zone classification as a separate issue allows us to focus on the multi-class discrimination problem. Keysers et al.  proposed discrimination into eight different classes. The paper provides a comparative analysis of commonly used features. Among them, Tamura’s histogram achieved the highest accuracy but, due to its computational complexity, it was discarded in favor of less complex feature vectors. The reported error rate is equal to 2.1 %, but 72.7 % of logos and 31.4 % of tables were misclassified. Wang et al.  proposed a 69-element feature vector, reduced to 25 elements during a feature selection stage, which allowed them to achieve a mean accuracy of 98.45 %; however, 84.64 % of logos and 72.73 % of “other” elements were misclassified.
2.2 Individual approach
The individual approach focuses on single-class detection and recognition. It is based on the classification of characteristic features, often in a “one versus all” scheme. In our previous works [5, 9] a similar problem of stamp detection and recognition was described in detail; the solution applies Hough line and circle transforms, color segmentation and heuristic techniques. As stated in the above-mentioned literature survey, logo detection is a very similar problem and can be solved with a small tweak to our previously presented solution . Other authors propose to use keypoint-based algorithms like the Scale-Invariant Feature Transform (SIFT), Speeded-Up Robust Features (SURF) and Features from Accelerated Segment Test (FAST), or the Angular Radial Transform (ART). Two-step approaches similar to the methods described in the previous subsection are also highly popular.
Detection of text blocks can be realized by means of statistical analysis , edge extraction  or texture analysis [12, 13]. Other authors made use of stroke filters [14, 15, 16], the cosine transform  and the LBP algorithm .
It should be noted that the intraclass variance of table objects is a huge problem, since they can be very complex. A typical table consists of a header and cells forming rows and columns. The number of cells, rows and columns depends on the volume of information contained. Moreover, the font, ruling and background can be styled differently. In , Hu et al. focused on the different kinds of mistakes that can be made during table detection. They also made the major assumption that the input document contains only one column of text with easily separable, non-overlapping lines . Sameer et al.  proposed a solution based on a line detection algorithm. Although their aim was to reconstruct tables, information on the outermost line intersections could be used to determine table coordinates as well.
Signature and autograph detection methods may be derived from handwriting detection algorithms, but the direct application of those methods is hampered by the high intraclass variance caused by the individual, characteristic style of signatures . When it comes to signature recognition, much more effort has been put into biometric aspects, i.e., recognition carried out on beforehand, manually extracted images of signatures. Zhu et al.  proposed an algorithm consisting of extensive pre-processing, calculation of a multi-scale signature saliency measure for each connected component, and area merging based on proximity and curvilinear constraints. High accuracy (92.8 %) was achieved on the popular Tobacco-800 database.
Keypoint-based algorithms are also popular in signature segmentation. In , the SURF algorithm was used to determine keypoint locations on images containing the results of connected component analysis performed on an image with the text erased (only the signature visible) and on one with the signature erased. For each keypoint a feature vector is extracted and stored in the appropriate database. Components of a query document are labeled according to the closest example from either database. Text-tagged components are erased; thus, the segmented signature is revealed. Connected component analysis is also a crucial part of the solution presented in . That paper provides a comparative analysis of HOG, SIFT, gradient-based features, Local Ternary Patterns (LTP) and global low-level features. Classification is performed by an SVM classifier. Experiments performed on the Tobacco-800 database proved that the set containing gradient and low-level features was the best, achieving 95 % accuracy.
Since only a selection of the most interesting methods is described in this paper, the reader is directed to the survey mentioned at the beginning  for a broad and recent overview of page segmentation and zone classification.
2.3 Two-stage processing concept
We apply a two-stage approach to page segmentation. This concept is definitely not novel in the computer vision field; however, it is rarely used for this particular task. Similar ideas have been applied mostly to the problems of object detection, extraction and classification in other classes of digital images . In most of them, the idea comes from the assumption that the first processing stage performs a rough detection of objects of interest, while the second one applies more precise means to improve the identification accuracy . In many papers, the two-stage approach is related to the integration of features (e.g., appearance and spatio-temporal HOGs , difference-of-Gaussians and accumulated gradient projection vector , entropy of local histograms and heuristic features , edge information and SIFT features ), the combination of classifiers (e.g., SVM and random sample consensus (RANSAC) , two stages of mean-shift clustering ), or mixed approaches (e.g., the Hough transform joined with DBSCAN clustering , edge map and SVM , HOG and SVM , two variants of snakes , particle swarm optimization and a fuzzy classifier ).
The analysis of the literature shows that most of the algorithms use image pre-processing techniques (e.g., document rectification), deal with restricted forms of analyzed documents (e.g., checks) and employ sophisticated features together with multi-tier approaches. The other observation is that there is hardly any method aimed at the detection of all possible classes of visual objects in paper documents. This may be caused by the non-trivial nature of the problem and the differing characteristics of the analyzed graphical elements.
In the proposed approach, we do not apply any pre-processing, and we employ a very efficient AdaBoost cascade implemented using the integral image representation, which gives very high processing speed. It should be stressed that we analyze most of the object types that can be found in documents, a scope that has no significant representation in the literature.
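As a minimal illustration of why the integral image makes the cascade fast (a sketch with our own function names, not code from the paper): once the integral image is built, any rectangular pixel sum, and hence any Haar-like feature, costs only a few array lookups.

```python
import numpy as np

def integral_image(img):
    # Integral image of an integer-valued grayscale image, zero-padded in the
    # first row/column so any rectangular sum can be read with four lookups.
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return ii

def box_sum(ii, r0, c0, r1, c1):
    # Sum of img[r0:r1, c0:c1] in constant time.
    return ii[r1, c1] - ii[r0, c1] - ii[r1, c0] + ii[r0, c0]

def haar_vertical_edge(ii, r, c, h, w):
    # Two-rectangle Haar-like feature: sum(left half) - sum(right half).
    half = w // 2
    return (box_sum(ii, r, c, r + h, c + half)
            - box_sum(ii, r, c + half, r + h, c + w))
```

A cascade evaluates thousands of such features per sliding window, which is only feasible because each feature is O(1) after this precomputation.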
3 Algorithm description
In our approach we adopted the assumption that a successful extraction of visual objects from a paper document can be performed using a sequence of rather simple means. Hence, the developed algorithm consists of two subsequent stages: the first is a rough detection of candidates, while the second is a verification of the found objects. The first stage is based on a fast and simple approach, namely an AdaBoost cascade of classifiers employing Haar-like features. Since it results in a significantly high number of false positives, it is supported by a verification stage that applies an additional classification employing a set of more complex features.
The training of the algorithm (see Fig. 1), in terms of both detection and verification, works in an iterative manner, which yields improved accuracy depending on the quality and volume of the learning sets. As can be seen from Fig. 1, the reference document dataset is subjected to manual cropping of interesting visual objects; this initializes the detector and the verifier. Then, in each step (either detection or verification), the training involves fine-tuning and extending the learning sets: in each iteration the learning set is extended based on the results of accuracy verification. After that, the algorithm stops.
3.1 Cascade training and detection
The training procedure is performed iteratively with bootstrapping. The first, preliminary training initializes the classifier. For this stage we used manually selected positive and negative samples for each class, marked in images collected from the Internet and from SigComp2009 . The number of objects was limited in order to lower the processing time, keeping in mind the assumption that, after this iteration, positive and negative samples would be determined automatically.
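The bootstrapping loop described above can be sketched as follows. This is a schematic outline under our own naming; `train`, `detect` and `is_object` stand in for cascade training, the detector and the labeling of detections, and are not APIs from the paper.

```python
def bootstrap(train, detect, is_object, initial_samples, documents, iterations=2):
    # Iterative bootstrapped training: after each round, detections on the
    # training documents are labeled and fed back into the learning set
    # (true positives as positives, false alarms as negatives).
    samples = list(initial_samples)
    model = train(samples)
    for _ in range(iterations - 1):
        for doc in documents:
            for window in detect(model, doc):
                samples.append((window, is_object(window)))
        model = train(samples)  # retrain on the extended learning set
    return model, samples
```

Each pass automates what the preliminary iteration did by hand: the current model proposes samples, and their verified labels grow the learning set for the next round.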
Number of samples used at the cascade training stage
3.2 Verification stage
Detected candidates are verified using a set of low-level features. The initial learning set, from which the reference features were calculated, consists of 219 manually extracted logos, 452 text blocks, 251 signatures, 1590 stamps, 140 tables and 719 background areas. As in the case of detection, background blocks are used as negative examples, and we do not verify background detection accuracy. After the initial investigations and the analysis of the confusion matrices, in the second iteration of verification we extended the learning set with an extra 60 tables, 120 signatures and 50 text areas. The logotype and stamp classes are quite numerous and, since their verification accuracy was acceptable, they were not extended.
This is a partial solution to the main observed problem, namely that many true-positive samples in the signature and table classes were misclassified during verification.
During our studies we selected eight feature sets, representing different approaches to low-level image description. They are presented in the following sections. Most of them (except the binary version of LBP, denoted LBPB) work on single-channel intensity images and do not rely on color information, which is an advantage.
3.2.1 First-order statistics (FOS)
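As a hedged sketch (the exact attribute set used in the paper may differ), first-order statistics in their common histogram-based form, i.e., mean, variance, skewness, energy and entropy of the intensity distribution, can be computed as:

```python
import numpy as np

def first_order_stats(img, bins=256):
    # First-order statistics derived from the normalized intensity histogram.
    hist, _ = np.histogram(img, bins=bins, range=(0, bins))
    p = hist / hist.sum()                  # probability of each gray level
    levels = np.arange(bins)
    mean = np.sum(levels * p)
    var = np.sum((levels - mean) ** 2 * p)
    skew = np.sum((levels - mean) ** 3 * p) / (var ** 1.5 + 1e-12)
    energy = np.sum(p ** 2)
    entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))
    return {"mean": mean, "variance": var, "skewness": skew,
            "energy": energy, "entropy": entropy}
```

These attributes describe only the gray-level distribution, ignoring spatial arrangement, which is what distinguishes them from the co-occurrence-based sets below.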
3.2.2 Gray-level run-length statistics (GLRLS)
3.2.3 Haralick’s statistics (HS)
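Haralick's statistics are computed from a gray-level co-occurrence matrix; as a hedged sketch (the paper's exact displacement set and statistics are not reproduced here), below is the matrix for horizontal neighbour pairs and one representative statistic, contrast:

```python
import numpy as np

def glcm_contrast(img, levels=8):
    # Gray-level co-occurrence matrix for horizontal neighbour pairs,
    # with intensities quantized to a few levels, plus Haralick contrast.
    q = (img.astype(np.int64) * levels) // 256   # quantize 0..255 -> 0..levels-1
    P = np.zeros((levels, levels))
    for i, j in zip(q[:, :-1].ravel(), q[:, 1:].ravel()):
        P[i, j] += 1
    P /= P.sum()                                 # normalize to probabilities
    idx = np.arange(levels)
    contrast = np.sum(((idx[:, None] - idx[None, :]) ** 2) * P)
    return P, contrast
```

Other Haralick attributes (correlation, energy, homogeneity, entropy, etc.) are different scalar summaries of the same matrix P.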
3.2.4 Neighboring gray-level dependence statistics (NGLDS)
3.2.5 Low-level features (LLF)
3.2.6 Histograms of oriented gradients (HOG)
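A minimal, unnormalized variant of HOG (no block normalization, unsigned gradients; names and parameter defaults are ours) shows the core idea: magnitude-weighted orientation histograms accumulated per cell.

```python
import numpy as np

def hog_cells(img, cell=8, nbins=9):
    # Minimal HOG: per-cell histograms of gradient orientation in [0, 180),
    # weighted by gradient magnitude. Border pixels keep zero gradients.
    img = img.astype(np.float64)
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[:, 1:-1] = img[:, 2:] - img[:, :-2]       # central differences
    gy[1:-1, :] = img[2:, :] - img[:-2, :]
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 180.0  # unsigned orientation
    bin_idx = np.minimum((ang / (180.0 / nbins)).astype(int), nbins - 1)
    h, w = img.shape
    out = np.zeros((h // cell, w // cell, nbins))
    for i in range(h // cell):
        for j in range(w // cell):
            m = mag[i*cell:(i+1)*cell, j*cell:(j+1)*cell]
            b = bin_idx[i*cell:(i+1)*cell, j*cell:(j+1)*cell]
            for k in range(nbins):
                out[i, j, k] = m[b == k].sum()
    return out.ravel()
```

A full HOG implementation additionally normalizes overlapping blocks of cells, which makes the descriptor robust to local illumination changes.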
3.2.7 Local binary patterns (LBP)
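The basic 8-neighbour LBP operator (a sketch of the standard formulation; the binary variant LBPB mentioned above is not shown) encodes each pixel by thresholding its neighbours against the centre and histograms the resulting codes:

```python
import numpy as np

def lbp_histogram(img, bins=256):
    # Basic 8-neighbour LBP: each interior pixel gets an 8-bit code where
    # bit k is set iff the k-th neighbour is >= the centre pixel.
    img = img.astype(np.int32)
    c = img[1:-1, 1:-1]
    code = np.zeros_like(c)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        n = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        code |= ((n >= c).astype(np.int32) << bit)
    hist, _ = np.histogram(code, bins=bins, range=(0, bins))
    return hist / hist.sum()   # normalized code histogram as the descriptor
```

The normalized histogram of codes serves as the texture descriptor; rotation-invariant and "uniform" code groupings are common refinements.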
3.3 Dimensionality reduction
As can be seen from Figs. 6, 7, 8, 9 and 10, many feature vectors have values that are common to all distinguished classes. It is probable that by eliminating them we can reduce the dimensionality of the feature space while retaining recognition accuracy. That is why in the experiments we employed a dimensionality reduction/feature selection substage, namely: principal component analysis (PCA) , linear discriminant analysis (LDA) , information gain (IG)  and the least absolute shrinkage and selection operator (LASSO) . This is an improvement over a recent work .
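Of the four reduction methods, PCA is the simplest to sketch. A minimal SVD-based version (our own helper, not the paper's implementation) projects centred feature vectors onto the k leading principal directions:

```python
import numpy as np

def pca_reduce(X, k):
    # PCA via SVD: centre the data, take the k leading right singular
    # vectors as principal directions, and project onto them.
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt[:k].T          # reduced representation, shape (n, k)
    return Z, Vt[:k]           # components needed to map back
```

LDA, IG and LASSO differ in that they use the class labels: they keep directions or attributes that discriminate between classes rather than those that merely carry variance.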
As can be seen, in some cases only a fraction of the calculated attributes were kept (PCA), while in other cases the reduction algorithm selected more of them (fewer than half in the case of LDA, more than half in the cases of IG and LASSO).
4.1 Detection stage
The lowest accuracy, obtained for the signature detector, results from the differing characteristics of the examples used to train the cascade (high resolution, bright and noise-free background, clear strokes, contrasting ink) and the ones actually found in the test documents (uneven background and ink color, often overlapping with other elements). These observations were taken into account when preparing the data for the second training iteration.
Analyzing the results in Table 2, one can see a significant increase in detection accuracy when using the learning set obtained by two iterations of the training procedure. After the second iteration, there are significantly fewer false detections, yet also a slightly lower number of positive detections. The clearly visible increase in the signature detection rate still leaves it far from ideal. This is caused by the fact that, in most cases, signatures overlap with other elements, such as stamps, text and signature lines.
4.2 Verification stage
The experiments described below were aimed at determining the combination of a classifier and a feature vector (from the selection presented in Sect. 3.2) that gives the highest possible verification accuracy, depending on the quality of the input samples. The classifiers we investigated are: 1-nearest neighbor (1NN), Naïve Bayes (NBayes), binary decision tree (CTree), support vector machine (SVM), general linear model regression (GLM) and classification and regression trees (CART). Two iterations of processing were provided for comparison. In the first iteration, the learning set was composed of the initial features calculated for manually selected samples; verification at this stage involved a selected pair of feature vector and classifier applied to the objects returned by the first iteration of the detection stage (see Sect. 3.1). The second iteration of the verification process employed the extended learning set (see Sect. 3.2) and a feature vector/classifier pair fed with the output returned after the second iteration of detection.
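This experimental protocol amounts to an exhaustive sweep over (feature set, classifier) pairs. Schematically, with a hypothetical `evaluate` callback standing in for training and testing on the actual data:

```python
from itertools import product

def best_pair(feature_sets, classifiers, evaluate):
    # Score every (feature set, classifier) combination and return the
    # winner together with the full score table for later inspection
    # (e.g., per-class comparison of iterations).
    scores = {(f, c): evaluate(f, c) for f, c in product(feature_sets, classifiers)}
    best = max(scores, key=scores.get)
    return best, scores
```

Keeping the full score table, not just the winner, is what allows the per-class and per-iteration comparisons reported below.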
As can be seen from the above figures, logotypes are often classified as stamps. A similar confusion applies to tables, which are sometimes classified as text areas. Moreover, the most problematic are tables which contain, or are overlapped by, graphical elements (e.g., logotypes or stamps).
Stamps verification accuracy [%]
Logos verification accuracy [%]
Texts verification accuracy [%]
Signatures verification accuracy [%]
Tables verification accuracy [%]
4.3 Dimensionality reduction
Verification rate comparison for different dimensionality reduction methods [%]
As can be seen, in most cases LASSO gives the highest accuracy; however, it is still lower than that of classification performed on non-reduced features. Although the difference is not large, introducing these kinds of reduction may not be justified, mainly because of the additional computational overhead. The only exception is when memory space must be conserved, which nowadays is not always crucial. The results of the above experiment show that this substage may be omitted without loss of accuracy.
As shown in Table 4, the verification accuracy for the logo-detecting cascade decreased after the second iteration: a large number of detected samples were misclassified as negative instead of positive. This is due to the quite rigorous character of the classifiers used. Taking into account the accuracy of the detection process (which is also performed through classification), the cascade could be assigned a higher decision weight than the best pair of feature set and classifier used in verification, to compensate for the low precision of the verification stage. A similar situation occurs for tables: again, high detection accuracy is combined with a low verification result. This is caused mostly by the fuzzy boundary separating tables containing text from the pure text class.
The average accuracies achieved at both stages of stamp and text processing suggest that equal decision weights could be assigned to the cascade and to the best combination of feature set and classifier. In both cases, high detection precision is coupled with a high verification result. It is important to note that tables filled with text were classified as text; otherwise, the results would be much lower.
As noted above, the signature class causes most of the problems. The higher detection accuracy is only a result of a much lower FP rate, caused by the extension of the learning set (both in the training of the cascade and at the verification stage). A further increase, especially in the number of positive samples, would be beneficial.
The analysis of the presented verification results shows that each of the discussed object classes should be considered separately. Unfortunately, it is impossible to point out a single classifier/feature vector pair that wins in all cases; there seems to be no single rule behind the above results.
In the case of the stamp class, the most accurate pair consists of the GLM classifier and the HS feature set, with the pair of 1NN classifier and HOG descriptor coming second. Those pairs alternate between iterations. Analogous observations were made for the worst pair: in the first iteration, the GLM classifier with NGLDS features was the worst and NBayes \(+\) GLRLS was second worst, with the reverse relationship in the second iteration. The average accuracy across all sets is equal to 60.17 and 53.3 % in the first and second iterations, respectively. HS is the most accurate descriptor (average accuracy of 70.42 %) in the first iteration and HOG (61.82 % average accuracy) in the second. An accuracy of 63.51 % places the CART classifier first in the first iteration, and 61.86 % places the CTree classifier at the top in the second iteration. Results for the remaining classes are described in a similar manner: the first percentage value always corresponds to the result achieved in the first iteration, and so on.
In both iterations of logo verification, the SVM classifier with the GLRLS feature set proved to be the best. There was no such recurrence in the case of the worst pair. The average accuracy is equal to 48.6 and 29.04 %. The highest average scores were achieved by the SVM classifier (53.31, 34.12 %) and the HOG descriptor (54.58, 38.11 %).
Bayes-based pairs, namely NBayes\(+\)GLRLS and NBayes\(+\)HOG, achieved the highest accuracies in the first and the second iteration of the text verification process, respectively. An analogous switch of the best and second-best pairs, as in the case of stamps, occurred. The overall accuracy stands at 55.52 and 52.99 %. The LBP and HOG descriptors proved to be the most accurate (67.99, 77.3 %). In both iterations NBayes was the best classifier (65.42, 69.26 %).
The analysis of the signature verification results shows that GLM\(+\)LBP achieved high scores in both iterations, only to be beaten by the 1NN\(+\)HOG pair in the second one. The overall accuracy equals 84.02 and 68.94 %. In both iterations the same feature set and classifier produced the highest scores: LBP (85.95, 71.8 %) and 1NN (84.63, 72.12 %).
Only in the case of table verification is there a significant domination of one classifier and feature set pair (NBayes\(+\)LLF) over all other combinations. Although the average accuracy is low (33.47 and 12.66 %), the value achieved by the best pair is satisfactory: the NBayes classifier paired with the LLF feature set reached 69.62 and 54.17 %. The average classification accuracy of the NBayes classifier is equal to 36.29 and 19.14 %, and for the LLF features it stands at 54.58 and 43.81 % in the first and the second iterations, respectively.
4.5 Comparison with state-of-the-art methods
It is not easy to directly compare the obtained results with other state-of-the-art methods, since the benchmark sets are very different. Moreover, a comparison with individual methods may not be justified, because such methods employ class-specific approaches tuned for particular object types. Hence, a necessarily imperfect comparison with certain selected global approaches is provided below. Considering average values, the detection accuracy of our algorithm is equal to 71.93 % and the verification accuracy (calculated for the best individual pairs) is equal to 78.48 %. When we exclude the most problematic class (in terms of detection), namely signatures, the detection accuracy rises to 82.61 % and the verification accuracy slightly drops to 76.59 %. This is because signatures are detected with relatively low accuracy, yet their verification accuracy is quite high. In , the authors obtained an average detection accuracy equal to 81.84 %; however, when we consider only the classes similar to ours (though without the stamp class), the accuracy drops to 72.95 %. The main problem with that approach is the high number of misclassifications of tables; in our algorithm, tables are detected and verified with very high accuracy. In , the mean accuracy for nine classes is equal to 84.38 %. When we restrict the set to be similar to ours (again without the stamp class), it is equal to 89.11 %. The best result was obtained for the printed text class and, again, the most problematic class is logotypes.
As can be seen, our approach is comparable to the state-of-the-art approaches, while featuring a very intuitive processing flow and a significantly lower computational overhead. It also takes into consideration classes that are not analyzed in the above-mentioned approaches, namely stamps and signatures. By enlarging the learning datasets and introducing extra training iterations (at the detection stage), the accuracy may become even higher.
We have presented a novel approach to the extraction of visual objects from digitized paper documents. Its main contribution is a two-stage detection/verification idea based on iterative training and multiple feature–classifier pairs. In contrast to other known methods, the whole framework is common to various classes of objects. It also covers classes that are not considered by other authors in global approaches, namely signatures and stamps. Extensive experiments showed that the whole idea is valid. The high accuracies achieved in an in-depth analysis performed on a large, real document set prove this further. The results from the second iteration (see Table 2) are particularly encouraging. Although there is a high similarity between some classes and there are numerous challenging examples throughout the image database (see Fig. 16), the detection is successful. The signature class is an exception; its lower detection/verification accuracy can be put down to its poor representation across the databases. Increasing the size of the learning set for signature detection would, with a high degree of probability, boost the results, as shown by the comparison of the first and the second iterations.
The high accuracies for certain classes could suggest dropping the verification stage as redundant, if the cascade is viewed as what it really is: a classifier itself. However, as long as there is more than a handful of misclassified samples, the use of this stage is justified. If we decide to use the verification stage, it is important to examine each class separately, as shown in the previous section. This is well illustrated in Table 7: while the overall accuracy is quite low, the accuracy for the LLF feature set is several times higher than for any other feature set. As was shown, the dimensionality reduction substage is not necessary, since it does not improve the classification accuracy.
- 2.Lech, P., Okarma, K.: Fast histogram based image binarization using the Monte Carlo threshold estimation. ICCVG’2014. LNCS vol. 8671, pp. 382–390 (2014)Google Scholar
- 3.Keysers, D., Shafait, F., Breuel, M.T.: Document image zone classification - a simple high-performance approach. 2nd International Conference on Computer Vision Theory and Applications. pp. 44–51 (2007)Google Scholar
- 5.Forczmański, P., Markiewicz, A.: Stamps detection and classification using simple features ensemble. Math. Probl. Eng. Article ID 367879 (2015)Google Scholar
- 6.Okun, O., Doermann, D., Pietikäinen, M.: Page Segmentation and Zone Classification: The State of the Art. Technical Report: LAMP-TR-036/CAR-TR-927/CS-TR-4079, University of Maryland, College Park (1999)Google Scholar
- 7.Sauvola, J., Pietikäinen, M.: Page Segmentation and Classification Using Fast Feature Extraction and Connectivity Analysis. ICDAR, 1995, Proceedings of 3rd International Conference on Document Analysis and Recognition, pp. 1127–1131 (1995)Google Scholar
- 8.Lin, M.-W., Tapamo, J.-R., Ndovie, B.: A texture-based method for document segmentation and classification. S. Afr. Comput. J. 36, 49–56 (2006)Google Scholar
- 9.Forczmański, P., Markiewicz, A.: Low-Level Image Features for Stamps Detection and Classification. 8th International Conference on Computer Recognition Systems (CORES), Advances in Intelligent Systems and Computing 226, pp. 383–392 (2013)Google Scholar
- 10.Forczmański, P., Frejlichowski, D.: Robust Stamps Detection and Classification by Means of General Shape Analysis. International Conference on Computer Vision and Graphics (ICCVG). LNCS vol. 6374, pp. 360–367 (2010)Google Scholar
- 12.Pietikäinen, M., Okun, O.: Edge-based method for text detection from complex document images. Proceedings. Sixth International Conference on Document Analysis and Recognition, pp. 286–291 (2001)Google Scholar
- 15.Liu, Q., Jung, C., Kim, S., Moon, Y., Kim, J.: Stroke Filter for Text Localization in Video Images. IEEE International Conference on Image Processing, pp. 1473 – 1476 (2006)Google Scholar
- 16.Li, X., Wang, W., Jiang, S., Huang, Q., Gao, W.: Fast and effective text detection. 15th IEEE International Conference on Image Processing, pp. 969–972 (2008)Google Scholar
- 18.Ojala, T., Pietikäinen, M., Mäenpää, T.: Gray scale and rotation invariant texture classification with local binary patterns. In Proceedings of the 6th European Conference on Computer Vision, pp. 404–420 (2000)Google Scholar
- 20.Gatos, B., Danatsas, D., Pratikakis, I., Perantonis, S.J.: Automatic Table Detection in Document Images. Pattern Recogn. Data Min. LNCS 3686, 609–618 (2005)Google Scholar
- 22.Ahmed, S., Malik, M.I., Liwicki, M., Dengel, A.: Signature segmentation from document images. International Conference on Frontiers in Handwriting Recognition (ICFHR), pp. 425–429 (2012)
- 23.Cüceloğlu, İ., Oğul, H.: Detecting handwritten signatures in scanned documents. Proceedings of the 19th Computer Vision Winter Workshop, pp. 89–94 (2014)
- 24.Li, S.Z., Hornegger, J.: A two-stage probabilistic approach for object recognition. Computer Vision – ECCV'98, LNCS 1407, pp. 733–747 (1998)
- 26.Mitsui, T., Fujiyoshi, H.: Object detection by joint features based on two-stage boosting. 2009 IEEE 12th International Conference on Computer Vision Workshops (ICCV Workshops), pp. 1169–1176 (2009)
- 27.Chen, Y.-P., Yeh, T.-D.: A Method for Extraction and Recognition of Isolated License Plate Characters. International Journal of Computer Science and Information Security, vol. 5, no. 1 (2009)
- 28.Sikdar, A., Roy, P., Mukherjee, S., Das, M., Banerjee, S.: Two Stage Method for Bengali Text Extraction from Still Images Containing Text. International Conference of Advanced Computer Science & Information Technology (ACSIT-2012), pp. 14–15 (2012)
- 29.Jauregi, E., Lazkano, E., Sierra, B.: Object recognition using region detection and feature extraction. Proceedings of the 10th International Conference Towards Autonomous Robotic Systems (TAROS 2009), pp. 104–111 (2009)
- 30.Kuo, C.-H., Lee, J.-D.: A two-stage classifier using SVM and RANSAC for face recognition. TENCON IEEE Region 10 Conference, pp. 1–4 (2007)
- 31.Papić, V., Turić, H., Dujmić, H.: Two-stage segmentation for detection of suspicious objects in aerial and long-range surveillance applications. Proceedings of the 10th WSEAS International Conference on Automation & Information, pp. 152–156 (2009)
- 32.Niu, J., Lu, J., Xu, M., Lv, P., Zhao, X.: Robust Lane Detection using Two-stage Feature Extraction with Curve Fitting. Pattern Recognition, in press, doi:10.1016/j.patcog.2015.12.010 (2015)
- 34.Han, F., Shan, Y., Cekander, R., Sawhney, H.S., Kumar, R.: A two-stage approach to people and vehicle detection with HOG-based SVM. Performance Metrics for Intelligent Systems (PerMIS'06), pp. 133–140 (2006)
- 36.Setayesh, M., Zhang, M., Johnston, M.: Feature Extraction and Detection of Simple Objects Using Particle Swarm Optimisation. Victoria University of Wellington Technical Report Series no. 09-15 (2009)
- 37.Viola, P., Jones, M.: Rapid object detection using a boosted cascade of simple features. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2001), pp. 511–518 (2001)
- 38.Burduk, R.: The AdaBoost Algorithm with the Imprecision Determine the Weights of the Observations. 6th Asian Conference on Intelligent Information and Database Systems (ACIIDS), LNCS 8398, pp. 110–116 (2014)
- 39.Liwicki, M.: ICDAR 2009 Signature Verification Competition. http://www.iapr-tc11.org/mediawiki/index.php/ICDAR_2009_Signature_Verification_Competition_(SigComp2009). Accessed 24 Feb 2015 (2009)
- 47.Dalal, N., Triggs, B.: Histograms of Oriented Gradients for Human Detection. International Conference on Computer Vision & Pattern Recognition, vol. 2, pp. 886–893 (2005)
- 49.Maturana, D., Mery, D., Soto, Á.: Face Recognition with Local Binary Patterns, Spatial Pyramid Histograms and Naive Bayes Nearest Neighbor Classification. Proceedings of the 2009 International Conference of the Chilean Computer Science Society, pp. 125–132 (2009)
- 50.Abdi, H., Williams, L.J.: Principal component analysis. Wiley Interdiscip. Rev. Comput. Stat. 2(4), 433–459 (2010)
- 51.McLachlan, G.J.: Discriminant Analysis and Statistical Pattern Recognition. Wiley Interscience, New York (2004)
- 52.Battiti, R.: Using mutual information for selecting features in supervised neural net learning. IEEE Trans. Neural Netw. 5(4), 537–550 (1994)
- 53.Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B Methodol. 58(1), 267–288 (1996)
- 54.Markiewicz, A., Forczmański, P.: Detection and classification of interesting parts in scanned documents by means of AdaBoost classification and low-level features verification. Comput. Anal. Images Patterns, LNCS 9257, 529–540 (2015)
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.