
Geometrical Transformation Invariant Approach for Classification of Signatures Using k-NN Classifier

  • Chandrima Ganguly
  • Susovan Jana
  • Ranjan Parekh
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1030)

Abstract

Signature-based authentication is still a very popular approach to verifying human identity. Manual checking is not always accurate and depends on the examiner's expertise, so an automated, accurate system for signature classification is needed. Signatures do not necessarily comprise well-formed letters; a signature can be a random combination of curves and lines, and the written signature may vary in size, be inclined at an arbitrary angle, or be displaced within the image. This makes the classification task more challenging. This paper proposes an automated approach to handwritten signature classification that addresses these problems. The binarized version of the input image is pre-processed in various ways to compensate for translation and rotation and to remove noise. Four features, which do not vary under scaling, are extracted from the pre-processed image and classified using a k-NN classifier. The overall accuracy of the proposed approach is 92% on a dataset of 100 images.

Keywords

Signature classification · Gradient magnitude · Corner point · k-NN classifier

1 Introduction

Modern life is becoming more complex day by day. It requires countless daily transactions in various fields of work: financial, administrative, legal, social, political, economic, business, and travel. With the rapid growth in population, protecting the identity of individuals and ensuring the security of the documents involved have assumed utmost importance. Conventionally, identification and security of paper documents have been ensured using a signature-based system, mostly verified manually by humans with relevant experience. However, since the world is becoming faster every day, cutting down on time and increasing the reliability of verification systems have become an absolute necessity. People are therefore relying more on automated systems and machine intelligence to perform verification tasks. Image processing and pattern recognition techniques have been employed to perform verification in an automated way, by building data models of the signatures and using classifiers to segregate them into pre-defined classes.

The main challenges of a signature recognition system are two-fold. The first arises from the inherent variability of the signature itself. A signature does not necessarily comprise a set of well-formed characters or alphabets; rather, it is generally a combination of lines and curves, intersecting or separate. Building a reliable data model of such an entity is difficult by itself. Moreover, two signatures are never identical: signatures written by the same person are bound to differ to some extent. Apart from the shape of the written signature, its size and orientation are also liable to change. Given so much variability, it seems almost impossible to build a reliable recognition system, especially when others might be purposely trying to forge a valid signature. The second challenge arises from the variability introduced by an automated recognition system: noise introduced by the data acquisition system, variations in lighting conditions and sensor characteristics, and transformations such as translation, rotation, and scaling of the scanned signature. An effective signature recognition system should address both types of challenges. Such a system should also be efficient, as most recognition tasks are expected in real time and should complete without much delay.

The rest of the paper is organized as follows: Sect. 2 includes a study of existing approaches, Sect. 3 outlines the proposed approach, Sect. 4 tabulates experimental results, Sect. 5 consists of an analysis of the current work side by side other works, and Sect. 6 brings up the conclusion and future scopes.

2 Literature Survey

A number of different approaches for signature recognition and classification have been proposed in the existing literature. In the grid-based approach [1], the binary signature image was divided horizontally and vertically into 8 × 8 = 64 grids, and features were calculated in four directions (right to left, left to right, top to bottom, and bottom to top) for each grid: the number of white pixels from the first white pixel to the first black pixel, and the number of white pixels between two black pixels in each grid. The features were classified using a k-NN classifier, achieving 56% accuracy. A drawback of this approach is that the signature needs to be perfectly aligned to create the horizontal and vertical grids; the technique fails for a rotated signature. Local binary patterns have been used in [2, 7, 11]. In [2], the centre pixel operates over a radius of 4.256, generating a 64-bin histogram for each signature image, and the 10 nearest neighbours in the k-NN algorithm classify the signatures; this method achieves an overall 70% accuracy. In [11] a local derivative pattern has been used. In [3], grid features [4, 9] and a centroid feature are combined. That system worked on 36 signatures of one writer together with some forged signatures. The signature image was divided into 10 × 10 grids vertically and horizontally, and each grid containing more than 3 black pixels was recorded as a feature. After generating the grid features, the image was segmented into 3 regular parts and the distances between the centroids of these three parts were calculated. Multiple global features, grid features, and SIFT features have been combined in another system [6]. Although that system uses many features to classify a signature, it still fails to reach a satisfactory accuracy, producing 88% overall, and it cannot handle variations in signature position.
Pixel values differ when the signature is rotated or scaled. To overcome this problem, the proposed approach uses features that do not depend on the position of the signature.

3 Proposed Approach

Most of the existing works on signature recognition focus on modeling the signature shapes using a variety of features, but these techniques usually do not work satisfactorily when the signatures are transformed using translation, rotation, and scaling. This paper utilizes a set of transformation invariant features to compensate for such transformation based variations. The system design in Fig. 1 outlines the procedure.
Fig. 1.

System design of the proposed method

3.1 Image Pre-processing

This pre-processing stage improves the quality of the scanned image before features are extracted and compensates for various geometrical transformations. It consists of four steps: binarization, compensating for rotation, compensating for translation, and noise removal.

3.1.1 Binarization

The RGB image of the signature is converted to a grayscale image \( \left( {I_{G} } \right) \) using Eq. (1). In the second step, it is binarized using Otsu's algorithm [11]. The binary image is then complemented to make the background black, as shown in Fig. 2(d).
Fig. 2.

(a) RGB image (b) Grayscale image (c) Binary image (d) Complemented binary image.

$$ I_{G} = 0.2989 \times R + 0.5870 \times G + 0.1140 \times B $$
(1)
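The conversion of Eq. (1) and Otsu binarization can be sketched as follows. This is an illustrative NumPy implementation, not the paper's own code; the function names are hypothetical, and `binarize` directly produces the complemented (background-black) image of Fig. 2(d):

```python
import numpy as np

def to_grayscale(rgb):
    """Luminosity conversion per Eq. (1)."""
    return 0.2989 * rgb[..., 0] + 0.5870 * rgb[..., 1] + 0.1140 * rgb[..., 2]

def otsu_threshold(gray):
    """Otsu's method: choose the threshold maximizing between-class variance."""
    hist, _ = np.histogram(gray.ravel(), bins=256, range=(0, 256))
    total = hist.sum()
    mu_total = np.dot(np.arange(256), hist) / total
    best_t, best_var = 0, -1.0
    cum_w = cum_mu = 0.0
    for t in range(256):
        cum_w += hist[t]
        cum_mu += t * hist[t]
        w0 = cum_w / total
        if w0 in (0.0, 1.0):          # one class empty: skip
            continue
        mu0 = cum_mu / cum_w
        mu1 = (mu_total * total - cum_mu) / (total - cum_w)
        var_between = w0 * (1 - w0) * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

def binarize(gray):
    """Complemented binary image: dark ink pixels become foreground (True)."""
    return gray <= otsu_threshold(gray)
```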

3.1.2 Compensating for Rotation

In most cases, the signature is not exactly horizontal but inclined at a certain angle. What makes this more challenging is that the inclination angle may vary even for the same person. For reliable recognition, the system needs to compensate for the arbitrary inclination through a normalization step. In this work, the bounding box of the signature is first determined by identifying the topmost (T), leftmost (L), bottommost (B) and rightmost (R) white pixels in the binary image. The width (W) and height (H) of the signature are then calculated:
$$ W = R - L, H = B - T $$
(2)
The length of the diagonal (D) and the angle (A) by which it is inclined to the horizontal are next determined:
$$ D = \sqrt {W^{2} + H^{2} } $$
(3)
$$ A = sin ^{ - 1} \left( {\frac{H}{D}} \right) $$
(4)
Assuming the signature is oriented along the diagonal of the bounding box, it is rotated by the angle A to make it horizontal; see Fig. 3. For the image shown, the rotation angle is positive, i.e. counter-clockwise (+A). If the signature were oriented along the other diagonal of the bounding box, the rotation angle would be negative, i.e. clockwise (−A). To determine which diagonal the signature is oriented along, a simple check tests whether the leftmost point of the signature (green mark) is above or below the rightmost point (yellow mark). If the leftmost point is above the rightmost point (as in Fig. 3(a)) and lies more than one-fourth of the height (H) above it, the rotation angle is +A; otherwise the image remains as input. If the leftmost point is below the rightmost point, the signature is oriented along the other diagonal and the rotation angle is −A. If the height (H) of the signature is greater than its width (W), the rotation angle is +90°, as shown in Fig. 3(c).
Fig. 3.

(a), (c) Signatures before rotation (b), (d) Signatures after rotation (Color figure online)
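The bounding-box geometry of Eqs. (2)–(4) and the diagonal check can be sketched as below. This is a simplified illustration with a hypothetical helper name; the paper's additional quarter-height condition is omitted here for brevity:

```python
import math

def rotation_angle(white_pixels):
    """Estimate the de-skew angle (degrees, counter-clockwise positive)
    from the foreground pixels, given as (row, col) coordinates."""
    rows = [r for r, _ in white_pixels]
    cols = [c for _, c in white_pixels]
    T, B = min(rows), max(rows)
    L, R = min(cols), max(cols)
    W, H = R - L, B - T                       # Eq. (2)
    if H > W:                                 # portrait orientation: rotate +90
        return 90.0
    D = math.hypot(W, H)                      # Eq. (3)
    A = math.degrees(math.asin(H / D))        # Eq. (4)
    # smaller row index means "above" in image coordinates
    left_row = min(r for r, c in white_pixels if c == L)
    right_row = min(r for r, c in white_pixels if c == R)
    return A if left_row < right_row else -A  # +A or -A by diagonal
```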

3.1.3 Compensating for Translation

If the signature has a translation offset within the image, there will be unnecessary borders around the signature, which need to be discarded; the signature is cropped as shown in Fig. 4.
Fig. 4.

Cropped image

Cropping is done by calculating the boundaries [11] of the bounding box as before and keeping only the portion inside the box, discarding everything external to it.
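The crop itself is a small operation on the binary image. A NumPy sketch, assuming foreground pixels are `True` (the function name is illustrative):

```python
import numpy as np

def crop_to_signature(binary):
    """Crop a binary image to the bounding box of its foreground,
    removing the translation offset as in Fig. 4."""
    rows = np.any(binary, axis=1)             # rows containing signature pixels
    cols = np.any(binary, axis=0)             # columns containing signature pixels
    top, bottom = np.where(rows)[0][[0, -1]]
    left, right = np.where(cols)[0][[0, -1]]
    return binary[top:bottom + 1, left:right + 1]
```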

3.1.4 Noise Removal

In the noise removal step, a filtered image is generated by deleting all extra pixels that are not part of the actual signature, such as a spot of ink. To remove this noise, the system computes the area of each connected region of the signature and filters the image using a minimum area value as a threshold, which is fixed for each image.

The red boxes denote the region boundaries before noise removal. In Fig. 5(a), the smallest red box marks one noise spot in the signature image; including it, the total number of region boundaries is 8. After applying the area threshold, which is fixed for each image, Fig. 5(b) is produced as the filtered signature image, with 7 region boundaries.
Fig. 5.

(a) The image with noise (b) The filtered image (Color figure online)
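The area-based filter above can be sketched with a plain connected-component labelling pass. This is an illustrative pure-Python version (8-connectivity assumed; the paper does not state the connectivity or how the per-image threshold is chosen):

```python
from collections import deque

def label_regions(grid):
    """8-connected component labelling of a binary grid (rows of 0/1).
    Returns a list of regions, each a list of (row, col) pixels."""
    h, w = len(grid), len(grid[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for r in range(h):
        for c in range(w):
            if grid[r][c] and not seen[r][c]:
                q, region = deque([(r, c)]), []
                seen[r][c] = True
                while q:                      # breadth-first flood fill
                    y, x = q.popleft()
                    region.append((y, x))
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if 0 <= ny < h and 0 <= nx < w \
                                    and grid[ny][nx] and not seen[ny][nx]:
                                seen[ny][nx] = True
                                q.append((ny, nx))
                regions.append(region)
    return regions

def remove_small_regions(grid, min_area):
    """Blank out regions smaller than min_area pixels (ink spots)."""
    for region in label_regions(grid):
        if len(region) < min_area:
            for y, x in region:
                grid[y][x] = 0
    return grid
```

The same labelling also yields the region count used later as the NR feature.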

3.2 Feature Extraction

Each person has a distinct signature style. To analyze the signatures correctly after normalization, the following features are estimated from the cropped image: Aspect Ratio (AR), Number of Regions (NR), Gradient Magnitude (GM), and Corner Points (CP).

3.2.1 Aspect Ratio

The size of a signature may vary from time to time, but its aspect ratio remains more or less the same: under uniform scaling, if the height of the signature increases or decreases, the width increases or decreases proportionally. The system extracts four boundary pixels to determine the aspect ratio: Top(x, y), Bottom(x, y), Right(x, y) and Left(x, y). The aspect ratio (AR) [4, 5, 8] is computed using Eq. (7).
$$ Width = \left( {Right\_x - Left\_x} \right) $$
(5)
$$ Height = \left( {Bottom\_y - Top\_y} \right) $$
(6)
$$ AR = Height/Width $$
(7)

3.2.2 Number of Regions

Every person has a specific signature style. The letters, lines and curves of a signature are not always connected, so a signature may contain more than one disconnected component, and the number of such components varies from person to person. In the pre-processed version of the signature, these disconnected components appear as distinct binary regions, which do not necessarily contain similar numbers of pixels. The \( NR \) feature is the number of distinct regions; refer to Eq. (8).
$$ NR = \,Total\,number\,of\,binary\,regions $$
(8)
The red boxes in Fig. 6 denote the disconnected components; here the total number of regions is 7.
Fig. 6.

Showing disconnected components of a signature (Color figure online)

3.2.3 Gradient Magnitude

Gradient magnitude is commonly used for texture matching. The gradient magnitude is returned as a dense matrix of the same size as the input image, as shown in Eq. (9), where \( \frac{\partial f}{\partial x} \) is the gradient in the x direction and \( \frac{\partial f}{\partial y} \) is the gradient in the y direction.
$$ \nabla f = \left[ {\begin{array}{*{20}c} {g_{x} } \\ {g_{y} } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} {\frac{\partial f}{\partial x}} \\ {\frac{\partial f}{\partial y}} \\ \end{array} } \right] $$
(9)
$$ G_{xy} = \sqrt {g_{x}^{2} + g_{y}^{2} } $$
(10)
Here the input images are pre-processed signature images. The resulting gradient magnitude matrix combines the magnitudes in the x and y directions; refer to Eq. (10). The gradient magnitude matrix itself cannot be used directly as a classification feature, so its mean value is used instead; refer to Eq. (11). Figure 7 shows the gradient of a signature image.
Fig. 7.

Gradient magnitude of a signature

$$ GM = Mean \left( {G_{xy} } \right) $$
(11)
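Equations (9)–(11) reduce to a few lines of NumPy. A sketch (the function name is illustrative; `np.gradient` uses central differences, one reasonable choice the paper does not pin down):

```python
import numpy as np

def gradient_magnitude_mean(image):
    """GM feature: mean of the per-pixel gradient magnitude, Eqs. (9)-(11).
    The magnitude combines both directions as sqrt(gx^2 + gy^2)."""
    img = np.asarray(image, dtype=float)
    gy, gx = np.gradient(img)          # derivatives along rows (y) and columns (x)
    g = np.sqrt(gx ** 2 + gy ** 2)     # Eq. (10)
    return g.mean()                    # Eq. (11): scalar GM feature
```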

3.2.4 Corner Points

Corners of an image carry important information. The corner points [5, 8] of a signature are detected using the Harris corner detection method, which returns the coordinates of the corner points. Here, the total number of corner points is used as the classification feature CP. Figure 8 shows the corner points of a signature as blue markers.
Fig. 8.

Corner points of a signature (Color figure online)

$$ CP = \,Total\,number\,of\,Corner\,points\,in\,the\,Signature $$
(12)
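A minimal NumPy sketch of the Harris response and the CP count follows. The paper presumably relies on a library implementation; the parameters `k` and `rel_thresh` below, the 3 × 3 box smoothing, and the thresholding rule are all illustrative assumptions:

```python
import numpy as np

def harris_corner_count(image, k=0.04, rel_thresh=0.01):
    """Count pixels whose Harris corner response exceeds a relative
    threshold; CP feature per Eq. (12)."""
    img = np.asarray(image, dtype=float)
    gy, gx = np.gradient(img)

    def box3(a):
        # 3x3 box filter via shifted sums (edge padding)
        p = np.pad(a, 1, mode='edge')
        return sum(p[i:i + a.shape[0], j:j + a.shape[1]]
                   for i in range(3) for j in range(3)) / 9.0

    # smoothed structure-tensor entries
    sxx, syy, sxy = box3(gx * gx), box3(gy * gy), box3(gx * gy)
    r = sxx * syy - sxy ** 2 - k * (sxx + syy) ** 2   # Harris response
    return int(np.count_nonzero(r > rel_thresh * r.max()))
```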

3.2.5 Feature Vector

It is to be observed that each of these features is invariant to rotation, uniform scaling and translation. The final feature vector (FV) is formed from the geometrical-transformation-independent features \( AR, NR, GM,\,and\,CP \). Refer to Eq. (13).
$$ {\text{FV }} = \left\{ { AR, NR,GM,CP} \right\} $$
(13)
Figure 9 shows a scatter plot of the features. The 10 colors represent training samples of 10 distinct classes, illustrating how the features discriminate the different classes of the training dataset.
Fig. 9.

Feature plot for different classes of train data set (Color figure online)

3.3 Classification

Each class i contains n training images and m test images. A Euclidean distance (ED) classifier, a k-NN classifier and a Bayes classifier have been used. The signatures of individual writers are classified based on the feature vector FV. The Euclidean distance (d) of two n-dimensional vectors, p = {p1, p2, …, pn} and q = {q1, q2, …, qn}, is calculated using Eq. (14).
$$ \begin{aligned} d\left( {p,q} \right) = & \sqrt {(p_{1} - q_{1} )^{2} + (p_{2} - q_{2} )^{2} \ldots + (p_{n} - q_{n} )^{2} } \\ = & \sqrt {\mathop \sum \limits_{i = 1}^{n} \left( {p_{i} - q_{i} } \right)^{2} } \\ \end{aligned} $$
(14)
The Bayes classifier uses probabilities based on strong independence assumptions about the features. It classifies a given sample x based on its n features; when the features can take a large number of values, direct probabilistic classification becomes infeasible, and the problem is solved using Eq. (15).
$$ P\left( {C_{k} \,|\,x} \right) = P\left( {C_{k} } \right) \cdot P\left( {x\,|\,C_{k} } \right)/P\left( x \right) $$
(15)

where P(Ck | x) is the posterior, P(Ck) the prior, P(x | Ck) the likelihood, and P(x) the evidence.

The k-Nearest Neighbours algorithm is widely used for classification in pattern recognition. A signature is classified by its nearest neighbours, based on the similarity amongst them: the algorithm compares the features of a test signature with those of the training samples, and the class of the training sample closest to the test signature is selected as the class of the test signature. Finally, the overall system accuracy is calculated using Eq. (16).
$$ Accuracy = \left( {No.\,of\,correctly\,classified\,Sample} \right)/\left( {Total\,no.\,of\,Sample} \right) $$
(16)
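The distance of Eq. (14), the vote, and the accuracy of Eq. (16) can be sketched together. An illustrative NumPy implementation (function names are hypothetical; majority voting with `k=1` reduces to the nearest-neighbour rule described above):

```python
import numpy as np
from collections import Counter

def knn_predict(train_fv, train_labels, test_fv, k=1):
    """Classify each test feature vector {AR, NR, GM, CP} by majority vote
    among its k nearest training vectors (Euclidean distance, Eq. (14))."""
    train_fv = np.asarray(train_fv, dtype=float)
    preds = []
    for x in np.asarray(test_fv, dtype=float):
        d = np.sqrt(((train_fv - x) ** 2).sum(axis=1))   # Eq. (14)
        nearest = np.argsort(d)[:k]
        votes = Counter(train_labels[i] for i in nearest)
        preds.append(votes.most_common(1)[0][0])
    return preds

def accuracy(preds, truth):
    """Eq. (16): fraction of correctly classified samples."""
    return sum(p == t for p, t in zip(preds, truth)) / len(truth)
```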

4 Experimental Results

The proposed approach targets offline signatures [10], where the signatures are collected using pen and paper. In this work, a total of 100 signatures from 10 different persons were used. The dataset comprises 10 signatures from each individual, written in blue or black ink on an A4-size page within fixed-size boxes, with 10 signatures per page. The images were scanned using a smartphone scanner application. Each class contains 10 signatures, divided equally into training and test data. Signatures at different angles and of different sizes were used to determine the effectiveness of the approach. Figure 10 shows one sample of each class from the training dataset; Fig. 11 shows one sample of each class from the testing dataset.
Fig. 10.

Training samples

Fig. 11.

Testing samples

All four features are tested individually to see the classification ability of each feature. Finally, those four features are combined to get an improved accuracy. Table 1 depicts accuracy of individual features as well as combined features using k-NN classifier.
Table 1.

Classification results of feature(s)

Serial no.

Feature(s)

Overall accuracy % (k-NN)

1.

AR

56

2.

NR

44

3.

GM

44

4.

CP

34

5.

{AR, NR, GM, CP}

92

Table 2 shows the overall classification accuracy of the proposed feature vector using three different classifiers. To cross-validate the results and remove any bias from the choice of training and testing datasets, the two sets were exchanged and the accuracies recalculated, also given in Table 2.
Table 2.

Classification result of proposed approach using different classifier

Classifier

Overall accuracy %

Overall accuracy % after exchanging data set

ED classifier

77

76

Bayes classifier

82

78

k-NN classifier

92

89

Figure 12 shows the distribution plot of each test sample based on its Euclidean distance from each training sample of each class. The least distance value of each test sample determines its class. There are 10 classes and 50 test samples, plotted with 10 different color symbols. The x-axis represents the classes and the y-axis the Euclidean distance value of each sample. The red lines are samples of class 1. The green lines are samples of class 2, of which two are misclassified. The blue lines are samples of class 3. The cyan lines are samples of class 4, of which two are misclassified. The magenta lines are samples of class 5. The yellow lines are samples of class 6. The black lines are samples of class 7. The red lines with the cross symbol are samples of class 8, the blue lines with the cross symbol are samples of class 9, and the black lines with the cross symbol are samples of class 10.
Fig. 12.

Classification plot of each test sample (Color figure online)

5 Analysis

Table 3 shows the accuracy of two previous approaches on the same dataset using different classifiers, i.e. the Euclidean Distance (ED) classifier, the Bayes classifier, and the k-Nearest Neighbour (k-NN) classifier.
Table 3.

Classification result of the previous approaches on the same dataset

Approach

Technique

ED classifier

Bayes classifier

k-NN classifier

Ubul et al.

Grid based

56

60

60

Ilmi et al.

LBP based

60

60

70

Figure 13 shows a class-wise accuracy comparison of the proposed approach and the previous approaches. The red line shows that the proposed approach performs better than the previous approaches for most of the classes.
Fig. 13.

Class wise accuracy comparison of the proposed approach and previous approaches (Color figure online)

Figure 14 shows the accuracy of the different approaches using different classifiers. The k-NN classifier gives better results than the other classifiers for all the approaches, and the combination of k-NN with the proposed feature vector gives the best result. The k-NN classification algorithm relies on the distance between feature vectors; it operates on labelled samples and returns the actual category of the image, which is the reason for selecting k-NN. The previous methods by Ubul et al. and Ilmi et al. work on pixel values, which vary with the position of the signature, and fail for rotated signatures, whereas the proposed method handles signatures at different angles and sizes by using translation-, rotation- and scaling-independent features.
Fig. 14.

Comparison of overall accuracy % using different classifier and different approaches

Some systems use a large number of features to recognize and classify a signature, which increases the computational time [6]; the proposed system completes the classification task within a fraction of a second.

6 Conclusion and Future Scope

Signature classification should be accurate, because signature verification is a security requirement in daily life, e.g. in banks, offices, and forensic laboratories. In this paper, a transformation-invariant method for signature recognition has been proposed. The input signature image is converted to a binary image, which is pre-processed in different ways to compensate for translation and rotation and to remove noise. The pre-processed image is used for feature extraction. The extracted features are the aspect ratio (AR), the number of regions (NR), the mean gradient magnitude (GM), and the number of corner points (CP). These features are combined and used to classify 10 signature classes. The k-NN classifier gives the best result compared with the other classifiers used in the experiments, with an overall classification accuracy of 92%.

The contributions of this paper are (a) a ground-truth handwritten signature dataset of 100 images, and (b) a signature classification technique that is independent of translation, rotation, and uniform scaling. Extending the proposed technique to images that are scaled non-uniformly or rotated by 180° may provide scope for future research.

References

  1. Ubul, K., Adler, A., Abliz, G., Yasheng, M., Hamdulla, A.: Off-line Uyghur signature recognition based on modified grid information features. In: 11th International Conference on Information Science, Signal Processing and their Applications (ISSPA). IEEE (2012)
  2. Ilmi, N., Budi, W.T.A., Nur, R.K.: Handwriting digit recognition using local binary pattern variance and K-Nearest neighbor classification. In: 4th International Conference on Information and Communication Technology (ICoICT). IEEE (2016)
  3. Roy, S., Maheshkar, S.: Offline signature verification using grid based and centroid based approach. Int. J. Comput. Appl. 86 (2014)
  4. Rathi, A., Rathi, D., Astya, A.: Offline handwritten signature verification by using pixel based method. Int. J. Eng. Res. Technol. (IJERT) 1 (2012)
  5. Jana, R., Saha, R., Dutta, D.: Offline signature verification using Euclidian distance. Int. J. Comput. Sci. Inf. Technol. 5, 707–710 (2014)
  6. Marušić, T., Marušić, Ž., Šeremet, Ž.: Identification of authors of documents based on offline signature recognition. In: MIPRO, pp. 1144–1149, May 2015
  7. Hiremath, G.: Verification of offline signature using local binary and directional pattern. Int. J. Innovative Sci. Eng. Technol. (IJISET) 3 (2016)
  8. Panchal, T., Patel, H., Panchal, A.: License plate detection using Harris corner and character segmentation by integrated approach from an image. In: 7th International Conference on Communication, Computing and Virtualization, pp. 419–425 (2016)
  9. Bisen, R., Mishra, A.: Offline signature verification with random and skilled forgery detection, using grid based feature extraction. Int. J. Electron. Electr. Comput. Syst. 5 (2016)
  10. Yogesh, G., Patil, A.: Offline and online signature verification systems: a survey. Int. J. Res. Eng. Technol. 3, 328–332 (2014)
  11. Kanetkar, S., Pathania, A., Venugopal, V., Sundaram, S.: Offline writer identification using local derivative pattern. In: 15th International Conference on Frontiers in Handwriting Recognition. IEEE (2016)

Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  • Chandrima Ganguly (1)
  • Susovan Jana (2)
  • Ranjan Parekh (1)
  1. School of Education Technology, Jadavpur University, Kolkata, India
  2. Department of Production Engineering, Jadavpur University, Kolkata, India
