Differentiating Pen Inks in Handwritten Bank Cheques Using Multi-layer Perceptron

Dansena, Prabhat; Bag, Soumen; Pal, Rajarshi

doi:10.1007/978-3-319-69900-4_83

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10597)

Included in the following conference series:

International Conference on Pattern Recognition and Machine Intelligence


Abstract

In handwritten Bank cheques, the addition of new words with a pen of similar color can cause huge losses. Hence, it is important to differentiate the pen inks used in such documents. In this work, we propose a non-destructive pen ink differentiation method that uses statistical features of the ink and a multi-layer perceptron (MLP) classifier. A large sample of blue and black pen ink is acquired from 112 Bank cheque leaves, written by nine volunteers using fourteen different blue and black pens. Handwritten words are extracted manually from the scanned cheque images. Pen ink pixels are identified using K-means binarization. Fifteen statistical features are extracted from each color handwritten word, and the task is formulated as a binary classification problem. An MLP classifier is trained to differentiate pen ink in handwritten Bank cheques. The proposed method achieves an average accuracy of 94.6% on known pen samples and 93.5% on unknown pen samples. We compare the proposed method with an existing method to show its effectiveness.


1 Introduction

Forensic analysis of handwritten Bank cheques from the perspective of differentiating pen ink is of great importance to the judicial system. In handwritten Bank cheque forensics, it is often important to establish a relation between the pen inks, because this helps to identify whether a single pen or multiple pens have been used to write the cheque. Numerous possibilities of fraud exist in handwritten Bank cheques. In this work, we focus only on pen ink differentiation in Bank cheques. The possibilities of fraud in a Bank cheque and their consequences help to show the importance of this work.

An example of the addition of new words to a Bank cheque using a different pen is depicted in Fig. 1 and is elaborated as follows. The cheque was initially issued to Mr. Ravi Kumar Singh for an amount of Seventy thousand only. Later, a forger appended new words in the payee name and amount sections, as marked by red circles in Fig. 1. Such a difference in pen ink cannot always be perceived by the naked eye. This case illustrates how new words can be added to handwritten Bank cheques. Similar handwritten document frauds are possible in bills, business agreements, educational documents, etc. This motivates us to differentiate pen ink in Bank cheques.

Pen ink analysis techniques fall into two major categories: destructive and non-destructive. In the first category, Merrill and Bartick [1] have used the infrared spectrum to differentiate pen ink. Taylor [2] has proposed a method to analyze intersecting lines using stereo microscope, distilled water, and wax lift techniques. Taylor [3] has also proposed a TLC plate analysis method using solvent and a micro-dispenser for pen ink classification. The second category comprises non-destructive techniques, which include modern chromatographic, image processing, and pattern recognition techniques. Khan et al. [4] have used spectral response and the K-means clustering algorithm to identify pen ink differences. Khan et al. [5] have also used Principal Component Analysis to reduce the spectral response features, followed by K-means clustering to differentiate pen ink. Dasari and Bhagvati [6] have proposed statistical features of ink pixels from the HSV color channels, with classification based on a distance measure. Kumar et al. [7, 8] have used statistical features such as gray-level co-occurrence matrix (GLCM), geometric, and Legendre moment features from the $YC_{b}C_{r}$ and opponent color models. In these methods, nearest neighbor and Support Vector Machine classifiers with feature selection have been used to differentiate pen ink. Gorai et al. [9] have extracted twelve feature images from the color input image and its gray version using local binary patterns and Gabor filters. In this method, histograms of the pen ink pixels in the feature images are calculated, and histogram matching is performed to identify ink mismatch.

It is observed that most work on non-destructive ink analysis relies on techniques ranging from hyper-spectral and microscopic imaging to chromatography. These require high-end hardware that is costly and rarely available in the market. In this paper, we propose a method that is capable of differentiating pen ink using simple standard scanning devices. Such devices are easily available and cost effective. In this method, pen ink samples are extracted manually from scanned Bank cheque leaves. K-means binarization is used to identify ink pixels in each color channel of the word images. Statistical features of the ink pixels are extracted from each channel. The extracted feature set is used to train the MLP classifier to identify pen ink differences.

The rest of the paper is organized as follows. Section 2 discusses the proposed methodology for pen ink differentiation in handwritten Bank cheques. Experimental results and relevant discussion are presented in Sect. 3. The concluding remarks are given in Sect. 4.

2 Proposed Model

In the proposed method, pairs of words are analyzed to detect whether they have been written by the same pen or not. The pen ink differentiation problem is formulated as a binary classification problem in which two different pens are used to write on a particular Bank cheque. If the two words of a word pair from the same Bank cheque are written with two different pens, the pair is labeled as class-I; otherwise it is labeled as class-II. The system architecture of the proposed method is depicted in Fig. 2.

2.1 K-means Algorithm Based Foreground Pixel Identification

Identification of pen ink pixels (PI) is an important step in differentiating pen ink in handwritten Bank cheques. We use the K-means algorithm to binarize the word images for this purpose. The basic idea behind K-means is to minimize an objective function (i.e., the within-cluster Euclidean distance), where K is a user-defined parameter. In our experiment, we choose K = 2 to identify PI as foreground pixels. A color handwritten word image extracted from Fig. 1 is taken as input (Fig. 3a), and the corresponding gray image is obtained. The gray version of the input is used to identify the PI in the color handwritten word image. K-means binarization partitions the n gray values into K clusters, which separates the foreground from the background. This binarization method identifies PI as foreground pixels, as depicted in Fig. 3b. The method works well for ink pixel identification because the foreground and background intensity profiles do not overlap in handwritten word images.
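The binarization step can be illustrated with a minimal sketch, assuming a gray image stored as a 2-D list of intensities and dark ink on a light background. The function names `kmeans_binarize` and `ink_coordinates`, the min/max centroid initialization, and the iteration cap are illustrative choices that the paper does not specify.

```python
def kmeans_binarize(gray, iterations=50):
    """Cluster gray values with K = 2; return a mask that is True for ink.

    `gray` is a 2-D list of intensities (0 = black, 255 = white). The darker
    cluster is assumed to be pen ink (an assumption of this sketch).
    """
    values = [v for row in gray for v in row]
    lo, hi = float(min(values)), float(max(values))  # initial centroids
    for _ in range(iterations):
        # Assign every gray value to its nearest centroid.
        dark = [v for v in values if abs(v - lo) <= abs(v - hi)]
        light = [v for v in values if abs(v - lo) > abs(v - hi)]
        # Move each centroid to the mean of its cluster.
        new_lo = sum(dark) / len(dark) if dark else lo
        new_hi = sum(light) / len(light) if light else hi
        if new_lo == lo and new_hi == hi:  # centroids stopped moving
            break
        lo, hi = new_lo, new_hi
    return [[abs(v - lo) <= abs(v - hi) for v in row] for row in gray]


def ink_coordinates(mask):
    """Return the (i, j) = (column, row) coordinates of ink pixels (the set PI)."""
    return [(i, j) for j, row in enumerate(mask) for i, v in enumerate(row) if v]
```

Because only gray values are clustered, the problem is one-dimensional, and the two centroids settle quickly when the ink and paper intensities are well separated.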

2.2 Extraction of Statistical Features from Ink Pixels

Once the coordinates (i, j) of the ink pixels are identified using K-means binarization, the following five statistical features are extracted from each color channel of the ink pixels.

(a) Mean: The mean ($\bar{m}$) of the ink pixels is defined by

$$\bar{m} = \frac{m_{xy}}{N}, \quad \text{where} \tag{1}$$

$$m_{xy} = \sum_{j=0}^{y} \sum_{i=0}^{x} w_{k}(i,j) \mid (i,j) \in PI \tag{2}$$

$$N = \sum_{j=0}^{y} \sum_{i=0}^{x} 1 \mid (i,j) \in PI \tag{3}$$

(b) Variance: The variance (Var) of the ink pixels is defined by

$$Var = \frac{1}{N-1} \sum_{j=0}^{y} \sum_{i=0}^{x} \left[ w_{k}(i,j) - \bar{m} \right]^{2} \mid (i,j) \in PI \tag{4}$$

(c) Skewness: The skewness (Skew) of the ink pixels is defined by

$$Skew = \frac{1}{N} \sum_{j=0}^{y} \sum_{i=0}^{x} \left[ \frac{w_{k}(i,j) - \bar{m}}{\sqrt{Var}} \right]^{3} \mid (i,j) \in PI \tag{5}$$

(d) Kurtosis: The kurtosis (Kurt) of the ink pixels is defined by

$$Kurt = \left\{ \frac{1}{N} \sum_{j=0}^{y} \sum_{i=0}^{x} \left[ \frac{w_{k}(i,j) - \bar{m}}{\sqrt{Var}} \right]^{4} - 3 \right\} \mid (i,j) \in PI \tag{6}$$

(e) Mean Absolute Deviation: The mean absolute deviation (MAD) of the ink pixels is defined by

$$MAD = \frac{1}{N} \sum_{j=0}^{y} \sum_{i=0}^{x} \left| w_{k}(i,j) - \bar{m} \right| \mid (i,j) \in PI \tag{7}$$

Here, N is the total number of foreground pixels, as defined by Eq. 3. The foreground pixels of the handwritten word w(i, j) are the set PI obtained by K-means binarization. The handwritten word in each color channel R, G, and B is denoted by $w_{k}(i,j)$, where $k \in \{R,G,B\}$ and (i, j) are the coordinates of the ink pixels.
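The five statistics can be computed directly from the ink-pixel intensities of one channel, as in the following sketch. It follows Eqs. 1 to 7 as written (variance with N - 1, the other averages with N) and assumes at least two ink pixels with non-constant intensity so that the variance is non-zero. The function names and the `rgb[j][i]` image layout are hypothetical.

```python
import math


def channel_features(values):
    """Five statistics of one colour channel's ink-pixel intensities."""
    n = len(values)
    mean = sum(values) / n                                       # Eq. 1
    var = sum((v - mean) ** 2 for v in values) / (n - 1)         # Eq. 4
    std = math.sqrt(var)
    skew = sum(((v - mean) / std) ** 3 for v in values) / n      # Eq. 5
    kurt = sum(((v - mean) / std) ** 4 for v in values) / n - 3  # Eq. 6
    mad = sum(abs(v - mean) for v in values) / n                 # Eq. 7
    return [mean, var, skew, kurt, mad]


def word_features(rgb, ink):
    """15 features for one word: 5 statistics for each of R, G, and B.

    `rgb[j][i]` is an (R, G, B) tuple and `ink` lists the (i, j) coordinates
    of ink pixels, e.g. as produced by a binarization step.
    """
    features = []
    for k in range(3):  # k = 0, 1, 2 selects the R, G, B channel
        features += channel_features([rgb[j][i][k] for i, j in ink])
    return features
```

Concatenating the outputs of `word_features` for the two words of a pair gives a 2 × 15 = 30 element vector.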

2.3 Differentiation of Pen Ink Using MLP Classifier

An MLP classifier is used to differentiate pen inks. An MLP architecture with an input layer, an output layer, and one hidden layer with seventeen computational nodes is used in our experiments. The sigmoid activation function is used throughout. Features from the two words under consideration are fed into the MLP network to identify whether the same pen has been used or not. The MLP is trained for 5000 iterations at a learning rate of $\alpha$ = 0.2. The trained MLP is then used to classify known and unknown pen samples.
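A minimal sketch of such a network is given below, with one hidden layer of seventeen sigmoid nodes and a single sigmoid output. The paper does not state the loss function, weight initialization, or update scheme, so the squared-error loss, the uniform initialization in [-0.5, 0.5], and the per-sample gradient updates here are assumptions, as is the class name `MLP`.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class MLP:
    """One-hidden-layer perceptron with sigmoid units and one output node."""

    def __init__(self, n_in, n_hidden=17, seed=0):
        rng = random.Random(seed)
        # Each row holds the weights of one node; the last entry is its bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        """Return (hidden activations, output activation) for input x."""
        hidden = [sigmoid(sum(w * v for w, v in zip(row, x)) + row[-1])
                  for row in self.w1]
        out = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.w2[-1])
        return hidden, out

    def train(self, samples, labels, iterations=5000, lr=0.2):
        """Per-sample gradient descent on squared error (assumed loss)."""
        for _ in range(iterations):
            for x, y in zip(samples, labels):
                hidden, out = self.forward(x)
                # Back-propagate the error through the sigmoid units.
                delta_out = (out - y) * out * (1.0 - out)
                delta_hid = [delta_out * self.w2[h] * hidden[h] * (1.0 - hidden[h])
                             for h in range(len(hidden))]
                for h, value in enumerate(hidden):
                    self.w2[h] -= lr * delta_out * value
                self.w2[-1] -= lr * delta_out
                for h, row in enumerate(self.w1):
                    for i, value in enumerate(x):
                        row[i] -= lr * delta_hid[h] * value
                    row[-1] -= lr * delta_hid[h]

    def predict(self, x):
        """Return 1 (different pens, class-I) or 0 (same pen, class-II)."""
        return 1 if self.forward(x)[1] >= 0.5 else 0
```

For word pairs described by thirty features, the network would be built as `MLP(n_in=30)`. Scaling the features to a common range before training is advisable with sigmoid units, although the paper does not say whether this was done.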

3 Experimental Results and Discussion

3.1 Data Set Acquisition

The data are extracted from the IDRBT Cheque Image Dataset [10], which has diverse texture and ink color. A total of 112 cheque leaves from four different Indian Banks are used as source documents. To simulate pen ink differences in cheque leaves, seven blue and seven black pens are used. To avoid writer bias, nine volunteers took part in preparing the data set. A total of 14 $\times$ 9 = 126 pen-volunteer combinations (fourteen pens and nine volunteers) are used for pen ink data generation. In a practical scenario, a pen of similar color is used to add new words to the source document. Each cheque is therefore written by two volunteers using two different pens of the same color (either both blue or both black). Hence, the data set is created with 2 $\times$ $\binom{7}{2}$ = 42 possible combinations of blue and black pens. All the cheque leaves are scanned with an ordinary scanner at 300 dpi resolution. Handwritten words from each scanned cheque are cropped manually and grouped according to the pen used to write them.
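The two counts quoted above can be checked by enumeration. In this short sketch the pen labels P1 to P14 and the volunteer numbering are placeholders.

```python
from itertools import combinations
from math import comb

black_pens = [f"P{i}" for i in range(1, 8)]   # seven black pens
blue_pens = [f"P{i}" for i in range(8, 15)]   # seven blue pens
volunteers = list(range(1, 10))               # nine volunteers

# Every pen paired with every volunteer: 14 x 9 combinations.
pen_volunteer = [(p, v) for p in black_pens + blue_pens for v in volunteers]

# Unordered pairs of distinct pens of the same colour: 2 x C(7, 2).
same_colour_pairs = (list(combinations(black_pens, 2))
                     + list(combinations(blue_pens, 2)))

print(len(pen_volunteer), len(same_colour_pairs), 2 * comb(7, 2))
```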

Table 1. Proposed method accuracy for known and unknown pen.


3.2 Experimental Set-up

In each cheque, two pens $P_{i}$ and $P_{j}$ are used to write m and n different words respectively. The sets $W_{P_{i}}$ and $W_{P_{j}}$ contain the words written by $P_{i}$ and $P_{j}$ respectively, where $W_{P_{i}} = \{m_{1}, m_{2}, \ldots, m_{m}\}$ and $W_{P_{j}} = \{n_{1}, n_{2}, \ldots, n_{n}\}$. Word pairs written by different pens are considered in case-I, and word pairs written by the same pen in case-II.

Case-I: Two different pens are used to write the word pair. The Cartesian products $W_{P_{i}} \times W_{P_{j}}$ and $W_{P_{j}} \times W_{P_{i}}$ together contain all word pairs written using different pens, where $W_{P_{i}} \times W_{P_{j}} = \{(m_{a}, n_{b}) \mid m_{a} \in W_{P_{i}} \wedge n_{b} \in W_{P_{j}}\}$ and $W_{P_{j}} \times W_{P_{i}} = \{(n_{b}, m_{a}) \mid n_{b} \in W_{P_{j}} \wedge m_{a} \in W_{P_{i}}\}$. Thus, the total number of word pairs for class-I is 2 $\times$ (m $\times$ n).

Case-II: The same pen is used to write the word pair. The Cartesian products $W_{P_{i}} \times W_{P_{i}}$ and $W_{P_{j}} \times W_{P_{j}}$ together contain all word pairs written using the same pen, where $W_{P_{i}} \times W_{P_{i}} = \{(m_{a}, m_{b}) \mid m_{a}, m_{b} \in W_{P_{i}}\}$ and $W_{P_{j}} \times W_{P_{j}} = \{(n_{a}, n_{b}) \mid n_{a}, n_{b} \in W_{P_{j}}\}$. After excluding the pairs of a word with itself, the total number of word pairs for class-II is $\{(m \times m) - m\} + \{(n \times n) - n\}$. For each cheque, the total instances of class-I and class-II are calculated and stored. The number of word pairs for case-I and case-II in Fig. 1 can be calculated as follows. The sets of words written using pens $P_{1}$ and $P_{2}$ are $W_{P_{1}}$ = {J, Two, lakh} and $W_{P_{2}}$ = {Ravi, Kumar, Singh, Seventy, thousand} respectively. The total number of word pairs for case-I is $(3 \times 5) + (5 \times 3) = 30$. The number of word pairs belonging to case-II is $\{(3 \times 3) - 3\} + \{(5 \times 5) - 5\} = 6 + 20 = 26$. Thus, the total number of instances, including class-I (30) and class-II (26), is $30 + 26 = 56$.
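The pair counts for the Fig. 1 example can be reproduced by enumerating the ordered pairs, as in the sketch below. The helper name `word_pairs` is hypothetical, and words are represented simply as strings.

```python
from itertools import permutations, product


def word_pairs(words_i, words_j):
    """Return (class-I pairs, class-II pairs) for one cheque written with two pens."""
    # Class-I: ordered pairs whose two words come from different pens.
    different = list(product(words_i, words_j)) + list(product(words_j, words_i))
    # Class-II: ordered pairs of two distinct words written with the same pen.
    same = list(permutations(words_i, 2)) + list(permutations(words_j, 2))
    return different, same


w_p1 = ["J", "Two", "lakh"]
w_p2 = ["Ravi", "Kumar", "Singh", "Seventy", "thousand"]
class_1, class_2 = word_pairs(w_p1, w_p2)
print(len(class_1), len(class_2), len(class_1) + len(class_2))
```

`permutations(words, 2)` yields every ordered pair of distinct positions, which matches the count (m × m) - m obtained by removing the self-pairs from the Cartesian product.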

To simulate pen ink difference identification, seven blue and seven black pens are used on the Bank cheques. Each instance corresponds to one handwritten word pair and consists of 2 $\times$ 15 = 30 features and a class value. The whole data set is divided into three subsets, namely training, validation, and test sets, using the leave-k-out method. We use k = 2, which keeps the samples of two unknown pens out for testing and performance evaluation of the MLP classifier. Keeping two pens out, the total number of possibilities is 2 $\times$ $\binom{7}{2}$ = 42 over the blue and black pen samples. The data remaining after excluding the test subset are partitioned into ten approximately equal parts. One of the ten parts is kept as the validation set, and the remaining parts are used as the training set. This selection is repeated ten times, so that each of the ten parts is used as the validation set exactly once. The training set is used to train the MLP model on inter-class and intra-class differences. Validation checks the performance of the MLP classifier on known pen ink samples. Testing on the test set checks the performance of the MLP model on unknown pen ink samples.
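The splitting procedure can be sketched as follows. This is an illustration under stated assumptions: each instance is taken to be a tuple `(features, label, pen_a, pen_b)`, an instance is sent to the test set whenever either of its pens is held out (the paper does not spell out this rule), and the ten parts are formed by simple striding rather than by any particular shuffling. The function names are hypothetical.

```python
from itertools import combinations


def leave_two_pens_out(instances, pens):
    """Yield (held_out_pens, test, rest) for every pair of pens of one colour."""
    for held_out in combinations(pens, 2):
        # Assumption: any instance touching a held-out pen is an unknown-pen sample.
        test = [x for x in instances if x[2] in held_out or x[3] in held_out]
        rest = [x for x in instances
                if x[2] not in held_out and x[3] not in held_out]
        yield held_out, test, rest


def ten_fold(rest, folds=10):
    """Yield (train, validation) pairs; each part serves as validation once."""
    parts = [rest[k::folds] for k in range(folds)]  # roughly equal parts
    for k in range(folds):
        train = [x for idx, part in enumerate(parts) if idx != k for x in part]
        yield train, parts[k]
```

Running `leave_two_pens_out` once over the seven black pens and once over the seven blue pens gives 21 + 21 = 42 held-out pen pairs.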

3.3 Experimental Results and Comparison

We evaluate the performance of the binary classifier in differentiating pen ink in handwritten Bank cheques. The average accuracy of the MLP classifier for both blue and black pens is presented in Table 1 for known and unknown pen samples, where $P_{1}$–$P_{7}$ are black pens and $P_{8}$–$P_{14}$ are blue pens. To show the effectiveness of the proposed work, result analysis is performed using the leave-two-pens-out method. The average accuracy of the MLP classifier over both blue and black pens is 94.60% for known pen samples and 93.50% for unknown pen samples.

Table 2. Comparison in between proposed and existing method.


We have compared our results with those of Gorai et al. [9], who introduced a technique for ink analysis and difference identification using simple scanning devices. However, that method [9] did not take writer bias into consideration. Our proposed method takes this issue into consideration and provides better results. A comparative analysis of the proposed method and the method in [9] is presented in Table 2.

4 Conclusion

In this paper, we have proposed a method for identifying pen ink differences in handwritten Bank cheques. The differentiation of pen ink is formulated as a binary classification problem. Thirty features are extracted for each word-pair instance. These features are used to train the MLP classifier on known pen ink pixels. The performance of the MLP classifier is evaluated on both known and unknown pen ink pixels. Result analysis and comparison show the superiority of the proposed method over the existing method on both black and blue pen samples.

References

1. Merrill, R.A., Bartick, E.G.: Analysis of ball pen inks by diffuse reflectance infrared spectrometry. J. Forensic Sci. 29(1), 92–98 (1992)
2. Taylor, L.R.: Intersecting lines as a means of fraud detection. J. Forensic Sci. 37(2), 528–541 (1984)
3. Taylor, L.R.: Developments in the analysis of writing inks on questioned documents. J. Forensic Sci. 37(2), 612–619 (1992)
4. Khan, Z., Shafait, F., Mian, A.: Hyperspectral imaging for ink mismatch detection. In: Proceedings of the International Conference on Document Analysis and Recognition, pp. 877–881 (2013)
5. Khan, Z., Shafait, F., Mian, A.: Automatic ink mismatch detection for forensic document analysis. Pattern Recogn. 48(11), 3615–3626 (2015)
6. Dasari, H., Bhagvati, C.: Identification of non-black inks using HSV color spaces. In: Proceedings of the International Conference on Document Analysis and Recognition, pp. 486–490 (2007)
7. Kumar, R., Pal, N.R., Sharma, J.D., Chanda, B.: A novel approach for detection of alteration in ball pen writings. In: Proceedings of International Conference on Pattern Recognition and Machine Intelligence, pp. 400–405 (2009)
8. Kumar, R., Pal, N.R., Sharma, J.D., Chanda, B.: Forensic detection of fraudulent alteration in ball-point pen strokes. IEEE Trans. Inf. Forensics Secur. 7(2), 809–820 (2012)
9. Gorai, A., Pal, R., Gupta, P.: Document fraud detection by ink analysis using texture features and histogram matching. In: International Joint Conference on Neural Networks, pp. 4512–4517 (2016)
10. IDRBT Cheque Image Dataset: http://www.idrbt.ac.in/icid.html

Acknowledgment

A part of this work is sponsored by the project “Design and Implementation of Multiple Strategies to Identify Handwritten Forgery Activities in Legal Documents” (No. ECR/2016/001251, Dt.16.03.2017), SERB, Govt. of India.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Indian Institute of Technology (Indian School of Mines), Dhanbad, Jharkhand, India
Prabhat Dansena & Soumen Bag
Institute for Development and Research in Banking Technology, Hyderabad, India
Rajarshi Pal


Corresponding authors

Correspondence to Prabhat Dansena or Soumen Bag.

Editor information

Editors and Affiliations

Indian Statistical Institute, Kolkata, India
B. Uma Shankar
Indian Statistical Institute, Kolkata, India
Kuntal Ghosh
Indian Statistical Institute, Kolkata, India
Deba Prasad Mandal
Indian Statistical Institute, Kolkata, India
Shubhra Sankar Ray
The Hong Kong Polytechnic University, Hong Kong, China
David Zhang
Indian Statistical Institute, Kolkata, India
Sankar K. Pal


Cite this paper

Dansena, P., Bag, S., Pal, R. (2017). Differentiating Pen Inks in Handwritten Bank Cheques Using Multi-layer Perceptron. In: Shankar, B., Ghosh, K., Mandal, D., Ray, S., Zhang, D., Pal, S. (eds) Pattern Recognition and Machine Intelligence. PReMI 2017. Lecture Notes in Computer Science, vol 10597. Springer, Cham. https://doi.org/10.1007/978-3-319-69900-4_83

DOI: https://doi.org/10.1007/978-3-319-69900-4_83
Published: 01 November 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-69899-1
Online ISBN: 978-3-319-69900-4
eBook Packages: Computer Science, Computer Science (R0)
