Abstract
Braille books deteriorate differently from ordinary printed books. Braille is a tactile script: visually impaired readers touch the raised dots with the pads of their fingers. In frequently read Braille books, therefore, the dots become soiled with use, are worn through into holes, and collapse. In old Braille books, the dots are also crushed and degraded by the pressure of neighbouring books on the library shelf. In this research, we therefore convert Braille books into machine-readable electronic data. First, each page of a Braille book is scanned as an image. Next, the Braille dots on the page are detected one by one using image recognition, and the Braille cells are classified and identified. Errors such as misdetected dots and misidentified cells are then corrected. Finally, the result is saved as character codes.
1 Introduction
In this section we briefly explain Braille and degraded braille.
1.1 Braille and Braille Library
Braille has a long history. In 1825, Braille [3] invented the system in France. In 1879, Megata introduced Braille to Japan, and in 1890 Ishikawa devised Japanese braille, which marks the beginning of Japanese Braille. Since then, Braille has been used as a script for the blind.
A Braille cell consists of six points, each either embossed or flat, arranged in two columns of three. One cell represents one of 63 kinds of characters or modifiers, and with modifiers more than 64 kinds of characters can be represented. In Japanese braille, each voiceless sound is represented by a plain cell, while other characters such as syllabic nasals, voiced consonants, semi-voiced consonants, numerals and alphabetic letters are represented by a modifier and the subsequent cell(s). As shown in Fig. 1, Japanese Braille places a blank space between clauses, much as English places one between words. Some Japanese Braille characters are formed with a prefix, as in the last three examples in Fig. 1. Although ordinary Japanese can be written vertically or horizontally, Japanese braille is written horizontally only.
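The six-dot cell described above can be illustrated with a minimal sketch. It assumes the conventional dot numbering (dots 1 to 3 down the left column, dots 4 to 6 down the right column) and packs a cell into a 6-bit integer in which dot n is bit n-1, which is also the layout of the Unicode Braille Patterns block starting at U+2800. The function names are hypothetical and are not part of the system described in this paper.

```python
from itertools import combinations

# Dots 1-3 run down the left column, dots 4-6 down the right column.
DOTS = (1, 2, 3, 4, 5, 6)


def cell_to_bits(raised):
    """Pack a collection of raised dot numbers (1-6) into a 6-bit integer."""
    bits = 0
    for dot in raised:
        if dot not in DOTS:
            raise ValueError(f"invalid dot number: {dot}")
        bits |= 1 << (dot - 1)
    return bits


def bits_to_unicode(bits):
    """Map a 6-bit cell to its character in the Unicode Braille Patterns block."""
    if not 0 <= bits < 64:
        raise ValueError("a six-dot cell must fit in 6 bits")
    return chr(0x2800 + bits)


# Every non-empty subset of the six dots is a distinct cell.
non_blank_cells = [c for r in range(1, 7) for c in combinations(DOTS, r)]
```

Counting the non-empty subsets gives 2^6 - 1 = 63, which matches the 63 kinds of cells mentioned in the text; the 64th pattern is the blank cell.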
The most primitive tool for writing braille is the slate and stylus. The user presses the tip of the stylus down through small rectangular holes in the slate to emboss the dots. Because the dots are punched from the back of the paper, the user must write each cell as its mirror image and in reverse order. Nowadays, to publish a book in braille, the text is entered according to braille grammar using a braille text editor, revised, and then printed in braille, as shown in Fig. 2.
In 1988, IBM opened “Tenyaku Hiroba (Field of Braille Transcription)”, a Braille information network system, and the digitization of braille became widespread. Braille books produced before the IBM project are stored as they are, without having been converted into electronic data. Although many of them are well worth preserving, braille books are bulky, so braille libraries in various places struggle to keep them. In recent years, each library has been forced to decide whether to discard them or to keep them.
In the early days, when Braille coding was still in the research phase and there was no Windows, no personal computer and no internet, Shimomura et al. [1] studied the coding of Braille.
Braille books are stored in braille libraries throughout the country. About 100 braille libraries are registered with the National Association of Institutions of Information Service for Visually Impaired Persons. Braille books are borrowed from the braille libraries directly, or from public libraries via the braille libraries. In addition, many braille books are provided by the WEB braille library.
1.2 Degraded Braille
Braille is a tactile reading system: one reads the tangible dots with the pad of a finger. Figure 3 shows a typical page of a Braille book. In frequently read braille books, the pages are soiled, holes open in the dots, and the dots collapse. Figure 4 shows fresh dots, and Fig. 5 shows part of a page that contains both normal cells and mirror images of cells. Figures 6, 7, 8 and 9 show degraded braille cells: a collapsed cell, a cell with holes, a dirty cell and a distorted cell, respectively. Tears, creases and stains in pages are shown in Figs. 10, 11 and 12, respectively.
Shimomura et al. [2] also studied the restoration of degraded Braille from the shadows of the dots using fuzzy theory. In that study, the Braille was extracted by hand, and computer programs determined whether dots were present. In this project, computer programs extract the Braille from scanned page images that contain degraded Braille, recognize it and restore the pages.
2 Restoration System
In this section we explain our project to build a restoration system for old books written in Braille. The system performs the following steps:
(1) Page scanning.
(2) Determination of whether each scanned dot is normal or mirrored.
(3) Extraction of cells and classification into 63 categories.
(4) Error correction using the scanning redundancy, that is, the fact that each cell is read twice, once as the normal image and once as the mirror image.
(5) Error correction using Braille grammar.
(6) Translation of Braille into Japanese.
(7) Error correction using Japanese grammar.
As these steps show, the system has two main components: a machine learning system for recognizing Braille and an error correction system.
2.1 Preparation of Braille Images for Machine Learning
To detect degraded cells by machine, we must prepare many images of degraded cells. We have already scanned many old braille books at a resolution of 200 dpi, extracted each cell by hand as a \(54 \times 36\) pixel image, and obtained about 15000 cell images. These images were classified into 63 normal categories and 63 mirror categories. Next, we cut each cell image into its 6 dot areas with a small application program, obtaining about 45000 dot images and about 45000 flat (non-dot) images. Figures 13, 14 and 15 show a normal dot image, a mirrored dot image and a background image used for the machine learning.
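The cutting step can be sketched as follows. This is only an illustration, not the application program used in the project: it assumes the cell image is 54 pixels high and 36 pixels wide, so that each of the 6 dot areas is an 18 by 18 patch, and it assumes the conventional dot numbering. The function name split_cell is hypothetical.

```python
def split_cell(cell, rows=3, cols=2):
    """Split a cell image (a list of pixel rows) into patches keyed by dot number."""
    height, width = len(cell), len(cell[0])
    if height % rows or width % cols:
        raise ValueError("cell size must be divisible by the dot grid")
    patch_h, patch_w = height // rows, width // cols
    patches = {}
    for c in range(cols):
        for r in range(rows):
            dot = c * rows + r + 1  # dots 1-3: left column, dots 4-6: right column
            band = cell[r * patch_h:(r + 1) * patch_h]
            patches[dot] = [row[c * patch_w:(c + 1) * patch_w] for row in band]
    return patches


# Synthetic 54 x 36 image in which every pixel stores the index of its dot area.
cell = [[(y // 18) * 2 + (x // 18) for x in range(36)] for y in range(54)]
patches = split_cell(cell)
```

With about 15000 cells and 6 areas per cell, this kind of cutting yields about 90000 patches, which is consistent with the roughly 45000 dot images and 45000 flat images reported above.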
2.2 Dot Recognition by OpenCV
OpenCV is a widely used computer vision library that provides object detection functions. Detectors for faces, eyes, mouths and so on are supplied with it, and one can also build detectors for arbitrary objects. To build an object detector, one prepares many images of the object to be detected and runs the machine learning on them. In the machine learning, features are extracted from the many images of the object, and the machine learns these features. The result of this learning is called a cascade classifier. OpenCV offers several algorithms for building the classifier. In our system, we use the cascade training routine with LBP (local binary pattern) features to search for dots in a scanned page.
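To give an idea of what an LBP feature measures, the sketch below computes the basic 3 by 3 local binary pattern code in plain Python. It is a textbook form shown for illustration only; the variant used inside OpenCV's cascade training may differ in detail, and the function names here are hypothetical.

```python
# Neighbour offsets (dy, dx), clockwise starting from the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, y, x):
    """Return the 8-bit basic LBP code of the interior pixel (y, x)."""
    center = img[y][x]
    code = 0
    for bit, (dy, dx) in enumerate(OFFSETS):
        # A neighbour at least as bright as the centre sets its bit.
        if img[y + dy][x + dx] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """Histogram of LBP codes over all interior pixels of a grey-scale image."""
    hist = [0] * 256
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            hist[lbp_code(img, y, x)] += 1
    return hist
```

Because each code depends only on whether neighbours are brighter or darker than the centre, the feature is insensitive to uniform changes in brightness, which is a plausible advantage for soiled or unevenly lit pages.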
Because the cells are regularly aligned on the paper, we can correct the skew of a page by using the slope of the dot rows or the edge of the paper. The distribution of the detected dots then determines the row lines in the page.
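One possible way to carry out these two steps is sketched below, under assumptions of our own: the skew is estimated as the least-squares slope of dot centres taken from a single dot row, and row lines are found by grouping dot y-coordinates that lie within a chosen gap of each other. The function names and the max_gap parameter are hypothetical, and the project may use a different procedure.

```python
import math


def skew_angle(points):
    """Least-squares slope of dot centres (x, y) from one dot row, in degrees."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("points must span more than one x position")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return math.degrees(math.atan2(sxy, sxx))


def group_rows(ys, max_gap):
    """Cluster dot y-coordinates into rows and return the mean y of each row."""
    rows, current = [], []
    for y in sorted(ys):
        # A jump larger than max_gap starts a new row.
        if current and y - current[-1] > max_gap:
            rows.append(sum(current) / len(current))
            current = []
        current.append(y)
    if current:
        rows.append(sum(current) / len(current))
    return rows
```

In practice the row grouping would be applied after the page has been rotated by the estimated skew angle, so that dots of one row share nearly the same y-coordinate.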
2.3 Scanning of Old Braille Books
Old Braille books must be handled carefully because of the degradation of their cells described above. We therefore do not use a flatbed scanner but a non-contact overhead scanner, the Fujitsu ScanSnap SV600, which captures the page from above (see Fig. 16), at a resolution of 600 dpi in grey scale.
If a Braille book is printed on both sides and all pages are scanned, every Braille cell is scanned twice: once as the normal image and once as the mirror image. This gives us redundancy for the interpretation of cells.
The next step is dot detection with OpenCV, as described above. The results are shown in Figs. 17 and 18.
2.4 Normal and Reverse Dot Classification by Deep Neural Network
Deep learning is a machine learning technique based on deeply layered, hierarchical neural networks. It is applied to image recognition, speech recognition and so on. We use this technique in our project.
Because Braille books are usually printed on both sides, we must distinguish normal dots from mirrored ones. Both kinds of cell image are shown again in Figs. 19 and 20. We have already obtained many dot images of both kinds, so we use them as reference data for dot classification by the deep neural network.
This procedure gives us a distribution of normal dots and a distribution of mirrored dots. By flipping the mirrored distribution left to right, we obtain a second candidate for the normal dot distribution.
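At the level of cells, the left-to-right flip can be sketched as follows. The sketch assumes that each cell is stored as a 6-bit integer in which dot n is bit n-1 (dots 1 to 3 in the left column, dots 4 to 6 in the right column); this encoding and the function names are assumptions made for illustration. Flipping swaps the two columns of every cell and reverses the order of the cells in a line.

```python
def mirror_cell(bits):
    """Swap the left column (dots 1-3) with the right column (dots 4-6)."""
    left = bits & 0b000111
    right = (bits >> 3) & 0b000111
    return (left << 3) | right


def mirror_line(cells):
    """Flip a line of cells left to right: reverse the order, mirror each cell."""
    return [mirror_cell(c) for c in reversed(cells)]
```

Applying mirror_line twice returns the original line, so the same routine converts in either direction between the normal and the mirrored reading.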
2.5 Cell Classification
Because Braille is regularly aligned, the coordinates of the dots tell us the starting point of the cells. From this point, the dot images are analyzed sequentially, and recognized and classified as cells by the deep neural network trained on the 15000 cell images. Figures 20, 21, 22, 23, 24, 25 and 26 show binary aligned cell images. According to the classification, a character code is assigned to each cell image and stored in a database together with the page number, the line number in the page, the word number in the line and the character number in the word. If no character code can be assigned to a cell image, a scanning error flag is stored in the database as well.
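A possible layout for such a database is sketched below with Python's built-in sqlite3 module. The table name, column names and helper functions are hypothetical, and the paper does not specify which database or schema is actually used; the sketch only shows how a position (page, line, word, character) can key a character code and a scanning error flag.

```python
import sqlite3


def create_store(conn):
    """Create a table keyed by the position of each cell in the book."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS cell (
               page INTEGER NOT NULL,
               line INTEGER NOT NULL,
               word INTEGER NOT NULL,
               pos INTEGER NOT NULL,
               code INTEGER,
               scan_error INTEGER NOT NULL DEFAULT 0,
               PRIMARY KEY (page, line, word, pos))"""
    )


def store_cell(conn, page, line, word, pos, code):
    """Store one classified cell; code=None is recorded as a scanning error."""
    conn.execute(
        "INSERT INTO cell VALUES (?, ?, ?, ?, ?, ?)",
        (page, line, word, pos, code, 1 if code is None else 0),
    )


conn = sqlite3.connect(":memory:")
create_store(conn)
store_cell(conn, 1, 1, 1, 1, 0b000001)  # recognized cell
store_cell(conn, 1, 1, 1, 2, None)      # cell with no character code
errors = conn.execute(
    "SELECT page, line, word, pos FROM cell WHERE scan_error = 1"
).fetchall()
```

Keeping the error flag in the same row as the position makes it straightforward for the later correction steps to look up exactly which cells still need attention.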
2.6 Error Correction by Scanning Redundancy
When a Braille book is printed on both sides and all pages are scanned, every page is scanned twice. The character codes obtained from the normal images and those obtained from the mirrored images (examples of both are shown in Figs. 16 and 17) are stored in the database, and the two sets of codes should be identical to each other. By collating these codes and finding discrepancies, we may be able to correct scanning errors.
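The collation can be sketched as a position-by-position comparison. The sketch assumes that the codes from the mirrored scan have already been flipped into normal order and that a failed recognition is represented by None. The policy of accepting a code when only one scan produced it is our own assumption for illustration, and the function name collate is hypothetical.

```python
def collate(normal, mirrored):
    """Compare per-position codes from the normal and the mirrored scan.

    Returns (merged, conflicts). merged holds the agreed code, or the single
    available code when the other scan failed (None). conflicts lists the
    positions where the scans disagree or where both scans failed.
    """
    if len(normal) != len(mirrored):
        raise ValueError("both scans must cover the same cell positions")
    merged, conflicts = [], []
    for i, (a, b) in enumerate(zip(normal, mirrored)):
        if a is not None and a == b:
            merged.append(a)
        elif a is None and b is not None:
            merged.append(b)
        elif b is None and a is not None:
            merged.append(a)
        else:
            merged.append(None)
            conflicts.append(i)
    return merged, conflicts
```

Positions listed in conflicts are the ones that would be passed on to the grammar-based correction steps.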
2.7 Error Correction by Braille Grammar
Braille has its own grammar. The character codes stored in the database are analyzed according to this grammar. In Japanese Braille, for example:
- The postposition “ha” is written as “wa”.
- The postposition “ha” does not follow numeric characters.
- A long sound written with a kana character is written as a macron, and so on.
When an unacceptable sequence of characters is found, a Braille grammatical error flag is stored in the database. If a probable code can be inferred from the grammar, that code is also stored in the database as a candidate.
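A rule of this kind can be expressed as a scan over the sequence of recognized characters. The sketch below implements only the second example rule and works on romanized tokens, which is a simplification chosen for illustration; the token representation and the function name are hypothetical, and the real checker would operate on the stored character codes.

```python
def check_braille_grammar(tokens):
    """Return indices of tokens that break the sample rule.

    Sample rule: the token "ha" must not directly follow a numeral.
    """
    flagged = []
    for i in range(1, len(tokens)):
        if tokens[i] == "ha" and tokens[i - 1].isdigit():
            flagged.append(i)
    return flagged
```

Each returned index corresponds to a position for which a Braille grammatical error flag would be stored.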
2.8 Error Correction by Japanese Grammar
Following Japanese grammar, we check the sequence of codes, set Japanese grammatical error flags for ungrammatical expressions and, where possible, correct the codes to probable ones. Typical grammatical errors are as follows:
- A sequence of punctuation marks;
- A sequence of contracted sounds;
- Non-corresponding parentheses;
- Incorrect sonant marks.
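Two of the error types listed above lend themselves to a simple character scan, sketched below. The sets of punctuation marks and bracket pairs are small illustrative samples rather than a complete inventory, and the function name find_errors is hypothetical. The other two error types need linguistic knowledge that is not covered here.

```python
PUNCTUATION = {"、", "。"}
PAIRS = {"）": "（", "」": "「"}  # closing bracket -> matching opening bracket


def find_errors(text):
    """Return sorted (index, kind) pairs for two of the listed error types."""
    errors = []
    stack = []  # (index, opening bracket) of brackets not yet closed
    for i, ch in enumerate(text):
        if ch in PUNCTUATION and i > 0 and text[i - 1] in PUNCTUATION:
            errors.append((i, "punctuation sequence"))
        if ch in PAIRS.values():
            stack.append((i, ch))
        elif ch in PAIRS:
            if stack and stack[-1][1] == PAIRS[ch]:
                stack.pop()
            else:
                errors.append((i, "unmatched parenthesis"))
    # Any opening bracket still on the stack was never closed.
    errors.extend((i, "unmatched parenthesis") for i, _ in stack)
    return sorted(errors)
```

The reported indices identify the characters for which a Japanese grammatical error flag would be set.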
2.9 Output and Correction by Hand
Finally, the codes are converted into ink dots representing the Braille and output (see Fig. 27). For codes with an error flag, the ink dots and the candidate Japanese characters are output together and corrected by hand.
3 Concluding Remarks
In this paper, we have explained our project to restore old Braille books by converting them into machine-readable electronic data. Each page of a Braille book is scanned as an image. Using machine learning, the Braille is detected by image recognition, classified and identified. Error correction is then performed. Finally, the machine-readable character codes are stored.
Shimomura and colleagues studied Braille 35 years ago. We resumed our study of Braille at the request of the Japan Braille Library and the Ishikawa Braille Library. In recent years, Braille books have been converted into electronic data, stored, and printed by computer. Earlier books, however, have not been converted into electronic data and remain only as Braille books. For some Braille books, the original cannot be found. The request was to convert Braille books into digital data before they degrade and become unreadable. In response, we resumed our research. We want to restore degraded Braille books quickly.
References
1. Shimomura Y, Mizuno S (1981) Translation of machine readable catalog/cataloging into braille and vice versa. Bull Kanazawa Women’s Junior Coll 23:111–117
2. Shimomura Y, Mizuno S, Hasegawa S (1983) Automatic translation of machine readable data into braille. The Institute of Electronics, Information and Communication Engineers, IEICE Technical report ET82-9:57–62
3. National Association for Providing Facilities for the Visually Impaired (2003) Braille transcription guide, 3rd edn
Acknowledgement
We would like to thank the staff of the following organizations for enabling us to research the actual condition of old Braille books: Ishikawa Association for Providing Facilities for the Visually Impaired and Japan Braille Library.
© 2018 Springer International Publishing AG
Cite this paper
Shimomura, Y., Kawabe, H., Nambo, H., Seto, S. (2018). Construction of Restoration System for Old Books Written in Braille. In: Xu, J., Gen, M., Hajiyev, A., Cooke, F. (eds) Proceedings of the Eleventh International Conference on Management Science and Engineering Management. ICMSEM 2017. Lecture Notes on Multidisciplinary Industrial Engineering. Springer, Cham. https://doi.org/10.1007/978-3-319-59280-0_37
DOI: https://doi.org/10.1007/978-3-319-59280-0_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-59279-4
Online ISBN: 978-3-319-59280-0
eBook Packages: Engineering, Engineering (R0)