Abstract
In this work, a computer vision algorithm was developed for the detection and recognition of 2D kinematic diagrams, both from paper schemes and from digital files. It also works with hand-drawn diagrams, which are correctly identified. The algorithm is mainly based on the free computer vision library OpenCV. It identifies each element of the kinematic diagram and its connections with the other elements, and it stores the pixels of each element, which will allow future research to implement motion in the sketches themselves. Allowed elements are revolute, prismatic, fixed, cylindrical and rigid joints, and rigid bars. The main applications of this work are in teaching, in the quick and graphical communication of ideas, and in fast preliminary designs of new mechanisms: people can draw the diagram on a tablet or on paper and simulate it in real time, avoiding the need to learn how to operate specialized simulation software and the time it takes to prepare the virtual model and obtain its results.
1 Introduction
The design of mechanisms is a problem of great interest in the fields of mechanics and robotics. For this reason, kinematic diagrams or schemes are essential to represent these designs. In addition, hand-made drawings present advantages in terms of efficiency, visualization and communication of new ideas in mechanical design or education [1,2,3]. Nevertheless, analyzing the kinematics on the diagrams themselves can be complicated, and using specialized simulation software requires time to become familiar with it, create virtual models and calculate solutions [1, 3]. Consequently, it would be very useful if virtual design and simulation tools could interact with those freehand drawings in order to streamline the process of solving kinematics and dynamics [1].
On the other hand, computer vision is increasingly present in society. Computer vision techniques allow the detection of different types of shapes and patterns automatically, with multiple technological and industrial applications in fields as diverse as military, automotive, agriculture, medicine, security and surveillance, etc. [4].
The aim of this work is to couple kinematic diagram drawings, both on a digital board and on paper, with virtual design and simulation tools. To that end, an algorithm is proposed that, using tools from the open-source library OpenCV, recognizes kinematic diagrams from digital files (JPG or PNG) and extracts the data needed for simulations.
Currently, the state of the art for these applications is scarce. In [2] and [3], the authors propose and test the recognition of kinematic diagrams composed of revolute joints and rigid bars using computer vision methods and a multi-objective optimization algorithm, specifically NSGA-II. In [1] and [5], different freehand engineering symbols and diagrams are recognized by applying convolutional neural networks (CNN). The algorithm presented in this paper, in contrast, stands out for its simplicity. It is based on finding rectangles, circumferences and segments and then associating these shapes with the elements that constitute kinematic diagrams, distinguishing between rigid bars and guides (straight lines), rigid joints (convergence of at least two rigid bars), fixed joints (three close short lines), revolute joints (circumferences), prismatic joints (rectangles) and cylindrical joints (rectangles with a circumference inside) (see Fig. 1). To achieve this, mathematical morphology operations and other basic computer vision and post-processing techniques are used.
Some important elements are not covered by this symbology in this first study. For the specific case of a revolute joint combined with a fixed joint, the representation shown in Fig. 2 is proposed as an alternative.
Lastly, the pixels of each element of the diagram are stored separately. This makes it possible not only to generate a virtual diagram that represents the original one for the calculation of kinematics, but also to solve the kinematics on the drawing itself and create animations with it.
2 Methodology
In this section the proposed method will be explained. Figure 3 shows a flowchart of the operation of the algorithm.
The first step is to import an RGB image of the kinematic diagram and convert it into a binary one. Then, basic image filters and morphological operators can be applied to clean the noise and enhance certain properties [6]. In addition, a size filter for symbols is calculated from the stroke width and the image dimensions.
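The binarization step can be illustrated with a minimal sketch in plain Python. The paper does not state how the threshold is chosen, so the fixed threshold of 128, the luminance weights and the assumption of dark strokes on a light background are illustrative choices only; in OpenCV the same step would typically be done with cv2.cvtColor and cv2.threshold.

```python
def binarize(rgb_image, threshold=128):
    """Convert rows of (R, G, B) tuples to a binary image: 1 = stroke, 0 = background.

    Assumes dark strokes on a light background (hypothetical convention).
    """
    binary = []
    for row in rgb_image:
        out = []
        for r, g, b in row:
            # Standard luminance weights for RGB to grayscale conversion.
            gray = 0.299 * r + 0.587 * g + 0.114 * b
            out.append(1 if gray < threshold else 0)
        binary.append(out)
    return binary


# Tiny 2 x 3 example: a black stroke pixel and a dark red stroke pixel.
image = [
    [(255, 255, 255), (10, 10, 10), (255, 255, 255)],
    [(255, 255, 255), (200, 30, 30), (255, 255, 255)],
]
binary = binarize(image)
```

Colored strokes, such as the dark red pixel above, are mapped to stroke pixels as long as their luminance stays below the threshold.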
After this brief pre-processing, the algorithm itself starts. First, closed contours, which correspond to circumferences and rectangles, are located with Minimum Bounding Rectangle (MBR) techniques, which enclose each figure in the rectangle of minimum possible area. This is a common strategy for storing approximations of objects whose characteristics will be analyzed in later steps [7]. Because this step relies on closed contours, an initial dilation is normally applied to ensure that all figures are closed. In addition, outer contours could encompass several symbols, so only inner contours are analyzed. Inner contours may be broken for different reasons, for example when a prismatic or cylindrical joint is drawn on a rigid bar (guide) and this guide splits the joint into two parts. To solve this problem, nearby MBRs are merged.
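The merging of nearby MBRs can be sketched as follows. This is a simplified illustration, not the authors' implementation: boxes are assumed to be axis-aligned (x, y, w, h) tuples, and the merge criterion (a hypothetical max_gap in pixels on both axes) is an assumption, since the paper does not specify how "nearby" is measured.

```python
def gap_between(a, b):
    """Horizontal and vertical gaps between two (x, y, w, h) boxes (0 if they overlap on that axis)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    dx = max(bx - (ax + aw), ax - (bx + bw), 0)
    dy = max(by - (ay + ah), ay - (by + bh), 0)
    return dx, dy


def union(a, b):
    """Smallest axis-aligned box containing both boxes."""
    x0, y0 = min(a[0], b[0]), min(a[1], b[1])
    x1 = max(a[0] + a[2], b[0] + b[2])
    y1 = max(a[1] + a[3], b[1] + b[3])
    return (x0, y0, x1 - x0, y1 - y0)


def merge_nearby(boxes, max_gap):
    """Repeatedly merge any two boxes whose gaps are at most max_gap on both axes."""
    boxes = list(boxes)
    merged = True
    while merged:
        merged = False
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                dx, dy = gap_between(boxes[i], boxes[j])
                if dx <= max_gap and dy <= max_gap:
                    boxes[i] = union(boxes[i], boxes[j])
                    del boxes[j]
                    merged = True
                    break
            if merged:
                break
    return boxes


# Two halves of a joint split by a guide, plus an unrelated symbol far away.
halves = [(10, 10, 20, 8), (10, 21, 20, 8), (100, 100, 15, 15)]
result = merge_nearby(halves, max_gap=5)
```

Here the two halves, separated by a 3-pixel gap, collapse into a single box, while the distant symbol is left untouched.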
Then, a first classification is applied. An MBR can enclose a circumference (revolute joint), a rectangle (prismatic joint) or both shapes at the same time (cylindrical joint).
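The mapping from enclosed shapes to joint types can be sketched as below. The paper does not describe how a circumference is told apart from a rectangle, so the fill-ratio test in shape_of is purely an assumption for illustration: a circle fills about π/4 of its bounding rectangle, whereas a rectangle fills nearly all of it. The function names and the tolerance are hypothetical.

```python
import math


def shape_of(contour_area, box_w, box_h, tol=0.1):
    """Guess the shape from how much of its bounding rectangle the contour fills (assumed heuristic)."""
    ratio = contour_area / (box_w * box_h)
    if abs(ratio - math.pi / 4) <= tol:
        return "circumference"
    if ratio >= 1 - tol:
        return "rectangle"
    return "unknown"


def joint_type(shapes):
    """Map the shapes found inside one MBR to the joint types named in the text."""
    shapes = set(shapes)
    if {"circumference", "rectangle"} <= shapes:
        return "cylindrical"
    if "rectangle" in shapes:
        return "prismatic"
    if "circumference" in shapes:
        return "revolute"
    return "unknown"


circle = shape_of(math.pi * 10 * 10, 20, 20)   # circle of radius 10 in a 20 x 20 box
rect = shape_of(390, 20, 20)                   # slightly imperfect hand-drawn rectangle
joint = joint_type([circle, rect])
```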
Before fixed joint recognition begins, all the symbols found in the previous step are erased in order to simplify the image and avoid false positives. Fixed joints are then detected by analyzing the density of stroke pixels: once the other figures have been deleted, the regions with the highest density of points belong to fixed joints.
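The density analysis can be illustrated with a brute-force sliding window over a binary image. This is a sketch under assumptions: the window size and the choice of returning only the single densest window are hypothetical, and the paper does not state how density is actually computed.

```python
def densest_window(binary, size):
    """Return (row, col, count) of the size x size window holding the most stroke pixels."""
    rows, cols = len(binary), len(binary[0])
    best = (0, 0, -1)
    for r in range(rows - size + 1):
        for c in range(cols - size + 1):
            count = sum(binary[r + i][c + j] for i in range(size) for j in range(size))
            if count > best[2]:
                best = (r, c, count)
    return best


# A long rigid bar on row 1 and a compact cluster of short strokes at rows 3-5.
image = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 1, 0, 1, 0],
    [0, 0, 0, 0, 1, 1, 1, 0],
]
best = densest_window(image, 3)
```

The bar contributes at most three pixels to any 3 x 3 window, so the compact cluster wins, which matches the idea that closely packed short lines stand out after the other symbols are erased.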
At this point, only straight lines remain unidentified. They can represent rigid bars, guides or rigid joints. Since the 1960s, the main method for detecting lines has been the Hough Transform (HT) and its later, more sophisticated variants [6, 8]. This work uses the Progressive Probabilistic Hough Transform (PPHT), which is implemented in OpenCV [9] and is more effective and less time-consuming than the traditional Hough Transform [10]. Because line strokes are more than one pixel wide and, being hand-drawn, are not completely straight, applying the PPHT directly would produce duplicate and broken lines [11, 12]. A skeletonization before executing the PPHT removes the repeated information [6]. To avoid broken lines, the initial and final points of the detected lines are considered: if the slopes are similar, the lines are joined into a single long line, whereas if the slopes differ, a rigid joint is identified [11, 12]. Once lines are detected, connections between elements can be established by analyzing the position of the line ends. Finally, a guide is recognized when a rigid bar has the same slope as a prismatic or cylindrical joint and intersection points are found between them.
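The post-processing of the detected segments can be sketched as follows. It assumes the segments have already been obtained (for example from a PPHT run on the skeleton) as pairs of (x, y) endpoints. The thresholds max_dist and max_angle and the function names are hypothetical; the paper does not give the values it uses.

```python
import math


def angle_deg(seg):
    """Orientation of a segment in degrees, folded into [0, 180)."""
    (x1, y1), (x2, y2) = seg
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0


def angle_diff(a, b):
    """Smallest difference between two orientations, in degrees."""
    d = abs(a - b) % 180.0
    return min(d, 180.0 - d)


def relate(seg_a, seg_b, max_dist=5.0, max_angle=10.0):
    """Classify two segments by endpoint proximity and slope similarity."""
    closest = min(math.dist(p, q) for p in seg_a for q in seg_b)
    if closest > max_dist:
        return "separate"
    if angle_diff(angle_deg(seg_a), angle_deg(seg_b)) <= max_angle:
        return "same line"
    return "rigid joint"


def join(seg_a, seg_b):
    """Merge two nearly collinear segments into the longest segment spanned by their endpoints."""
    points = list(seg_a) + list(seg_b)
    return max(((p, q) for p in points for q in points), key=lambda pq: math.dist(*pq))


a = ((0, 0), (50, 2))        # first piece of a broken, slightly wavy bar
b = ((53, 2), (100, 5))      # second piece, 3 pixels away with a similar slope
c = ((100, 5), (100, 60))    # vertical bar meeting the end of b
```

With these inputs, relate(a, b) reports a single broken line that join(a, b) stitches together, while relate(b, c) reports a rigid joint because the slopes differ.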
Once all the elements are recognized, pixel extraction is straightforward. Each element is isolated in turn in a ROI (region of interest) in which all the stroke pixels belong to that element, so the only operation needed is to store the position of those pixels with respect to the global image.
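A minimal sketch of this extraction step is given below, assuming a binary image stored as nested lists and a ROI given as a hypothetical (x, y, w, h) tuple. It also assumes that any other elements have already been removed from the ROI, as the text describes.

```python
def extract_pixels(binary, roi):
    """Return the global (row, col) positions of stroke pixels inside roi = (x, y, w, h)."""
    x, y, w, h = roi
    return [
        (r, c)
        for r in range(y, y + h)
        for c in range(x, x + w)
        if binary[r][c]
    ]


image = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
]
# Only the strokes inside the 2 x 2 ROI are collected; the column on the right is ignored.
pixels = extract_pixels(image, (1, 1, 2, 2))
```

Because the coordinates are kept in the frame of the global image, the stored pixels can later be moved or redrawn without any offset bookkeeping.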
2.1 Stroke Dilations: Structuring Element Size
Stroke dilations are essential for the correct functioning of the algorithm. Up to four dilations are applied, and their sizes are the only mandatory parameters to adjust. The first one is applied before finding the MBRs in order to close possible openings in figures, as explained above. The other three are intended to improve the detection of circumferences, rectangles and fixed joints, respectively, since hand-made drawings can present imperfections that must be removed.
The stroke dilations are obtained by applying rectangular structuring elements formed by square matrices. The larger these matrices are, the wider the dilated stroke will be.
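The effect of the structuring element size can be seen in a small pure-Python sketch of binary dilation with a k x k square element. It assumes an odd k with the anchor at the center; in OpenCV the equivalent operation is typically built from cv2.getStructuringElement with cv2.MORPH_RECT followed by cv2.dilate.

```python
def dilate(binary, k):
    """Dilate a binary image with a k x k square structuring element (k odd, anchor at center)."""
    rows, cols = len(binary), len(binary[0])
    half = k // 2
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if binary[r][c]:
                # Every stroke pixel paints the whole k x k neighborhood around it.
                for rr in range(max(0, r - half), min(rows, r + half + 1)):
                    for cc in range(max(0, c - half), min(cols, c + half + 1)):
                        out[rr][cc] = 1
    return out


# A single stroke pixel in the middle of a 5 x 5 image.
image = [[0] * 5 for _ in range(5)]
image[2][2] = 1

area_k3 = sum(map(sum, dilate(image, 3)))
area_k5 = sum(map(sum, dilate(image, 5)))
```

A single pixel grows to a 3 x 3 block with k = 3 and to a 5 x 5 block with k = 5, which is why larger values close wider gaps in hand-drawn figures.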
3 Results
The results were obtained from a series of tests carried out on a set of drawings made both on a digital whiteboard and on paper. These drawings were produced by different researchers, using different colors and strokes. A series of examples and their corresponding results are shown below (Fig. 4), together with a table of the structuring element sizes used in each case (Table 1).
Although the structuring element size can be any positive integer, no value greater than 9 was needed in these tests. Likewise, even though the values in the table vary widely, the calibration process is intuitive once the execution order of the algorithm is known. First, if the figures in the original image are completely closed, the first dilation can be small. Then, if the detection of circumferences fails, a higher value of the circumference dilation is needed. The same process is then followed for the detection of rectangles and fixed joints.
3.1 Algorithm Limitations
Due to the simplicity and novelty of the algorithm, there are some limitations that future research will try to solve.
- Mechanical symbols not covered in this article cannot be used. The algorithm could fail or show wrong results.
- Crossed bars without contact cannot be used. In this case, the algorithm will correctly detect all the elements of the mechanism. However, it will fail to establish the connections between the elements, interpreting crossed bars as if they had a rigid joint between them.
- Images with more than 1.5 million pixels should be avoided. Some errors were found during experiments when pictures were too big. This is because the size filter applied to symbols, which is based on the image dimensions and the stroke width, was not designed for this type of image.
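The pixel limit in the last item can be checked before running the algorithm. The sketch below is only a suggestion, not part of the published method: it computes a uniform scale factor that would bring an oversized image under the limit. Downscaling is an assumption here, and it would also change the stroke width on which the size filter depends. The names MAX_PIXELS and scale_to_limit are hypothetical.

```python
import math

MAX_PIXELS = 1_500_000  # limit reported in the text


def scale_to_limit(width, height, max_pixels=MAX_PIXELS):
    """Return the uniform scale factor (at most 1.0) that keeps width * height within max_pixels."""
    pixels = width * height
    if pixels <= max_pixels:
        return 1.0
    return math.sqrt(max_pixels / pixels)


small = scale_to_limit(1000, 1000)   # already below the limit
s = scale_to_limit(4000, 3000)       # 12-megapixel photograph of a paper drawing
new_size = (int(4000 * s), int(3000 * s))
```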
4 Conclusions
This work has shown that the recognition of freehand kinematic diagrams, both from digital devices and from paper drawings, is possible even in real time. The proposed algorithm is fast and simple, and it could be run on any electronic device, such as a tablet or a computer. Future research will seek to improve the robustness and reliability of the algorithm while incorporating new, more complex symbology. Finally, although only previously drawn kinematic schemes were considered in this article, an algorithm that identifies each element as it is drawn on a digital whiteboard is being developed in parallel. This will allow the correct identification of more difficult geometries and kinematic diagrams. The final goal is to merge both algorithms into a single one and thus create software with much more potential.
References
1. Eicholtz, M., Kara, L.B.: Intermodal image-based recognition of planar kinematic mechanisms. J. Vis. Lang. Comput. 27, 38–48 (2015). https://doi.org/10.1016/j.jvlc.2014.10.024
2. Fu, L., Kara, L.B.: Recognizing network-like hand-drawn sketches: a convolutional neural network approach. In: Proceedings of the ASME 2009 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, vol. 5, pp. 671–681. ASME, San Diego (2009). https://doi.org/10.1115/DETC2009-87402
3. Eicholtz, M., Kara, L.B., Lohn, J.: Recognizing planar kinematic mechanisms from a single image using evolutionary computation. In: Proceedings of the 2014 Annual Conference on Genetic and Evolutionary Computation (GECCO 2014), New York, pp. 1103–1110 (2014). https://doi.org/10.1145/2576768.2598354
4. Batchelor, B.G.: Machine Vision Handbook, 1st edn. Springer, London (2012). ISBN 978-1-84996-168-4
5. Fu, L., Kara, L.B.: From engineering diagrams to engineering models: visual recognition and applications. J. Comput.-Aided Design 43, 278–292 (2011). https://doi.org/10.1016/j.cad.2010.12.011
6. Davies, E.R.: Computer and Machine Vision, 4th edn. Elsevier (2012). ISBN 978-0-12-386908-1
7. Papadias, D., Theodoridis, Y., Sellis, T., Egenhofer, M.J.: Topological relations in the world of minimum bounding rectangles: a study with R-trees. In: Proceedings of the 1995 ACM SIGMOD International Conference on Management of Data, pp. 92–103 (1995). https://doi.org/10.1145/223784.223798
8. Kälviäinen, H., Hirvonen, P., Xu, L., Oja, E.: Probabilistic and non-probabilistic Hough transforms: overview and comparisons. J. Image Vision Comput. 13, 239–252 (1995). https://doi.org/10.1016/0262-8856(95)99713-B
9. Hough Line Transform, OpenCV. https://docs.opencv.org/4.x/d6/d10/tutorial_py_houghlines.html
10. Matas, J., Galambos, C., Kittler, J.: Robust detection of lines using the progressive probabilistic Hough transform. J. Comput. Vision Image Underst. 78, 119–137 (2000). https://doi.org/10.1006/cviu.1999.0831
11. Samarasekara, N.: Sports analysis using video tracking. Moratuwa (2015). https://doi.org/10.13140/RG.2.2.34195.78883
12. Lowe, D.G.: Three-dimensional object recognition from single two-dimensional images. J. Artif. Intell. 31, 355–395 (1987). https://doi.org/10.1016/0004-3702(87)90070-1
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2023 The Author(s)
About this paper
Cite this paper
Fontenla-Carrera, G., Fernández Vilán, Á.M., Izquierdo Belmonte, P. (2023). Automatic Identification of Kinematic Diagrams with Computer Vision. In: Vizán Idoipe, A., García Prada, J.C. (eds) Proceedings of the XV Ibero-American Congress of Mechanical Engineering. IACME 2022. Springer, Cham. https://doi.org/10.1007/978-3-031-38563-6_62
DOI: https://doi.org/10.1007/978-3-031-38563-6_62
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-38562-9
Online ISBN: 978-3-031-38563-6