1 Introduction

The design of mechanisms is a problem of great interest in mechanics and robotics, and kinematic diagrams (or schemes) are essential for representing these designs. Hand-made drawings also offer advantages in efficiency, visualization and communication of new ideas in mechanical design and education [1,2,3]. Nevertheless, analyzing the kinematics directly on a diagram can be complicated, and using specialized simulation software requires time to become familiar with it, create virtual models and compute solutions [1, 3]. Consequently, it would be very useful if virtual design and simulation tools could interact with these freehand drawings in order to streamline the process of solving kinematics and dynamics [1].

Computer vision, meanwhile, is increasingly present in society. Its techniques allow different types of shapes and patterns to be detected automatically, with technological and industrial applications in fields as diverse as the military, automotive, agriculture, medicine, and security and surveillance [4].

The aim of this work is to couple kinematic diagram drawings, made either on a digital board or on paper, with virtual design and simulation tools. To this end, an algorithm is proposed that, using tools from the open-source library OpenCV, recognizes kinematic diagrams in digital image files (JPG or PNG) and extracts the data needed to set up simulations.

Prior work on these applications is currently scarce. In [2] and [3], the authors propose and test the recognition of kinematic diagrams composed of revolute joints and rigid bars using computer vision methods and a multi-objective optimization algorithm, specifically NSGA-II. In [1] and [5], by contrast, different freehand engineering symbols and diagrams are recognized by applying convolutional neural networks (CNNs). The algorithm presented in this paper stands out for its simplicity. It is based on finding rectangles, circumferences and segments, and then associating these shapes with the elements that constitute kinematic diagrams. It distinguishes between rigid bars and guides (straight lines), rigid joints (convergence of at least two rigid bars), fixed joints (three close short lines), revolute joints (circumferences), prismatic joints (rectangles) and cylindrical joints (rectangles with a circumference inside) (see Fig. 1). To achieve this, mathematical morphology operations and other basic computer vision and post-processing techniques are used.

Fig. 1. Allowed symbology. From left to right: revolute, fixed, prismatic, cylindrical and rigid joints.

This symbology does not cover some important elements, which are left out of this first study. For the specific case of a revolute joint attached to a fixed joint, the alternative representation shown in Fig. 2 is proposed.

Fig. 2. Symbology of a revolute joint attached to a fixed joint (left). Alternative proposed (right).

Lastly, the pixels of each element of the diagram are stored separately. This makes it possible not only to generate a virtual diagram that represents the original one for the calculation of kinematics, but also to solve the kinematics on the drawing itself and create animations with it.

2 Methodology

This section explains the proposed method. Figure 3 shows a flowchart of the algorithm.

Fig. 3. Algorithm flowchart.

The first step is to import an RGB image containing the kinematic diagram and convert it into a binary one. Basic image filters and morphological operators can then be applied to remove noise and enhance certain properties [6]. In addition, a size filter for symbols is calculated from the stroke width and the image dimensions.
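As a minimal illustration of the binarization step, the sketch below converts an RGB image into a binary one in plain Python. It assumes dark strokes on a light background and a fixed threshold of 128; both the threshold and the function name `binarize` are hypothetical choices for this example, not values taken from this work. With OpenCV, the same step would typically rely on `cv2.cvtColor` and `cv2.threshold`.

```python
def binarize(rgb_image, threshold=128):
    """Return a binary image where 1 marks a stroke pixel and 0 the background.

    rgb_image is a list of rows, each row a list of (r, g, b) tuples.
    Assumption: strokes are darker than the background.
    """
    binary = []
    for row in rgb_image:
        out_row = []
        for r, g, b in row:
            # Standard luminance weights for RGB-to-grayscale conversion.
            gray = 0.299 * r + 0.587 * g + 0.114 * b
            out_row.append(1 if gray < threshold else 0)
        binary.append(out_row)
    return binary


WHITE, BLACK, BLUE = (255, 255, 255), (0, 0, 0), (0, 0, 255)
image = [
    [WHITE, BLACK, WHITE],
    [WHITE, BLUE, WHITE],
]
print(binarize(image))  # both the black and the blue stroke become 1
```

A fixed threshold is the simplest option; drawings with uneven lighting, such as photographs of paper, may need an adaptive threshold instead.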

After this brief pre-processing, the algorithm proper begins. First, closed contours, which correspond to circumferences and rectangles, are located using Minimum Bounding Rectangle (MBR) techniques, which enclose each figure in the rectangle of minimum possible area. This is a common strategy for storing approximations of objects whose characteristics will be analyzed in later steps [7]. Because only closed figures are detected in this way, the usual practice is to apply at least an initial dilation to ensure that all figures are closed. In addition, outer contours could encompass several symbols, so only inner contours are analyzed. Inner contours may be broken for various reasons, for example when a prismatic or cylindrical joint is drawn on a rigid bar (guide) and this guide splits the joint into two parts. To solve this problem, nearby MBRs are merged.
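The merging of nearby MBRs can be sketched as follows. This is an illustrative version under stated assumptions: rectangles are axis-aligned and stored as `(x, y, w, h)` tuples (the layout returned by `cv2.boundingRect`), and two rectangles count as "nearby" when they are within a hypothetical `gap` of a few pixels. The actual proximity criterion is not specified in the text.

```python
def near(a, b, gap):
    """True if boxes a and b, given as (x, y, w, h), are within gap pixels."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return (ax - gap <= bx + bw and bx - gap <= ax + aw and
            ay - gap <= by + bh and by - gap <= ay + ah)


def union(a, b):
    """Smallest axis-aligned box that contains both a and b."""
    x0, y0 = min(a[0], b[0]), min(a[1], b[1])
    x1 = max(a[0] + a[2], b[0] + b[2])
    y1 = max(a[1] + a[3], b[1] + b[3])
    return (x0, y0, x1 - x0, y1 - y0)


def merge_nearby(rects, gap):
    """Repeatedly merge any pair of nearby boxes until none remain."""
    rects = list(rects)
    merged = True
    while merged:
        merged = False
        for i in range(len(rects)):
            for j in range(i + 1, len(rects)):
                if near(rects[i], rects[j], gap):
                    rects[i] = union(rects[i], rects[j])
                    del rects[j]
                    merged = True
                    break
            if merged:
                break
    return rects


# Two halves of a joint split by a guide, plus one distant symbol.
boxes = [(10, 10, 20, 8), (10, 21, 20, 8), (100, 100, 10, 10)]
print(merge_nearby(boxes, gap=5))
```

In this example the two halves are joined into a single box, while the distant symbol is left untouched.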

Then, a first classification is applied. An MBR can enclose a circumference (revolute joint), a rectangle (prismatic joint) or both shapes at the same time (cylindrical joint).
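The text does not state how a circumference is told apart from a rectangle, so the sketch below uses one possible heuristic purely as an assumption: the fill ratio of a closed contour within its bounding box. A circle fills about π/4 ≈ 0.785 of its bounding square, whereas a rectangle fills nearly all of its box. The tolerance values and the names `shape_of`, `contains` and `classify` are hypothetical. Areas and boxes are assumed to be available already, for instance from `cv2.contourArea` and `cv2.boundingRect`.

```python
import math


def shape_of(area, box, tol=0.08):
    """Guess the shape of a closed contour from its fill ratio in its box."""
    _, _, w, h = box
    extent = area / (w * h)
    # A circle fills pi/4 of a roughly square box.
    if abs(extent - math.pi / 4) <= tol and 0.8 <= w / h <= 1.25:
        return "circumference"
    # A rectangle fills almost its whole box.
    if extent >= 0.9:
        return "rectangle"
    return "unknown"


def contains(outer, inner):
    """True if box inner lies completely inside box outer."""
    ox, oy, ow, oh = outer
    ix, iy, iw, ih = inner
    return ox <= ix and oy <= iy and ix + iw <= ox + ow and iy + ih <= oy + oh


def classify(contours):
    """Map (area, box) pairs to (joint type, box) pairs."""
    shapes = [(shape_of(area, box), box) for area, box in contours]
    circles = [b for s, b in shapes if s == "circumference"]
    rects = [b for s, b in shapes if s == "rectangle"]
    joints, used = [], set()
    for r in rects:
        inner = [c for c in circles if contains(r, c)]
        if inner:
            used.update(inner)  # these circles belong to a cylindrical joint
            joints.append(("cylindrical", r))
        else:
            joints.append(("prismatic", r))
    joints += [("revolute", c) for c in circles if c not in used]
    return joints


contours = [
    (314.0, (0, 0, 20, 20)),     # circle of radius 10
    (800.0, (50, 0, 40, 20)),    # plain rectangle
    (960.0, (100, 0, 40, 24)),   # rectangle ...
    (201.0, (112, 4, 16, 16)),   # ... with a circle of radius 8 inside
]
for joint in classify(contours):
    print(joint)
```

Hand-drawn shapes deviate from these ideal ratios, so the tolerances would need tuning on real drawings.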

Before the fixed joints are recognized, all the symbols found in the previous step are erased in order to simplify the image and avoid false positives. Fixed joints are then detected by analyzing the density of stroke pixels: once the other figures have been deleted, the regions with the highest density of points belong to them.
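One simple way to look for the densest region is a sliding window that counts stroke pixels, sketched below. The window size and the function name `densest_window` are assumptions made for illustration; the text does not specify how the density is measured. The example image contains a thin bar and a compact cluster that stands in for the short lines of a fixed joint.

```python
def densest_window(binary, win):
    """Return ((row, col), density) for the win x win window with most stroke pixels."""
    rows, cols = len(binary), len(binary[0])
    best_pos, best_count = None, -1
    for r in range(rows - win + 1):
        for c in range(cols - win + 1):
            count = sum(binary[r + i][c + j]
                        for i in range(win) for j in range(win))
            if count > best_count:
                best_pos, best_count = (r, c), count
    return best_pos, best_count / (win * win)


drawing = [
    [1, 1, 1, 1, 1, 1, 1, 1],  # a thin rigid bar
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1, 1],  # dense cluster (fixed joint candidate)
    [0, 0, 0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 1, 1, 1],
]
print(densest_window(drawing, win=3))
```

The thin bar never fills more than a third of a 3 x 3 window, so the cluster is selected. For large images, a summed-area table would avoid recomputing each window from scratch.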

At this point, only straight lines remain unidentified. They can represent rigid bars, guides or rigid joints. Since the 1960s, the main method for detecting lines has been the Hough Transform (HT) and its later, more sophisticated variants [6, 8]. This work uses the Progressive Probabilistic Hough Transform (PPHT), which is implemented in OpenCV [9] and is more effective and less time-consuming than the traditional Hough Transform [10]. Because line strokes are more than one pixel wide and, being hand-made, are not completely straight, applying the PPHT directly would yield duplicate and broken lines [11, 12]. A skeletonization therefore removes the repeated information [6] before the PPHT is executed. To avoid broken lines, the initial and final points of the detected lines are considered: if the slopes are similar, the lines are joined into a single long line, whereas if the slopes differ, a rigid joint is identified [11, 12]. Once the lines are detected, connections between elements can be established by analyzing the positions of the line ends. Finally, a guide is recognized when a rigid bar has the same slope as a prismatic or cylindrical joint and intersection points are found between them.
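The post-processing of the detected segments can be sketched as below. Segments are stored as `(x1, y1, x2, y2)`, the layout returned by OpenCV's `cv2.HoughLinesP`. The thresholds `max_gap` and `max_angle`, and the helper names, are hypothetical values chosen for this example; the requirement that the segment ends be close to each other is likewise an assumption of the sketch.

```python
import math


def angle_deg(seg):
    """Orientation of a segment in degrees, folded into [0, 180)."""
    x1, y1, x2, y2 = seg
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0


def angle_diff(a, b):
    """Smallest angle between the directions of two segments."""
    d = abs(angle_deg(a) - angle_deg(b))
    return min(d, 180.0 - d)


def endpoints(seg):
    return [(seg[0], seg[1]), (seg[2], seg[3])]


def touching(a, b, max_gap):
    """True if any end of a lies within max_gap of any end of b."""
    return any(math.dist(p, q) <= max_gap
               for p in endpoints(a) for q in endpoints(b))


def relate(a, b, max_gap=5.0, max_angle=10.0):
    """Return 'merge', 'rigid_joint', or None for two segments."""
    if not touching(a, b, max_gap):
        return None
    return "merge" if angle_diff(a, b) <= max_angle else "rigid_joint"


def merge(a, b):
    """Join two segments into the longest span between their end points."""
    pts = endpoints(a) + endpoints(b)
    p, q = max(((p, q) for p in pts for q in pts),
               key=lambda pq: math.dist(pq[0], pq[1]))
    return (*p, *q)


first = (0, 0, 50, 1)
second = (53, 1, 100, 2)       # continues the first after a small break
third = (100, 2, 100, 60)      # meets the second at a right angle
print(relate(first, second), merge(first, second))
print(relate(second, third))
print(relate(first, third))
```

Folding angles into [0, 180) makes the comparison independent of the direction in which each segment happens to be stored.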

Once all the elements are recognized, pixel extraction is straightforward: each element is isolated in a region of interest (ROI) in which all the stroke pixels belong to that figure, so the only operation required is to store the positions of those pixels with respect to the global image.
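A minimal sketch of this extraction step is shown below, assuming the ROI is an axis-aligned box `(x, y, w, h)` that lies fully inside the image and that the binary image uses 1 for stroke pixels. The function name `extract_pixels` is hypothetical.

```python
def extract_pixels(binary, roi):
    """Return global (row, col) positions of the stroke pixels inside roi."""
    x, y, w, h = roi
    return [(r, c)
            for r in range(y, y + h)
            for c in range(x, x + w)
            if binary[r][c]]


canvas = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
]
# ROI covering only the element in the upper-left part of the image.
print(extract_pixels(canvas, roi=(1, 1, 2, 2)))
```

Because the positions are expressed in the coordinates of the whole image, each element can later be redrawn or moved independently.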

2.1 Stroke Dilations: Structuring Element Size

Stroke dilations are essential for the algorithm to work correctly. Up to four dilations are applied, and their sizes are the only mandatory parameters to adjust. The first is applied before finding the MBRs in order to close possible openings in figures, as explained above. The other three are intended to improve the detection of circumferences, rectangles and fixed joints, respectively, since hand-made drawings can present imperfections that must be removed.

To obtain the stroke dilations, rectangular structuring elements in the form of square matrices are applied. The larger these matrices are, the wider the dilated stroke will be.
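The effect of the structuring element size can be illustrated with a small pure-Python dilation, sketched below. It assumes a square structuring element of ones anchored at its centre; the function name `dilate` is chosen for this example. With OpenCV, the equivalent operation would use `cv2.getStructuringElement(cv2.MORPH_RECT, (size, size))` together with `cv2.dilate`.

```python
def dilate(binary, size):
    """Dilate a binary image with a size x size square structuring element."""
    rows, cols = len(binary), len(binary[0])
    half = size // 2
    offsets = range(-half, size - half)  # e.g. -1, 0, 1 for size 3
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # A pixel is set if any pixel under the structuring element is set.
            out[r][c] = int(any(
                binary[r + dr][c + dc]
                for dr in offsets for dc in offsets
                if 0 <= r + dr < rows and 0 <= c + dc < cols
            ))
    return out


def stroke_area(binary):
    """Number of stroke pixels in a binary image."""
    return sum(map(sum, binary))


# A single stroke pixel in the middle of a 7 x 7 image.
dot = [[0] * 7 for _ in range(7)]
dot[3][3] = 1
for size in (1, 3, 5):
    print(size, stroke_area(dilate(dot, size)))

# A one-pixel gap in a horizontal stroke is closed by a 3 x 3 element.
broken = [[0, 0, 0, 0, 0],
          [1, 1, 0, 1, 1],
          [0, 0, 0, 0, 0]]
print(dilate(broken, 3)[1])
```

The single pixel grows into a square whose side equals the structuring element size, which is why larger values close wider gaps but also thicken every stroke.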

3 Results

The results were obtained from a series of tests carried out on a set of drawings made both on a digital whiteboard and on paper. These drawings were produced by different researchers, using different colors and strokes. A series of examples and their corresponding results are shown below (Fig. 4), together with a table of the structuring element sizes used in each case (Table 1).

It is worth noting that, although any positive integer could be chosen as the structuring element size, no value greater than 9 was needed. Likewise, even though the values in the table vary considerably, the calibration process is very intuitive once the execution order of the algorithm is known. First, if the figures in the original image are completely closed, the first dilation can be small. Then, if the detection of circumferences fails, a higher value of the circumference dilation is needed. The same process is then followed for the detection of rectangles and fixed joints.

Fig. 4. Examples of hand-made kinematic diagrams (A) and their identification (B).

Table 1. Structuring element sizes used for diagram detection in Fig. 4

3.1 Algorithm Limitations

Due to the simplicity and novelty of the algorithm, it has some limitations that future research will try to address.

  • Mechanical symbols not covered in this article cannot be used. The algorithm could crash or produce wrong results.

  • Bars that cross without being connected cannot be used. In this case, the algorithm correctly detects all the elements of the mechanism, but it fails to establish the connections between them, interpreting the crossing bars as if they were linked by a rigid joint.

  • Images with more than 1.5 million pixels should be avoided. Some errors were found during the experiments when pictures were too large, because the size filter applied to symbols, which is based on the image dimensions and the stroke width, was not designed for this type of image.

4 Conclusions

This work has shown that the recognition of freehand kinematic diagrams, both from digital devices and from paper drawings, is possible even in real time. The proposed algorithm is fast and simple, and could be run on any electronic device, such as a tablet or a computer. Future research will seek to improve the robustness and reliability of the algorithm while incorporating new, more complex symbology. Finally, although only previously drawn kinematic schemes were considered in this article, an algorithm that identifies each element as it is drawn on a digital whiteboard is being developed in parallel work. This will allow the correct identification of more difficult geometries and kinematic diagrams, the final goal being to merge both algorithms into a single one and thereby create software with much greater potential.