Classification of the Structure of Square Hmong Characters and Analysis of Its Statistical Properties
Analyzing the structural characteristics of characters lays an information foundation for the intelligent processing of square Hmong characters. Building on such an analysis, this paper defines the linearization of square Hmong characters and the division of their structures into equivalence classes, and proposes a decision algorithm for determining the structure equivalence class. Using this algorithm, the structures of square Hmong characters are divided into eight equivalence classes. Analysis of the statistical properties of square Hmong characters appearing in practical documents, including the cumulative probability distribution, complexity, and information entropy, shows the following. First, more than 90% of square Hmong characters appearing in practical documents are composed of two components, and more than 80% of these characters have a left-right, top-bottom, or lower-left-enclosed structure. Second, the mean number of components in a square Hmong character is slightly greater than 2. Third, the information entropy of the structure of square Hmong characters lies within the interval (1.19, 2.16). These results indicate that the square Hmong characters appearing frequently in practical documents follow the principle of simple structure orientation.
Keywords: Information entropy; Probability distribution; Square Hmong character; Statistical analysis
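To make the statistics named in the abstract concrete, the following minimal Python sketch computes the Shannon entropy of a distribution over structure classes and the mean number of components per character. It is an illustration only: the function names and all counts are hypothetical and are not taken from the paper's corpus, and the five structure classes not named in the abstract are pooled into a single "other" bucket for brevity.

```python
import math


def structure_entropy(class_counts):
    """Shannon entropy in bits of a distribution over structure classes."""
    total = sum(class_counts.values())
    return -sum(
        (n / total) * math.log2(n / total)
        for n in class_counts.values()
        if n > 0  # classes with zero occurrences contribute nothing
    )


def mean_components(component_counts):
    """Mean number of components per character occurrence."""
    total = sum(component_counts.values())
    return sum(k * n for k, n in component_counts.items()) / total


# Hypothetical occurrence counts per structure class (NOT the paper's data).
structure_counts = {
    "left-right": 40,
    "top-bottom": 30,
    "lower-left-enclosed": 15,
    "other": 15,  # assumption: remaining classes pooled together
}

# Hypothetical occurrence counts keyed by number of components (NOT the paper's data).
component_counts = {2: 92, 3: 7, 4: 1}

entropy_bits = structure_entropy(structure_counts)
avg_components = mean_components(component_counts)
print(f"structure entropy: {entropy_bits:.2f} bits")
print(f"mean components per character: {avg_components:.2f}")
```

For reference, a uniform distribution over eight classes has an entropy of log2(8) = 3 bits, so entropy values well below 3 correspond to usage concentrated in a few structure classes.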
This work is supported by the National Natural Science Foundation of China (Nos. 61462029 and 61741205).