Abstract
Names are used to identify persons and objects. Measuring similarity between texts is useful for retrieving names despite misspellings and spelling variants. Moreover, similarity between the names of health products may lead to safety issues for consumers. Although the Levenshtein algorithm has been used to measure the similarity between a pair of strings, other factors may affect how humans perceive that similarity. In this paper, the effects of substring position and character similarity are taken into account. A set of experiments was conducted using Thai herb names collected from a Thai herbal database. Similarity scores, expressed as percentages, were given by six evaluators and compared with the values produced by the original and modified Levenshtein algorithms. The results show that both factors affect human perception. For substring position, evaluators focused on the substrings shared by a pair of strings: when matching substrings occupy the same positions in both strings, a higher similarity score should be given. For character similarity, groups of similar Thai consonants are assigned a weight between 0 and 1 based on the structure of the characters. Human perception responds to the similarity between a pair of characters, and the average similarity scores from the evaluators were closer to those of our proposed Levenshtein algorithm with character similarity. In conclusion, similarities calculated with the original Levenshtein algorithm should be adjusted based on substring position and character similarity.
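The character-similarity idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the consonant groups, the weight of 0.5, the rule that a substitution costs one minus the similarity weight, and the normalization by the longer string's length are all assumptions made for illustration. The names `SIMILAR_GROUPS`, `CHAR_SIM`, `levenshtein`, and `similarity_percent` are hypothetical. The sketch also compares strings code point by code point, so Thai vowels and tone marks count as separate characters.

```python
from itertools import combinations

# Hypothetical groups of look-alike Thai consonants, each with an assumed
# similarity weight between 0 and 1. The paper's actual groups and weights
# are not given in the abstract and may differ.
SIMILAR_GROUPS = [
    ({"ก", "ถ", "ภ"}, 0.5),
    ({"ข", "ฃ"}, 0.5),
    ({"ด", "ค", "ต"}, 0.5),
]


def build_similarity(groups):
    """Expand the groups into a symmetric (char, char) -> weight table."""
    table = {}
    for chars, weight in groups:
        for a, b in combinations(sorted(chars), 2):
            table[(a, b)] = weight
            table[(b, a)] = weight
    return table


CHAR_SIM = build_similarity(SIMILAR_GROUPS)


def substitution_cost(a, b, char_sim=None):
    """Return 0 for identical characters, 1 for unrelated ones, and
    1 - weight for characters in the same similarity group (assumed rule)."""
    if a == b:
        return 0.0
    if char_sim is None:
        return 1.0
    return 1.0 - char_sim.get((a, b), 0.0)


def levenshtein(s, t, char_sim=None):
    """Edit distance by dynamic programming, keeping two rows in memory.
    With char_sim=None this is the original Levenshtein distance."""
    prev = [float(j) for j in range(len(t) + 1)]
    for i, a in enumerate(s, start=1):
        curr = [float(i)]
        for j, b in enumerate(t, start=1):
            curr.append(min(
                prev[j] + 1.0,                                   # deletion
                curr[j - 1] + 1.0,                               # insertion
                prev[j - 1] + substitution_cost(a, b, char_sim)  # substitution
            ))
        prev = curr
    return prev[-1]


def similarity_percent(s, t, char_sim=None):
    """Similarity as a percentage, normalized by the longer string
    (an assumed normalization)."""
    longest = max(len(s), len(t))
    if longest == 0:
        return 100.0
    return 100.0 * (1.0 - levenshtein(s, t, char_sim) / longest)
```

Under these assumptions, `similarity_percent("กา", "ถา")` gives 50.0 with the original costs and 75.0 when `CHAR_SIM` is passed, because substituting one look-alike consonant for another costs 0.5 instead of 1.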
Acknowledgments
This work was partially supported by the Research and Creative Fund, Faculty of Pharmacy, Silpakorn University.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Lertnattee, V., Paluekpet, T. (2019). Effects of Substring Position and Character Similarity on Human Perception of Thai Herb Name Similarity. In: Othman, M., Abd Aziz, M., Md Saat, M., Misran, M. (eds) Proceedings of the 3rd International Symposium of Information and Internet Technology (SYMINTECH 2018). SYMINTECH 2018. Lecture Notes in Electrical Engineering, vol 565. Springer, Cham. https://doi.org/10.1007/978-3-030-20717-5_9
DOI: https://doi.org/10.1007/978-3-030-20717-5_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20716-8
Online ISBN: 978-3-030-20717-5
eBook Packages: Intelligent Technologies and Robotics; Intelligent Technologies and Robotics (R0)