Abstract
This work investigates the use of GPGPU technology for natural language processing. Natural language processing involves analysing very large volumes of data with sophisticated algorithms, which demands significant computing power. Parallel computing that exploits the processing capacity of graphics cards can help meet this demand. The work addresses the problem of building n-gram models of natural language from a given text. Two algorithms were developed: a sequential one for a typical CPU and a parallel one that uses the capacity of a GPU. The GPU algorithm was implemented using Nvidia CUDA technology. Experiments compared the effectiveness of the two algorithms depending on the size of the analysed text and the number of words in the n-grams. The results showed that the parallel algorithm is better suited to a GPU environment.
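To make the task concrete, the following is a minimal sketch of sequential n-gram counting over a text using a sliding window of n words. It is an illustration only and not the algorithm developed in the paper: the function names `count_ngrams` and `ngram_frequencies`, the whitespace tokenisation, and the lowercasing are all assumptions.

```python
from collections import Counter


def count_ngrams(text, n):
    """Count word n-grams in `text` with a sliding window (sequential sketch).

    Tokenisation by whitespace and lowercasing are simplifying assumptions.
    """
    words = text.lower().split()
    # Every window of n consecutive words is one n-gram; a text shorter
    # than n words yields no n-grams.
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def ngram_frequencies(text, n):
    """Relative frequency of each n-gram among all n-grams in the text."""
    counts = count_ngrams(text, n)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: c / total for gram, c in counts.items()}


bigrams = count_ngrams("the cat sat on the mat the cat ran", 2)
frequencies = ngram_frequencies("the cat sat on the mat the cat ran", 2)
```

Each window position is independent of the others, which is what makes the counting step a natural candidate for parallel execution. How the paper's CUDA algorithm actually distributes this work is not described in the abstract.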
Acknowledgments
This work was financed by the Ministry of Science and Higher Education in Poland (research project no. N N516 499139).
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Banasiak, D. (2016). Statistical Methods of Natural Language Processing on GPU. In: Gruca, A., Brachman, A., Kozielski, S., Czachórski, T. (eds) Man–Machine Interactions 4. Advances in Intelligent Systems and Computing, vol 391. Springer, Cham. https://doi.org/10.1007/978-3-319-23437-3_51
DOI: https://doi.org/10.1007/978-3-319-23437-3_51
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23436-6
Online ISBN: 978-3-319-23437-3
eBook Packages: Engineering, Engineering (R0)