IPC Multi-label Classification Applying the Characteristics of Patent Documents

Lim, Sora; Kwon, YongJin

doi:10.1007/978-981-10-3023-9_27

Conference paper
First Online: 23 November 2016

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 421)

Abstract

Most research on automatic International Patent Classification (IPC) systems has focused on applying existing machine learning methods to patent documents rather than considering the characteristics of the data or the structure of the documents. This paper therefore proposes using two structural fields, the technical field and the background field, which are selected by considering the characteristics of patent documents and the role of each structural field. A multi-label classification model is also constructed to reflect that a patent document can have multiple IPCs and to classify patent documents at the IPC subclass level, which comprises 630 categories. The effects of the structural fields are examined using 564,793 patents registered in Korea. A precision of 87.2% is obtained when mainly these two fields are used. This result verifies that the technical field and the background field play an important role in improving the precision of IPC multi-label classification at the IPC subclass level.
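The idea in the abstract can be illustrated with a minimal sketch: keep only the technical field and background field of each patent, train one yes/no model per IPC subclass so that a document can receive several labels, and measure precision over the predicted label sets. Everything specific here is an assumption made for illustration and is not taken from the paper: the record keys (`technical_field`, `background`, `claims`), the helper names, the whitespace tokenizer, the choice of a binary-relevance multinomial Naive Bayes with Laplace smoothing, the micro-averaged precision, and the toy English data (the paper uses Korean patents).

```python
import math
from collections import Counter

# Hypothetical record keys; real patent data may name these fields differently.
SELECTED_FIELDS = ("technical_field", "background")

def build_input(patent, fields=SELECTED_FIELDS):
    """Join only the selected structural fields of one patent record."""
    return " ".join(patent.get(f, "") for f in fields)

def tokenize(text):
    # Whitespace split is a stand-in; Korean text would need a morphological analyzer.
    return text.lower().split()

class BinaryRelevanceNB:
    """One multinomial Naive Bayes yes/no model per label (assumed classifier)."""

    def fit(self, texts, label_sets):
        self.labels = sorted(set().union(*label_sets))
        self.vocab = {tok for t in texts for tok in tokenize(t)}
        self.n_docs = len(texts)
        # For each label, token counts and document counts for positive/negative docs.
        self.counts = {lab: {True: Counter(), False: Counter()} for lab in self.labels}
        self.doc_counts = {lab: {True: 0, False: 0} for lab in self.labels}
        for text, labs in zip(texts, label_sets):
            toks = tokenize(text)
            for lab in self.labels:
                side = lab in labs
                self.counts[lab][side].update(toks)
                self.doc_counts[lab][side] += 1
        return self

    def _log_score(self, lab, side, toks):
        counts = self.counts[lab][side]
        total = sum(counts.values()) + len(self.vocab)  # Laplace smoothing
        prior = (self.doc_counts[lab][side] + 1) / (self.n_docs + 2)
        return math.log(prior) + sum(
            math.log((counts[t] + 1) / total) for t in toks if t in self.vocab
        )

    def predict(self, text):
        """Return every label whose positive score beats its negative score."""
        toks = tokenize(text)
        return {lab for lab in self.labels
                if self._log_score(lab, True, toks) > self._log_score(lab, False, toks)}

def micro_precision(predicted, actual):
    """Correctly predicted labels divided by all predicted labels."""
    true_pos = sum(len(p & a) for p, a in zip(predicted, actual))
    n_pred = sum(len(p) for p in predicted)
    return true_pos / n_pred if n_pred else 0.0

# Toy training data: (patent record, set of IPC subclasses).
train = [
    ({"technical_field": "wireless antenna signal",
      "background": "antenna radio transmission",
      "claims": "ignored text"}, {"H04B", "H01Q"}),
    ({"technical_field": "battery electrode cell",
      "background": "lithium battery charge"}, {"H01M"}),
    ({"technical_field": "antenna array design",
      "background": "radiating antenna element"}, {"H01Q"}),
    ({"technical_field": "wireless signal transmission",
      "background": "radio signal receiver"}, {"H04B"}),
]
model = BinaryRelevanceNB().fit([build_input(p) for p, _ in train],
                                [labs for _, labs in train])

test_patents = [
    {"technical_field": "lithium battery", "background": "electrode cell"},
    {"technical_field": "antenna array", "background": "radio signal"},
]
actual = [{"H01M"}, {"H01Q"}]
predicted = [model.predict(build_input(p)) for p in test_patents]
print(predicted, micro_precision(predicted, actual))
```

In this toy run the second patent receives two subclasses, only one of which is in its gold label set, so two of the three predicted labels are correct. Training one independent model per label is only one way to obtain multi-label output; it is used here because it keeps the sketch short.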

Acknowledgments

This research was supported by Gyeonggi Province’s GRRC Program [(GRRC-B01), Development of Ambient Mobile Broadcasting Service System].

Author information

Authors and Affiliations

Korea Aerospace University, Goyang, Korea
Sora Lim & YongJin Kwon

Corresponding author

Correspondence to Sora Lim.

Editor information

Editors and Affiliations

Computer Science and Engineering, Seoul National University of Science and Technology, Seoul, Korea (Republic of)
James J. (Jong Hyuk) Park
Department of Computer Science, Georgia State University, Atlanta, Georgia, USA
Yi Pan
Computer Science and Engineering, Gangneung-Wonju National University, Wonju, Korea (Republic of)
Gangman Yi
University of Salerno, Fisciano, Italy
Vincenzo Loia

About this paper

Cite this paper

Lim, S., Kwon, Y. (2017). IPC Multi-label Classification Applying the Characteristics of Patent Documents. In: Park, J., Pan, Y., Yi, G., Loia, V. (eds) Advances in Computer Science and Ubiquitous Computing. UCAWSN 2016, CUTE 2016, CSA 2016. Lecture Notes in Electrical Engineering, vol 421. Springer, Singapore. https://doi.org/10.1007/978-981-10-3023-9_27

DOI: https://doi.org/10.1007/978-981-10-3023-9_27
Published: 23 November 2016
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-3022-2
Online ISBN: 978-981-10-3023-9
eBook Packages: Engineering, Engineering (R0)
