IPC Multi-label Classification Applying the Characteristics of Patent Documents

  • Conference paper

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 421)

Abstract

Most research on automatic IPC classification has focused on applying existing machine learning methods to patent documents, without considering the characteristics of the data or the structure of the documents. This paper therefore proposes using two structural fields, the technical field and the background field, which are selected based on the characteristics of patent documents and the role each structural field plays. A multi-label classification model is also constructed, both to reflect that a patent document can have multiple IPCs and to classify patent documents at the IPC subclass level, which comprises 630 categories. The effects of the structural fields are examined using 564,793 patents registered in Korea. A precision of 87.2% is obtained when mainly these two fields are used. This result verifies that the technical field and the background field play an important role in improving the precision of IPC multi-label classification at the IPC subclass level.
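The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the classifier, so the sketch assumes a binary-relevance setup with one multinomial Naive Bayes decision per IPC subclass. The field names `technical_field` and `background`, the toy English documents, and the IPC labels attached to them are all hypothetical; the study itself uses Korean patents and 630 subclasses.

```python
import math
from collections import Counter


def field_text(patent, fields=("technical_field", "background")):
    """Tokenize only the selected structural fields (field names are assumed)."""
    return " ".join(patent.get(f, "") for f in fields).lower().split()


class BinaryRelevanceNB:
    """One multinomial Naive Bayes yes/no decision per label, Laplace-smoothed."""

    def fit(self, docs, label_sets):
        self.labels = sorted(set().union(*label_sets))
        self.vocab = {t for d in docs for t in d}
        self.n_docs = len(docs)
        # counts[label][is_positive] holds token frequencies for that side
        self.counts = {l: {True: Counter(), False: Counter()} for l in self.labels}
        self.doc_counts = {l: {True: 0, False: 0} for l in self.labels}
        for tokens, labels in zip(docs, label_sets):
            for l in self.labels:
                side = l in labels
                self.counts[l][side].update(tokens)
                self.doc_counts[l][side] += 1
        return self

    def _log_score(self, tokens, label, side):
        counter = self.counts[label][side]
        total = sum(counter.values()) + len(self.vocab)
        prior = (self.doc_counts[label][side] + 1) / (self.n_docs + 2)
        score = math.log(prior)
        for t in tokens:
            if t in self.vocab:  # tokens unseen in training are ignored
                score += math.log((counter[t] + 1) / total)
        return score

    def predict(self, tokens):
        """Return every label whose 'yes' score beats its 'no' score."""
        return {l for l in self.labels
                if self._log_score(tokens, l, True) > self._log_score(tokens, l, False)}


def micro_precision(predicted, actual):
    """Correctly predicted labels divided by all predicted labels."""
    tp = sum(len(p & a) for p, a in zip(predicted, actual))
    n_pred = sum(len(p) for p in predicted)
    return tp / n_pred if n_pred else 0.0


# Toy training set; the third document carries two IPC subclasses.
train = [
    {"technical_field": "wireless antenna signal",
     "background": "mobile antenna radio", "ipc": {"H01Q"}},
    {"technical_field": "battery electrode cell",
     "background": "lithium battery charge", "ipc": {"H01M"}},
    {"technical_field": "antenna powered by battery",
     "background": "radio signal battery cell", "ipc": {"H01Q", "H01M"}},
    {"technical_field": "engine piston cylinder",
     "background": "combustion engine fuel", "ipc": {"F02B"}},
]
model = BinaryRelevanceNB().fit([field_text(p) for p in train],
                                [p["ipc"] for p in train])

# The "claims" field is present but never read by field_text.
query = {"technical_field": "lithium battery electrode",
         "background": "battery charge cell", "claims": "ignored text"}
predicted = model.predict(field_text(query))
print(sorted(predicted))  # ['H01M']
```

Because each label is decided independently, a document can receive zero, one, or several subclasses, which matches the multi-label setting described above. Precision here is micro-averaged over predicted labels; the abstract does not say which averaging the reported 87.2% uses.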

Acknowledgments

This research was supported by Gyeonggi Province’s GRRC Program [(GRRC-B01), Development of Ambient Mobile Broadcasting Service System].

Author information

Corresponding author

Correspondence to Sora Lim.

Copyright information

© 2017 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Lim, S., Kwon, Y. (2017). IPC Multi-label Classification Applying the Characteristics of Patent Documents. In: Park, J., Pan, Y., Yi, G., Loia, V. (eds) Advances in Computer Science and Ubiquitous Computing. UCAWSN 2016, CUTE 2016, CSA 2016. Lecture Notes in Electrical Engineering, vol 421. Springer, Singapore. https://doi.org/10.1007/978-981-10-3023-9_27

  • DOI: https://doi.org/10.1007/978-981-10-3023-9_27

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-3022-2

  • Online ISBN: 978-981-10-3023-9

  • eBook Packages: Engineering, Engineering (R0)
