Professional Search in the Modern World

Volume 8830 of the series Lecture Notes in Computer Science pp 250-273

Enhancing Patent Search with Content-Based Image Retrieval

  • Stefanos Vrochidis (Centre for Research & Technology Hellas - Information Technologies Institute)
  • Anastasia Moumtzidou (Centre for Research & Technology Hellas - Information Technologies Institute)
  • Ioannis Kompatsiaris (Centre for Research & Technology Hellas - Information Technologies Institute)

Abstract

Most patent search systems still rely on text to provide retrieval functionality. Recently, the intellectual property and information retrieval communities have shown great interest in patent image retrieval, which could augment current patent search practice. In this chapter, we present a framework for patent image extraction and retrieval. It extracts images from patents and generates multimodal (textual and visual) metadata from them, with a view to providing content-based search and concept-based retrieval. Patent image extraction builds on page orientation detection and segmentation, while metadata extraction from images is based on the generation of low-level visual and textual features. Content-based retrieval relies on low-level visual features devised to deal with complex black-and-white drawings. Concept extraction builds on a supervised machine learning framework realised with Support Vector Machines and a combination of visual and textual features. We evaluate the different retrieval parts of the framework using a dataset from the footwear and lithography domains.

Keywords

patents, images, retrieval, concepts, classification, hybrid, visual
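
The content-based retrieval step mentioned in the abstract can be pictured as a nearest-neighbour search over per-image feature vectors. The following is only a minimal sketch under stated assumptions: the abstract does not specify the chapter's visual features or its similarity measure, so the four-dimensional vectors, the image identifiers, the use of cosine similarity, and the function names `cosine_similarity` and `rank_by_similarity` are all hypothetical placeholders chosen for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, index, top_k=3):
    """Return the top_k (image_id, score) pairs, most similar first."""
    scored = [(image_id, cosine_similarity(query, vec))
              for image_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical feature vectors standing in for low-level visual features
# extracted from patent drawings (values invented for illustration only).
index = {
    "shoe_sole_01": [0.9, 0.1, 0.0, 0.2],
    "shoe_heel_02": [0.8, 0.2, 0.1, 0.1],
    "litho_mask_03": [0.0, 0.1, 0.9, 0.7],
}

query = [0.85, 0.15, 0.05, 0.15]  # features of the query drawing
results = rank_by_similarity(query, index, top_k=2)
for image_id, score in results:
    print(image_id, round(score, 3))
```

In a real system the index would hold the features actually computed from the extracted patent images, and the similarity measure would be whichever one suits those features.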