Visual Descriptor Extraction From Patent Figure Captions: A Case Study of Data Efficiency Between BiLSTM and Transformer

Xin Wei, Old Dominion University
Jian Wu, Old Dominion University
Kehinde Ajayi, Old Dominion University
Diane Oyen

Document Type

Conference Paper

Publication Date

2022

DOI

10.1145/3529372.3533299

Publication Title

JCDL '22: Proceedings of the 22nd ACM/IEEE Joint Conference on Digital Libraries

Pages

20 (1-5)

Conference Name

JCDL '22: The ACM/IEEE Joint Conference on Digital Libraries in 2022, June 20-24, 2022, Cologne, Germany

Abstract

Technical drawings used for illustrating designs are ubiquitous in patent documents, especially design patents. Unlike natural images, these drawings are usually made with black strokes and carry little color information, which makes it challenging for models trained on natural images to recognize the objects they depict. To facilitate indexing and searching, we propose an effective and efficient visual descriptor model that extracts object names and aspects from patent captions to annotate benchmark patent figure datasets. We compared two state-of-the-art named entity recognition (NER) models and found that, with a limited number of annotated samples, the BiLSTM-CRF model outperforms the Transformer model by a significant margin, achieving an overall F1 of 96.60%. We further conducted a data efficiency study by varying the number of training samples and found that BiLSTM consistently outperforms the Transformer model on our task. The proposed model is used to annotate a benchmark patent figure dataset.
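As a rough illustration of the task the abstract describes, the sketch below reads object and aspect spans from a BIO-tagged caption and scores predictions with exact-match span F1. It is a minimal sketch under stated assumptions, not the authors' implementation: the label names `OBJECT` and `ASPECT`, the sample caption, the BIO scheme, and the scoring function are all illustrative choices that this record does not confirm.

```python
from typing import List, Set, Tuple

# (label, start token index, end token index exclusive)
Span = Tuple[str, int, int]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect labeled spans from a BIO tag sequence.

    Stray I- tags that do not continue an open span are ignored.
    """
    spans: Set[Span] = set()
    label, start = None, 0
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        continues = tag.startswith("I-") and label == tag[2:]
        if label is not None and not continues:
            spans.add((label, start, i))
            label = None
        if tag.startswith("B-"):
            label, start = tag[2:], i
    return spans


def span_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Exact-match span F1 over a list of tagged captions."""
    gold_spans, pred_spans = set(), set()
    for idx, (g, p) in enumerate(zip(gold, pred)):
        gold_spans |= {(idx, *s) for s in bio_to_spans(g)}
        pred_spans |= {(idx, *s) for s in bio_to_spans(p)}
    true_pos = len(gold_spans & pred_spans)
    precision = true_pos / len(pred_spans) if pred_spans else 0.0
    recall = true_pos / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Hypothetical caption and labels, invented for illustration only.
tokens = ["FIG.", "1", "is", "a", "perspective", "view", "of", "a", "coffee", "maker"]
gold_tags = ["O", "O", "O", "O", "B-ASPECT", "I-ASPECT", "O", "O", "B-OBJECT", "I-OBJECT"]
# A prediction that finds the aspect but truncates the object name.
pred_tags = ["O", "O", "O", "O", "B-ASPECT", "I-ASPECT", "O", "O", "O", "B-OBJECT"]

for label, start, end in sorted(bio_to_spans(gold_tags)):
    print(label, " ".join(tokens[start:end]))
print(span_f1([gold_tags], [pred_tags]))
```

Exact-match scoring counts the truncated object name as both a missed span and a spurious one, so boundary errors are penalized twice; a partial-match variant would be a different design choice.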

Comments

This work is licensed under a Creative Commons Attribution International 4.0 License (CC BY 4.0).

Original Publication Citation

Wei, X., Wu, J., Ajayi, K., & Oyen, D. (2022). Visual descriptor extraction from patent figure captions: A case study of data efficiency between BiLSTM and transformer. JCDL '22: Proceedings of the 22nd ACM/IEEE Joint Conference on Digital Libraries (20, pp. 1-5). Association for Computing Machinery. https://doi.org/10.1145/3529372.3533299

ORCID

0000-0003-0173-4463 (Wu), 0000-0002-5124-0739 (Ajayi)

ODU Digital Commons

Computer Science Faculty Publications
