Document Type

Conference Paper

Publication Date

2023

Publication Title

CEUR Workshop Proceedings: EEKE-AII2023: Extraction and Evaluation of Knowledge Entities from Scientific Documents (EEKE2023) and AI + Informetrics (AII2023): Proceedings of Joint Workshop of the 4th Extraction and Evaluation of Knowledge Entities from Scientific Documents (EEKE2023) and the 3rd AI + Informetrics (AII2023) co-located with the JCDL 2023

Volume

3451

Pages

65-77

Conference Name

Joint Workshop of the 4th Extraction and Evaluation of Knowledge Entities from Scientific Documents and the 3rd AI + Informetrics (EEKE-AII2023), June 26, 2023, Santa Fe, New Mexico

Abstract

The growth of scientific papers in the past decades calls for effective claim extraction tools to automatically and accurately locate key claims in unstructured text. Such claims will benefit content-wise aggregated exploration of scientific knowledge beyond the metadata level. One challenge in building such a model is how to effectively use limited labeled training data. In this paper, we compared transfer learning and contrastive learning frameworks in terms of performance, time, and training data size. We found that contrastive learning achieves better performance with less training data across all models. Our contrastive-learning-based model ClaimDistiller has the highest performance, boosting the F1 score of the base models by 3–4% and achieving an F1=87.45%, improving the state of the art by more than 7% on the same benchmark data previously used for this task. The same phenomenon is observed on another benchmark dataset, on which ClaimDistiller consistently has the best performance. Qualitative assessment of a small sample of out-of-domain data indicates that the model generalizes well. Our source code and datasets can be found here: https://github.com/lamps-lab/sci-claim-distiller.

Comments

Link to proceedings landing page: https://ceur-ws.org/Vol-3451/

Rights

© 2023 the authors.

Use permitted under a Creative Commons Attribution 4.0 International (CC BY 4.0) License.

Data Availability

Article states: Our source codes and datasets can be found here: https://github.com/lamps-lab/sci-claim-distiller.

Original Publication Citation

Wei, X., Hoque, M. R. U., Wu, J., & Li, J. (2023). ClaimDistiller: Scientific claim extraction with supervised contrastive learning. CEUR Workshop Proceedings: EEKE-AII2023: Extraction and Evaluation of Knowledge Entities from Scientific Documents (EEKE2023) and AI + Informetrics (AII2023): Proceedings of Joint Workshop of the 4th Extraction and Evaluation of Knowledge Entities from Scientific Documents (EEKE2023) and the 3rd AI + Informetrics (AII2023) co-located with the JCDL 2023, 3451, 65-77. https://ceur-ws.org/Vol-3451/paper11.pdf

ORCID

0000-0003-4055-2582 (Hoque), 0000-0003-0173-4463 (Wu), 0000-0003-0091-6986 (Li)
