Document Type
Conference Paper
Publication Date
2020
DOI
10.18653/v1/2020.sdp-1.3
Publication Title
Proceedings of the First Workshop on Scholarly Document Processing
Pages
10-19
Conference Name
First Workshop on Scholarly Document Processing, November 19, 2020, Online
Abstract
Acknowledgements are ubiquitous in scholarly papers. Existing acknowledgement entity recognition methods assume all named entities are acknowledged. Here, we examine the nuances between acknowledged and named entities by analyzing sentence structure. We develop an acknowledgement extraction system, AckExtract, based on open-source text mining software, and evaluate our method using manually labeled data. AckExtract uses the PDF of a scholarly paper as input and outputs acknowledgement entities. Results show an overall performance of F1=0.92. We built a supplementary database by linking CORD-19 papers with acknowledgement entities extracted by AckExtract, including persons and organizations, and find that only up to 50–60% of named entities are actually acknowledged. We further analyze chronological trends of acknowledgement entities in CORD-19 papers. All code and labeled data are publicly available at https://github.com/lamps-lab/ackextract.
Rights
© 2020 Association for Computational Linguistics
Materials published in or after 2016 are licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) License.
Original Publication Citation
Wu, J., Wang, P., Wei, X., Rajtmajer, S., Giles, C. L., & Griffin, C. (2020). Acknowledgement entity recognition in CORD-19 papers. In Proceedings of the First Workshop on Scholarly Document Processing (pp. 10-19). Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.sdp-1.3
ORCID
0000-0003-0173-4463 (Wu)
Included in
Cataloging and Metadata Commons, Numerical Analysis and Scientific Computing Commons, Scholarly Publishing Commons