Document Type
Conference Paper
Publication Date
2021
Publication Title
CEUR Workshop Proceedings
Volume
2936
Pages
125-132
Conference Name
CLEF 2021 - Conference and Labs of the Evaluation Forum, September 21-24, 2021, Bucharest, Romania
Abstract
This paper describes our submission to the ARQMath track at CLEF 2021. This year, we use a collection of methods to retrieve and re-rank answers from Math Stack Exchange, in addition to our two-stage model, which was comparable to the best model last year in terms of NDCG′. We also provide a detailed analysis of what the transformers are learning and why it is hard to train a math language model using transformers. Our Task-1 submission this year includes summarizing long question-answer pairs to augment and index documents, using byte-pair encoding to tokenize formulas and then re-rank them, and extracting important keywords from posts. Using an ensemble of these methods, our approach shows a 20% improvement over our ARQMath’2020 Task-1 submission.
Original Publication Citation
Rohatgi, S., Wu, J., & Giles, C. L. (2021). Ranked list fusion and re-ranking with pre-trained transformers for ARQMath lab. CEUR Workshop Proceedings, 2936, 125-132. http://ceur-ws.org/Vol-2936/paper-08.pdf
Repository Citation
Rohatgi, S., Wu, J., & Giles, C. L. (2021). Ranked list fusion and re-ranking with pre-trained transformers for ARQMath lab. CEUR Workshop Proceedings, 2936, 125-132. http://ceur-ws.org/Vol-2936/paper-08.pdf
ORCID
0000-0003-0173-4463 (Wu)
Included in
Databases and Information Systems Commons, Mathematics Commons, Numerical Analysis and Scientific Computing Commons
Comments
© 2021 The Authors.
Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).