Document Type

Conference Paper

Publication Date




Publication Title

WWW '21: Companion Proceedings of the Web Conference 2021



Conference Name

International World Wide Web Conference 2021 (WWW 2021), 19-23 April 2021, Ljubljana, Slovenia


With substantial and continuing increases in the number of published papers across the scientific literature, development of reliable approaches for automated discovery and assessment of published findings is increasingly urgent. Tools which can extract critical information from scientific papers and metadata can support representation and reasoning over existing findings, and offer insights into replicability, robustness and generalizability of specific claims. In this work, we present a pipeline for the extraction of statistical information (p-values, sample size, number of hypotheses tested) from full-text scientific documents. We validate our approach on 300 papers selected from the social and behavioral science literatures, and suggest directions for next steps.
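To make the extraction task concrete, the following is a minimal sketch of how p-values and sample sizes might be pulled from plain text with regular expressions. It is an illustration only, not the pipeline described in the paper: the patterns `P_VALUE_RE` and `SAMPLE_SIZE_RE` and the functions `extract_p_values` and `extract_sample_sizes` are hypothetical names, and the sketch assumes statistics are reported in common inline forms such as "p < .05" or "N = 120". It does not attempt to count the number of hypotheses tested.

```python
import re

# Hypothetical pattern (an assumption, not the paper's method):
# matches inline reports such as "p < .05", "p = 0.032", "P<=0.001".
P_VALUE_RE = re.compile(r"\bp\s*(<=|>=|<|>|=)\s*(0?\.\d+)", re.IGNORECASE)

# Hypothetical pattern: matches sample-size reports such as "N = 120" or "n=1,204".
SAMPLE_SIZE_RE = re.compile(r"\bn\s*=\s*(\d{1,3}(?:,\d{3})+|\d+)", re.IGNORECASE)


def extract_p_values(text):
    """Return a (comparator, value) pair for each p-value report found in text."""
    return [(m.group(1), float(m.group(2))) for m in P_VALUE_RE.finditer(text)]


def extract_sample_sizes(text):
    """Return each reported sample size as an int, ignoring thousands separators."""
    return [int(m.group(1).replace(",", "")) for m in SAMPLE_SIZE_RE.finditer(text)]


if __name__ == "__main__":
    sentence = (
        "Participants (N = 120) showed an effect, p < .05; "
        "the control group did not (p = 0.32)."
    )
    print(extract_p_values(sentence))
    print(extract_sample_sizes(sentence))
```

Patterns like these miss statistics reported in tables, in scientific notation, or in prose ("significant at the five percent level"), which is one reason validation against a manually annotated set of papers matters.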


Published in WWW2021 Companion © 2021 International World Wide Web Conference Committee, published under Creative Commons Attribution 4.0 (CC BY 4.0) License.

Original Publication Citation

Lanka, S. S. T., Rajtmajer, S., Wu, J., & Giles, C. L. (2021). Extraction and evaluation of statistical information from social and behavioral science papers. In Leskovec, J., Grobelnik, M., Najork, M., Tang, J., & Zia, L. (Eds.), WWW '21: Companion Proceedings of the Web Conference 2021 (pp. 426-430). Association for Computing Machinery.


ORCID

0000-0003-0173-4463 (Wu)