Text and data mining scientific articles with allofplos
Elizabeth Seiver
M Pacer
Sebastian Bassi
Mining scientific articles is hard when many of them are inaccessible
behind paywalls. The Public Library of Science (PLOS) is a non-profit
Open Access science publisher whose journals include the single largest
journal (PLOS ONE), and all of its articles are freely available to read
and re-use.
allofplos is a Python package for maintaining a constantly growing
collection of PLOS's 230,000+ articles. It also efficiently
parses these article files into Python data structures. This article will
cover how allofplos keeps your articles up-to-date, and how to use it to
easily access common article metadata and fuel your meta-research, with
actual use cases from inside PLOS.
Text and data mining, metascience, open access, science publishing, scientific articles, XML
DOI: 10.25080/Majora-4af1f417-009