lpEdit: an editor to facilitate reproducible analysis via literate programming
Adam J Richards
Andrzej S. Kosinski
Camille Bonneaud
Delphine Legrand
Kouros Owzar
Abstract
There is evidence to suggest that a surprising proportion
of published experiments in science are difficult if not impossible to
reproduce. The concepts of data sharing, leaving an audit trail and
extensive documentation are fundamental to reproducible research,
whether it is in the laboratory or as part of an analysis. In this
work, we introduce a tool for documentation that aims to make analyses
more reproducible in the general scientific community.
The application, lpEdit, is a cross-platform editor, written with PyQt4,
that enables a broad range of scientists to carry out the analytic
component of their work in a reproducible manner through the use of
literate programming. Literate programming mixes code and prose to
produce a final report that reads like an article or book. lpEdit
targets researchers getting started with statistics or programming, so
the hurdles associated with setting up a proper pipeline are kept to a
minimum and the learning burden is reduced through the use of
templates and documentation. The documentation for lpEdit is centered
around learning by example, and accordingly we use several
increasingly involved examples to demonstrate the software’s
capabilities.
We first consider applications of lpEdit to process analyses mixing
R and Python code with the LaTeX documentation
system. Finally, we illustrate the use of lpEdit to conduct a
reproducible functional analysis of high-throughput sequencing
data, using the transcriptome of the butterfly species Pieris
brassicae.
Keywords: reproducible research, text editor, RNA-seq
DOI: 10.25080/Majora-8b375195-00e