26 February, 2024
README
at The Economic Journal
Minimum Requirement
There should be a separation along:
Example?
.
├── 20211107ext_2v1.do
├── 20220120ext_2v1.do
├── 20221101wave1.dta
├── james
│   └── NLSY97
│       └── nlsy97_v2.do
├── mary
│   └── NLSY97
│       └── nlsy97.do
├── matlab_fortran
│   ├── graphs
│   ├── sensitivity1
│   │   ├── data.xlsx
│   │   ├── good_version.do
│   │   └── script.m
│   └── sensitivity2
│       ├── models.f90
│       ├── models.mod
│       └── nrtype.f90
├── readme.do
├── scatter1.eps
├── scatter1_1.eps
├── scatter1_2.eps
├── ts.eps
├── wave1.dta
├── wave2.dta
├── wave2regs.dta
└── wave2regs2.dta
.
├── README.md
├── code
│   ├── R
│   │   ├── 0-install.R
│   │   ├── 1-main.R
│   │   ├── 2-figure2.R
│   │   └── 3-table2.R
│   ├── stata
│   │   ├── 1-main.do
│   │   ├── 2-read_raw.do
│   │   ├── 3-figure1.do
│   │   ├── 4-figure3.do
│   │   └── 5-table1.do
│   └── tex
│       ├── appendix.tex
│       └── main.tex
├── data
│   ├── processed
│   └── raw
└── output
    ├── plots
    └── tables
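A layout like the second one above can be set up before any code is written. The following is a minimal Python sketch, not a prescribed tool: the folder names are copied from that layout, while the helper name `create_skeleton` and the placeholder README text are assumptions made for illustration.

```python
from pathlib import Path

# folder names copied from the second layout above
FOLDERS = [
    "code/R", "code/stata", "code/tex",
    "data/raw", "data/processed",
    "output/plots", "output/tables",
]


def create_skeleton(root):
    """Create the empty folders and a placeholder README.md below `root`.

    `create_skeleton` is a hypothetical helper, not part of any library.
    """
    root = Path(root)
    for folder in FOLDERS:
        # exist_ok=True makes the call safe to repeat
        (root / folder).mkdir(parents=True, exist_ok=True)
    readme = root / "README.md"
    if not readme.exists():
        readme.write_text("# README\n")
    return root
```

Calling `create_skeleton("your_package")` would create the folders under `your_package` and leave any existing README.md untouched.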
README
Note
There is no unique best way to organize your project: Make it simple, intuitive and helpful.
Important
Ideally your entire project is under version control.
Question:
How to write reproducible code?
Huge question to answer. Let's try with a few simple things first:
No Manual Manipulation.
Do This!
In general, take all necessary steps to ensure cross-platform compatibility of your code.
File paths are such low-hanging fruit…
Ask the user to set the root of your project, via a global variable, an environment variable, or other means.
# in my R, I do
Sys.setenv(PACKAGE_ROOT="/Users/floswald/Downloads/your_package")
# your package uses:
file.path(Sys.getenv("PACKAGE_ROOT"), "data", "wages.csv")
* in my Stata, I do
global PACKAGE_ROOT "/Users/floswald/Downloads/your_package"
* your package uses
use "$PACKAGE_ROOT/data/wages.dta"
Always use forward slashes (/) in Stata, even on a Windows machine!
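The same pattern can be sketched in Python. This is only an illustration under stated assumptions: the environment variable name `PACKAGE_ROOT` follows the R example, while the helper name `project_path` is hypothetical and `wages.csv` is just the example file used above.

```python
import os
from pathlib import Path


def project_path(*parts: str) -> Path:
    """Build a path below the project root chosen by the user.

    Assumes the user has set the PACKAGE_ROOT environment variable;
    `project_path` is a hypothetical helper, not a library function.
    """
    root = os.environ.get("PACKAGE_ROOT")
    if root is None:
        raise RuntimeError("Please set PACKAGE_ROOT to the root of the package")
    # pathlib joins the parts with the separator of the current platform
    return Path(root).joinpath(*parts)


if __name__ == "__main__":
    # illustrative only: fall back to the current directory if unset
    os.environ.setdefault("PACKAGE_ROOT", str(Path.cwd()))
    print(project_path("data", "wages.csv"))
```

No absolute path is hard-coded in the package itself, so the only line a replicator has to change is the one that sets the root.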
No Guarantee
Your code will yield identical results on a different computer only if certain conditions apply.
Protected Environments
You should provide a mechanism which ensures that those conditions do apply.
At a minimum, you list your exact computing environment:
OS, software and which version used (R 4.1, Stata 17/MP, MATLAB 2023b, GNU Fortran (Homebrew GCC 13.2.0))
Libraries and which exact version used (ggplot2 1.3.4, outreg 2, numpy 1.26.4, boost 1.8.3)
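Such a listing does not have to be typed by hand. As a rough sketch for a Python-based package, the standard library can report the OS, the interpreter version and the installed library versions; the function name `describe_environment` and the package list passed to it are assumptions for illustration.

```python
import json
import platform
from importlib import metadata


def describe_environment(packages):
    """Return the OS, the Python version and the installed version of each package.

    `describe_environment` is a hypothetical helper; packages that are
    not installed on this machine are reported as None.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return {
        "os": platform.platform(),
        "python": platform.python_version(),
        "packages": versions,
    }


if __name__ == "__main__":
    # the package list is illustrative; list what your project imports
    print(json.dumps(describe_environment(["numpy"]), indent=2))
```

The printed JSON can be pasted into the README or saved next to the outputs of the run.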
Stata: install all libraries into your replication package.
Virtual Environments can help.
Julia's built-in Pkg manager:
(@v1.10) pkg> activate .
Activating new project at `~/CEPII`
(CEPII) pkg> add DataFrames GLM
# created 2 files in `~/CEPII`
# tracking all dependencies
Docker container. This provides a fully specified virtual machine (i.e. a dedicated computer for your project).
Stata: put a version xyz statement in the master script.
ssc install somelib will install an incompatible version a few years later.
Note
Such mechanisms can reduce version conflicts amongst your dependencies. To the extent that all versions of those dependencies are still available, this guarantees a stable computing environment.
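Short of a full virtual environment or container, a master script can at least check the installed versions against the ones the package was built with and warn on any difference. The sketch below is a lighter substitute, not an environment manager: the names `PINNED`, `installed_version` and `find_mismatches` are hypothetical, and the single pin reuses the numpy version quoted earlier.

```python
from importlib import metadata

# hypothetical pins; the numpy version is the one quoted earlier in the text
PINNED = {"numpy": "1.26.4"}


def installed_version(name):
    """Return the installed version of `name`, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose installed version differs from its pin
    to a (pinned, installed) pair."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(PINNED).items():
        print(f"{name}: expected {wanted}, found {found}")
```

The check only detects drift; restoring the pinned versions is still the job of the package manager or container.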
Output in Paper | Output in Package | Program to execute |
---|---|---|
Table 1 | outputs/tables/table1.tex | code/table1.do |
Figure 1 | outputs/plots/figure1.pdf | code/figure1.do |
Figure 2 | outputs/plots/figure2.pdf | code/figure2.do |
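A mapping like this can also be kept in machine-readable form, so that a replicator can check which outputs are still missing after a run. The following is a minimal sketch: the paths are copied from the table, while the names `OUTPUTS` and `missing_outputs` are assumptions for illustration.

```python
from pathlib import Path

# rows of the table above: output in paper -> (output in package, program)
OUTPUTS = {
    "Table 1": ("outputs/tables/table1.tex", "code/table1.do"),
    "Figure 1": ("outputs/plots/figure1.pdf", "code/figure1.do"),
    "Figure 2": ("outputs/plots/figure2.pdf", "code/figure2.do"),
}


def missing_outputs(root, outputs=OUTPUTS):
    """List the paper outputs whose file is absent below `root`."""
    root = Path(root)
    return [
        label
        for label, (rel_path, _program) in outputs.items()
        if not (root / rel_path).is_file()
    ]


if __name__ == "__main__":
    # report each missing output together with the program that creates it
    for label in missing_outputs("."):
        print(f"{label} is missing: run {OUTPUTS[label][1]}")
```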
outputs/
.