26 February, 2024
README File
README at The Economic Journal
Minimum Requirement
There should be a separation along:
Example?
.
├── 20211107ext_2v1.do
├── 20220120ext_2v1.do
├── 20221101wave1.dta
├── james
│   └── NLSY97
│       └── nlsy97_v2.do
├── mary
│   └── NLSY97
│       └── nlsy97.do
├── matlab_fortran
│   ├── graphs
│   ├── sensitivity1
│   │   ├── data.xlsx
│   │   ├── good_version.do
│   │   └── script.m
│   └── sensitivity2
│       ├── models.f90
│       ├── models.mod
│       └── nrtype.f90
├── readme.do
├── scatter1.eps
├── scatter1_1.eps
├── scatter1_2.eps
├── ts.eps
├── wave1.dta
├── wave2.dta
├── wave2regs.dta
└── wave2regs2.dta
(scroll down!)
.
├── README.md
├── code
│   ├── R
│   │   ├── 0-install.R
│   │   ├── 1-main.R
│   │   ├── 2-figure2.R
│   │   └── 3-table2.R
│   ├── stata
│   │   ├── 1-main.do
│   │   ├── 2-read_raw.do
│   │   ├── 3-figure1.do
│   │   ├── 4-figure3.do
│   │   └── 5-table1.do
│   └── tex
│       ├── appendix.tex
│       └── main.tex
├── data
│   ├── processed
│   └── raw
└── output
    ├── plots
    └── tables
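A layout like the second one can be set up once, at the start of a project. The following is a minimal Python sketch, assuming the folder names shown in the tree; the helper name `create_skeleton` is hypothetical and not part of any journal tooling.

```python
from pathlib import Path
import tempfile

# Folder names copied from the tree above; adapt them to your own project.
SKELETON = [
    "code/R",
    "code/stata",
    "code/tex",
    "data/raw",
    "data/processed",
    "output/plots",
    "output/tables",
]


def create_skeleton(root: Path) -> list[Path]:
    """Create the package folders and an empty README.md under root."""
    created = []
    for rel in SKELETON:
        folder = root / rel
        # exist_ok=True makes the function safe to run more than once
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    (root / "README.md").touch(exist_ok=True)
    return created


if __name__ == "__main__":
    # Demonstrate in a throwaway directory so nothing is written to your project
    with tempfile.TemporaryDirectory() as tmp:
        for folder in create_skeleton(Path(tmp)):
            print(folder.relative_to(tmp))
```

Keeping raw and processed data in separate folders makes it obvious which files were produced by code and which must never be edited.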
Note
There is no unique best way to organize your project: Make it simple, intuitive and helpful.
Important
Ideally your entire project is under version control.
Question:
How to write reproducible code?
Huge question to answer. Let's try a few simple things first:
No Manual Manipulation.
Do This!
In general, take all necessary steps to ensure cross-platform compatibility of your code.
File paths are such low-hanging fruit…
Ask the user to set the root of your project via a global variable, an environment variable, or a similar mechanism.
# in my R, I do
Sys.setenv(PACKAGE_ROOT = "/Users/floswald/Downloads/your_package")
# your package uses:
file.path(Sys.getenv("PACKAGE_ROOT"), "data", "wages.csv")
* in my stata, I do
global PACKAGE_ROOT "/Users/floswald/Downloads/your_package"
* your package uses
use "$PACKAGE_ROOT/data/wages.dta"
Always use forward slashes (/) in Stata, even on a Windows machine!
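The same pattern carries over to other languages. As a minimal sketch in Python, assuming the same `PACKAGE_ROOT` environment variable as above (the helper names `package_root` and `data_file` are hypothetical), `pathlib` builds the path with the correct separator on every platform:

```python
import os
from pathlib import Path


def package_root() -> Path:
    """Return the project root taken from the PACKAGE_ROOT environment variable."""
    root = os.environ.get("PACKAGE_ROOT")
    if root is None:
        # Fail early with a clear message instead of guessing a location
        raise RuntimeError("Set PACKAGE_ROOT to the folder containing this package.")
    return Path(root)


def data_file(name: str) -> Path:
    """Build the path to a file in the data folder, relative to the root."""
    return package_root() / "data" / name


if __name__ == "__main__":
    # For the demo only: fall back to the current directory if the variable is unset
    os.environ.setdefault("PACKAGE_ROOT", str(Path.cwd()))
    print(data_file("wages.csv"))
```

The only machine-specific line is the one that sets `PACKAGE_ROOT`; everything else in the package refers to files relative to it.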
No Guarantee
Your code will yield identical results on a different computer only if certain conditions apply.
Protected Environments
You should provide a mechanism which ensures that those conditions do apply.
At a minimum, you list your exact computing environment:
OS, software, and the exact versions used (R 4.1, Stata 17/MP, MATLAB 2023b, GNU Fortran (Homebrew GCC 13.2.0))
Libraries and the exact versions used (ggplot2 1.3.4, outreg 2, numpy 1.26.4, boost 1.8.3)
Stata: install all libraries into your replication package.
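Such a listing does not have to be typed by hand. The following is a minimal sketch for a Python-based package, using only the standard library; the function name `describe_environment` and the choice of packages to report are assumptions for illustration.

```python
import platform
from importlib import metadata


def describe_environment(packages):
    """Collect OS, Python version, and installed versions of the given packages."""
    info = {
        "os": platform.platform(),
        "python": platform.python_version(),
        "packages": {},
    }
    for name in packages:
        try:
            info["packages"][name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report missing packages explicitly rather than skipping them
            info["packages"][name] = "not installed"
    return info


if __name__ == "__main__":
    env = describe_environment(["numpy"])
    print("OS:", env["os"])
    print("Python:", env["python"])
    for name, version in env["packages"].items():
        print(f"{name}: {version}")
```

The printed lines can be pasted into the README, so the documented environment matches the one that actually produced the results.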
Virtual Environments can help.
Julia's built-in Pkg manager:
(@v1.10) pkg> activate .
Activating new project at `~/CEPII`
(CEPII) pkg> add DataFrames GLM
# created 2 files in `~/CEPII`
# tracking all dependencies
Docker container. This provides a fully specified, isolated system (in effect a dedicated computer for your project).
Stata: put a version xyz statement in the master script. Note that ssc install somelib may install a newer, incompatible version when it is run a few years later.
Note
Such mechanisms can reduce version conflicts amongst your dependencies. To the extent that all versions of those dependencies are still available, this guarantees a stable computing environment.
| Output in Paper | Output in Package | Program to execute |
|---|---|---|
| Table 1 | outputs/tables/table1.tex | code/table1.do |
| Figure 1 | outputs/plots/figure1.pdf | code/figure1.do |
| Figure 2 | outputs/plots/figure2.pdf | code/figure2.do |
outputs/.
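A mapping like the one in the table can also be checked mechanically before submission. Below is a minimal Python sketch, assuming the three rows of the table; the function name `missing_files` is hypothetical. It reports every listed output or program that is absent from the package.

```python
from pathlib import Path
import tempfile

# Rows copied from the table: (output in paper, output in package, program to execute)
OUTPUT_MAP = [
    ("Table 1", "outputs/tables/table1.tex", "code/table1.do"),
    ("Figure 1", "outputs/plots/figure1.pdf", "code/figure1.do"),
    ("Figure 2", "outputs/plots/figure2.pdf", "code/figure2.do"),
]


def missing_files(root: Path) -> list[str]:
    """Return the relative paths from OUTPUT_MAP that do not exist under root."""
    missing = []
    for _label, output, program in OUTPUT_MAP:
        for rel in (output, program):
            if not (root / rel).is_file():
                missing.append(rel)
    return missing


if __name__ == "__main__":
    # Demo on an empty throwaway folder: every listed file is reported as missing
    with tempfile.TemporaryDirectory() as tmp:
        for rel in missing_files(Path(tmp)):
            print("missing:", rel)
```

An empty result means every table and figure in the paper has both its output file and the program that creates it inside the package.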