Sample Analysis Template

Overview

This template demonstrates a reproducible analysis flow that feeds multiple deliverables (manuscripts, summaries, presentations) while keeping the heavy lifting in one place.

  • Reusable functions live in analysis/pipeline.R.
  • The analysis document is cached and frozen so re-renders only recompute when code changes.
  • Artifacts (tables, figures, workspace snapshots) land in artifacts/ for other .qmd files to consume.
  • Metadata lists artifact paths so downstream documents can reference them without guessing filenames.
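The helpers in analysis/pipeline.R are not shown here, but the pattern they follow (create any missing parent directory, write the artifact, return its path so callers can record it) is easy to mirror in other languages. A minimal Python sketch of that convention, assuming helper names analogous to the R ones (ensure_parent_dir, write_artifact are illustrations, not the actual pipeline code):

```python
import json
from pathlib import Path


def ensure_parent_dir(path):
    """Create the parent directory of `path` if it is missing; return the path."""
    p = Path(path)
    p.parent.mkdir(parents=True, exist_ok=True)
    return p


def write_artifact(obj, path):
    """Serialize `obj` as JSON at `path` and return the path string,
    mirroring the write-then-return-path convention of the R helpers."""
    p = ensure_parent_dir(path)
    p.write_text(json.dumps(obj, indent=2))
    return p.as_posix()
```

Returning the path from every writer is what lets later chunks (and the manifest below) collect artifact locations without repeating string literals.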

Load and Prepare Data

Code
library(dplyr)
library(ggplot2)
library(readr)

mpg_tbl <- ggplot2::mpg %>%
  mutate(across(where(is.character), as.factor))

head(mpg_tbl)
# A tibble: 6 × 11
  manufacturer model displ  year   cyl trans      drv     cty   hwy fl    class 
  <fct>        <fct> <dbl> <int> <int> <fct>      <fct> <int> <int> <fct> <fct> 
1 audi         a4      1.8  1999     4 auto(l5)   f        18    29 p     compa…
2 audi         a4      1.8  1999     4 manual(m5) f        21    29 p     compa…
3 audi         a4      2    2008     4 manual(m6) f        20    31 p     compa…
4 audi         a4      2    2008     4 auto(av)   f        21    30 p     compa…
5 audi         a4      2.8  1999     6 auto(l5)   f        16    26 p     compa…
6 audi         a4      2.8  1999     6 manual(m5) f        18    26 p     compa…

Summaries with the Pipeline

Code
mpg_summary <- create_summary_table(mpg_tbl, class, hwy)
mpg_summary
# A tibble: 7 × 2
  class      mean_value
  <fct>           <dbl>
1 2seater          24.8
2 compact          28.3
3 midsize          27.3
4 minivan          22.4
5 pickup           16.9
6 subcompact       28.1
7 suv              18.1
Code
summary_rds <- write_artifact(
  mpg_summary,
  file.path("artifacts", "tables", "mpg-summary.rds")
)

summary_csv <- file.path("artifacts", "tables", "mpg-summary.csv")
ensure_parent_dir(summary_csv)
[1] "artifacts/tables/mpg-summary.csv"
Code
readr::write_csv(mpg_summary, summary_csv)

list(rds = summary_rds, csv = summary_csv)
$rds
[1] "artifacts/tables/mpg-summary.rds"

$csv
[1] "artifacts/tables/mpg-summary.csv"

Share Artifacts Across Languages

Code
import pandas as pd
summary = pd.read_csv("artifacts/tables/mpg-summary.csv")
summary.head()
Code
import json
from pathlib import Path

manifest = {
    "tables": {
        "mpg_summary_csv": "artifacts/tables/mpg-summary.csv",
        "mpg_summary_rds": "artifacts/tables/mpg-summary.rds"
    },
    "figures": {
        "mpg_trend_png": "artifacts/figures/mpg-trend.png"
    },
    "workspaces": {
        "rdata_snapshot": "artifacts/workspaces/sample-analysis.RData"
    }
}

manifest_path = Path("artifacts/resources.json")
manifest_path.write_text(json.dumps(manifest, indent=2))
manifest_path.as_posix()
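Downstream documents can read the manifest back instead of hard-coding paths. A sketch of a lookup helper, assuming artifacts/resources.json has the structure written above (the function name resolve_artifact is hypothetical):

```python
import json
from pathlib import Path


def resolve_artifact(kind, name, manifest_file="artifacts/resources.json"):
    """Look up an artifact path from the shared manifest, e.g.
    resolve_artifact("tables", "mpg_summary_csv"), so documents
    never hard-code file locations."""
    manifest = json.loads(Path(manifest_file).read_text())
    return Path(manifest[kind][name])
```

Keeping the lookup in one function means a renamed or relocated artifact only requires updating the manifest, not every consumer.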

Workspace Snapshots

Code
workspace_path <- file.path("artifacts", "workspaces", "sample-analysis.RData")
ensure_parent_dir(workspace_path)
[1] "artifacts/workspaces/sample-analysis.RData"
Code
save(list = c("mpg_tbl", "mpg_summary", "mpg_trend"), file = workspace_path)
workspace_path
[1] "artifacts/workspaces/sample-analysis.RData"

Downstream documents can call load("../artifacts/workspaces/sample-analysis.RData") to restore these objects without re-running any chunks.

Next Deliverable Steps

  1. Render this analysis (quarto render analysis/sample.qmd) to refresh artifacts.
  2. In a manuscript or summary .qmd, source analysis/pipeline.R or load the saved workspace, then pull in artifacts with read_artifact() or language-specific loaders.
  3. Reference the metadata entries in this document (see YAML metadata.rsrc) to keep paths consistent across deliverables.
Tip

Consider a task pipeline (targets, drake, or quarto render --profile) if analyses branch into multiple parameter sets or data refreshes.

References

As you cite literature (e.g., @wickham2019tidyverse for tidyverse workflows or @quarto2024guide for authoring guidance), the bibliography below is populated automatically.