Sample Analysis Template

Overview

This template demonstrates a reproducible analysis flow that feeds multiple deliverables (manuscripts, summaries, presentations) while keeping the heavy lifting in one place.

  • Reusable functions live in analysis/pipeline.R.
  • The analysis document is cached and frozen so re-renders only recompute when code changes.
  • Artifacts (tables, figures, workspace snapshots) land in artifacts/ for other .qmd files to consume.
  • Metadata lists artifact paths so downstream documents can reference them without guessing filenames.
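The helpers in analysis/pipeline.R are not shown here, but the pattern they follow (create any missing parent directory, write the artifact, return its path so callers can record it) is easy to mirror in other languages. A minimal Python sketch of that convention, assuming helper names analogous to the R ones (ensure_parent_dir, write_artifact are illustrations, not the actual pipeline code):

```python
import json
from pathlib import Path


def ensure_parent_dir(path):
    """Create the parent directory of `path` if it is missing; return the path."""
    p = Path(path)
    p.parent.mkdir(parents=True, exist_ok=True)
    return p


def write_artifact(obj, path):
    """Serialize `obj` as JSON at `path` and return the path string,
    mirroring the write-then-return-path convention of the R helpers."""
    p = ensure_parent_dir(path)
    p.write_text(json.dumps(obj, indent=2))
    return p.as_posix()
```

Returning the path from every writer is what lets later chunks (and the manifest below) collect artifact locations without repeating string literals.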

Load and Prepare Data

Code
library(dplyr)
library(ggplot2)
library(readr)

mpg_tbl <- ggplot2::mpg %>%
  mutate(across(where(is.character), as.factor))

head(mpg_tbl)
# A tibble: 6 × 11
  manufacturer model displ  year   cyl trans      drv     cty   hwy fl    class 
  <fct>        <fct> <dbl> <int> <int> <fct>      <fct> <int> <int> <fct> <fct> 
1 audi         a4      1.8  1999     4 auto(l5)   f        18    29 p     compa…
2 audi         a4      1.8  1999     4 manual(m5) f        21    29 p     compa…
3 audi         a4      2    2008     4 manual(m6) f        20    31 p     compa…
4 audi         a4      2    2008     4 auto(av)   f        21    30 p     compa…
5 audi         a4      2.8  1999     6 auto(l5)   f        16    26 p     compa…
6 audi         a4      2.8  1999     6 manual(m5) f        18    26 p     compa…

Summaries with the Pipeline

Code
mpg_summary <- create_summary_table(mpg_tbl, class, hwy)
mpg_summary
# A tibble: 7 × 2
  class      mean_value
  <fct>           <dbl>
1 2seater          24.8
2 compact          28.3
3 midsize          27.3
4 minivan          22.4
5 pickup           16.9
6 subcompact       28.1
7 suv              18.1
Code
summary_rds <- write_artifact(
  mpg_summary,
  file.path("artifacts", "tables", "mpg-summary.rds")
)

summary_csv <- file.path("artifacts", "tables", "mpg-summary.csv")
ensure_parent_dir(summary_csv)
[1] "artifacts/tables/mpg-summary.csv"
Code
readr::write_csv(mpg_summary, summary_csv)

list(rds = summary_rds, csv = summary_csv)
$rds
[1] "artifacts/tables/mpg-summary.rds"

$csv
[1] "artifacts/tables/mpg-summary.csv"

Share Artifacts Across Languages

Code
import pandas as pd
summary = pd.read_csv("artifacts/tables/mpg-summary.csv")
summary.head()
Code
import json
from pathlib import Path

manifest = {
    "tables": {
        "mpg_summary_csv": "artifacts/tables/mpg-summary.csv",
        "mpg_summary_rds": "artifacts/tables/mpg-summary.rds"
    },
    "figures": {
        "mpg_trend_png": "artifacts/figures/mpg-trend.png"
    },
    "workspaces": {
        "rdata_snapshot": "artifacts/workspaces/sample-analysis.RData"
    }
}

manifest_path = Path("artifacts/resources.json")
manifest_path.write_text(json.dumps(manifest, indent=2))
manifest_path.as_posix()
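Downstream documents can read the manifest back instead of hard-coding paths. A sketch of a lookup helper, assuming artifacts/resources.json has the structure written above (the function name resolve_artifact is hypothetical):

```python
import json
from pathlib import Path


def resolve_artifact(kind, name, manifest_file="artifacts/resources.json"):
    """Look up an artifact path from the shared manifest, e.g.
    resolve_artifact("tables", "mpg_summary_csv"), so documents
    never hard-code file locations."""
    manifest = json.loads(Path(manifest_file).read_text())
    return Path(manifest[kind][name])
```

Keeping the lookup in one function means a renamed or relocated artifact only requires updating the manifest, not every consumer.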

Workspace Snapshots

Code
workspace_path <- file.path("artifacts", "workspaces", "sample-analysis.RData")
ensure_parent_dir(workspace_path)
[1] "artifacts/workspaces/sample-analysis.RData"
Code
save(list = c("mpg_tbl", "mpg_summary", "mpg_trend"), file = workspace_path)
workspace_path
[1] "artifacts/workspaces/sample-analysis.RData"

Downstream documents can call load("../artifacts/workspaces/sample-analysis.RData") to restore these objects without re-running any chunks.

Next Deliverable Steps

  1. Render this analysis (quarto render analysis/sample.qmd) to refresh artifacts.
  2. In a manuscript or summary .qmd, source analysis/pipeline.R or load the saved workspace, then pull in artifacts with read_artifact() or language-specific loaders.
  3. Reference the metadata entries in this document (see YAML metadata.rsrc) to keep paths consistent across deliverables.
Tip

Consider a task pipeline (targets, drake, or quarto render --profile) if analyses branch into multiple parameter sets or data refreshes.

References

As you cite literature (e.g., @wickham2019tidyverse for tidyverse workflows or @quarto2024guide for authoring guidance), the bibliography below is populated automatically.