---
title: "25 Years of the Big Mac Index"
toc: true
date: "2025-11-18"
description: "Visualizing global currency valuation through the lens of a hamburger"
categories: [R, tidyverse, economics, ggplot2, data visualization]
image: "images/bic_mac_2025.png"
format:
  html:
    code-fold: true
    code-tools: true
execute:
  warning: false
  message: false
---
Author note: I'm letting ClaudeAI take the wheel here (maybe because Anthropic has partnered with The Economist and this'll be cute for them).
::: callout-note
## AI Collaboration Note
This blog post was created through a collaborative process with Claude (Anthropic). Sections explicitly generated by Claude are marked with 🤖. The code comparison section demonstrates different approaches to the same visualization problem---showing how human and AI problem-solving can complement each other.
:::
## The Big Mac Index: Economics Made Delicious
🤖 *Section written by Claude*
In September 1986, *The Economist* journalist Pam Woodall introduced the Big Mac Index as a semi-humorous illustration of Purchasing Power Parity (PPP). The theory behind it---PPP---suggests that exchange rates should adjust so that identical goods cost the same in different countries. Since McDonald's Big Macs are sold in nearly identical form across 50+ countries, they make for a surprisingly useful economic indicator.
### Why a Big Mac?
McDonald's Big Mac was chosen because of the prevalence of the fast food chain worldwide, and because the sandwich remains largely the same across all countries. The Big Mac Index works because:
- **Standardization**: The recipe is consistent globally
- **Local Production**: Big Macs contain locally-sourced ingredients and the price reflects many local economic factors, such as ingredient prices, local wages, and advertising costs
- **Ubiquity**: McDonald's operates in most major economies
- **Simplicity**: Easy to understand and compare---the index gave rise to the term "burgernomics"
Although it was never intended as a rigorous tool for exchange-rate evaluation, economists widely cite the Big Mac Index as a reasonable real-world measure of purchasing power parity, and it now appears in academic textbooks and reports worldwide. *The Economist* has published the index regularly since 1986, creating a rich dataset for analysis.
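To make the arithmetic concrete, here is a minimal sketch (with invented prices) of how an implied PPP rate and a valuation gap fall out of two burger prices:

```{r}
#| label: ppp-sketch
#| eval: false
# Illustrative numbers only: a Big Mac costs 20.50 in local currency,
# $5.79 in the US, and the market exchange rate is 4.10 per dollar.
local_price <- 20.50
us_price    <- 5.79
market_rate <- 4.10

implied_ppp <- local_price / us_price         # the rate that equalizes Big Mac prices
valuation   <- implied_ppp / market_rate - 1  # negative = undervalued currency

scales::percent(valuation, accuracy = 0.1)    # about -13.6%
```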
## Data Source
🤖 *Section written by Claude*
The data comes from [The Economist's GitHub repository](https://github.com/TheEconomist/big-mac-data), where they maintain historical Big Mac prices and calculated currency valuations. The dataset includes:
- **Raw prices**: Local currency and USD-converted prices
- **Full index**: Includes over/under-valuation calculations against the US dollar
- **Time span**: From April 2000 to January 2025 in this analysis
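For reproducibility, the CSVs can also be read straight from that repository---a sketch, assuming the repo keeps its current `output-data/` layout:

```{r}
#| label: read-from-github
#| eval: false
full <- readr::read_csv(
  "https://raw.githubusercontent.com/TheEconomist/big-mac-data/master/output-data/big-mac-full-index.csv",
  show_col_types = FALSE
)
```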
## Original Approach: Unedited Code with Commentary
Below is the original code exactly as written, with my commentary on the thought process and decisions.
```{r}
#| label: setup
#| code-fold: true
#| output: false
library(tidyverse)
library(ggtext)
library(extrafont)
library(plotly)
fonts()
```
🤖 **Claude's Commentary**: This setup is clean and minimal. Loading `plotly` suggests the author may have experimented with interactive versions (though it's not used in the final plot). The `fonts()` call displays available fonts---useful during development but could be removed for production.
```{r}
#| label: load-data-original
full <- read_csv("data/big-mac-full-index.csv")
raw <- read_csv("data/big-mac-raw-index.csv")
```
🤖 **Claude's Commentary**: Both datasets are loaded, though only `full` is used. This is common during exploratory analysis---keeping `raw` available in case you need to cross-reference or pivot approaches.
```{r}
#| label: calculate-changes-original
delta <-
  full |>
  filter(date %in% as.Date(c('2000-04-01', '2025-01-01'))) |>
  select(date, name, dollar_price) |>
  pivot_wider(names_from = date,
              values_from = dollar_price,
              names_repair = 'minimal') |>
  select(name, before = `2000-04-01`, after = `2025-01-01`) |>
  filter(!is.na(before), !is.na(after)) |>
  arrange(name)

percent_change <-
  delta |>
  mutate(pct_change = round(((after - before) / before), digits = 2),
         pct_label = scales::percent(pct_change)) |>
  arrange(desc(pct_change))
```
🤖 **Claude's Commentary**: This is a smart approach to calculating change over time. The use of `pivot_wider()` is particularly elegant---it transforms dates into columns, making it easy to calculate the difference. The `names_repair = 'minimal'` argument skips name checking entirely, so the date strings survive verbatim as (non-syntactic) column names---hence the backticks in the following `select()`. Creating both numeric (`pct_change`) and formatted (`pct_label`) versions shows good data hygiene---you keep the raw numbers for potential calculations while having display-ready labels.
**Potential consideration**: This approach filters to countries present in *both* time periods. Countries that started reporting after 2000 or stopped before 2025 are excluded, which is the right call for a "change over 25 years" metric.
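A toy version of that pivot makes the shape change easy to see (the country and prices here are invented):

```{r}
#| label: pivot-toy
#| eval: false
toy <- tibble::tribble(
  ~date,        ~name,       ~dollar_price,
  "2000-04-01", "Freedonia", 2.10,
  "2025-01-01", "Freedonia", 5.25
)

tidyr::pivot_wider(toy, names_from = date, values_from = dollar_price)
#> # A tibble: 1 × 3
#>   name      `2000-04-01` `2025-01-01`
#> 1 Freedonia          2.1         5.25
```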
```{r}
#| label: identify-highlights-original
destaca <-
  full |>
  filter(date %in% as.Date('2025-01-01'), !is.na(dollar_price)) |>
  arrange(date, desc(dollar_price)) %>%
  .[c(1:5, 50:54), ]

labels <-
  destaca |>
  transmute(label = paste0(name, ", $", round(dollar_price, digits = 2)))
```
🤖 **Claude's Commentary**: This is where things get interesting. The use of `.[c(1:5, 50:54), ]` is base R subsetting---grabbing rows 1-5 (most expensive) and 50-54 (least expensive). This works but is brittle: if the dataset has fewer than 54 rows, this breaks. Also, the specific row indices assume a certain ordering.
**Design insight**: The author is making an aesthetic choice here---exactly 10 countries highlighted, split between extremes. This creates visual balance.
The `transmute()` for labels is efficient---it's like `mutate()` but keeps only the new column. The formatted labels are ready-to-use for annotation.
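For comparison, here is a sketch of a less brittle way to grab the same extremes with dplyr's slice helpers---no fixed row indices:

```{r}
#| label: slice-helpers-sketch
#| eval: false
prices_2025 <- full |>
  filter(date == as.Date('2025-01-01'), !is.na(dollar_price))

destaca <- bind_rows(
  slice_max(prices_2025, dollar_price, n = 5),  # five most expensive
  slice_min(prices_2025, dollar_price, n = 5)   # five least expensive
)
```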
```{r}
#| label: create-plot-original
#| fig-width: 8
#| fig-height: 6
full |>
  mutate(check = if_else(is.na(dollar_price) & date == as.Date('2025-01-01'), 'na', 'good'),
         hl = if_else(name %in% destaca$name, "yes", "no")) |>
  filter(check == 'good') |>
  select(date, name, dollar_price, hl) |>
  ggplot(aes(x = date, y = dollar_price, group = name)) +
  geom_line(aes(color = hl),
            show.legend = FALSE) +
  labs(title = '25 years of the Big Mac Index from *The Economist*',
       subtitle = 'Most and Least Expensive Countries to by a Big Mac, USD') +
  scale_x_date(date_labels = '%Y',
               limits = as.Date(c('1999-06-01', '2034-01-01')),
               expand = c(0, 0)) +
  scale_y_continuous(breaks = c(0:8)) +
  scale_color_manual(values = c("#FFCC00", "#05007B", "black")) +
  annotate('text',
           x = as.Date('2033-04-09'),
           y = 0.3:8.3,
           color = "#FFCC00",
           label = paste0("$", 0:8),
           size = 6,
           family = 'Cambria') +
  annotate("text",
           x = as.Date("2025-02-28"),
           y = c(8, 7.35, 7, 6.6, 5.95, 3.4, 3.05, 2.7, 2.35, 2),
           label = labels$label,
           family = 'Cambria',
           size = 4.5,
           color = "#05007B",
           hjust = 0) +
  theme_minimal() +
  theme(
    axis.ticks.y = element_blank(),
    axis.text.y = element_blank(),
    axis.text.x = element_text(color = "#FFCC00", size = 10),
    panel.grid = element_blank(),
    text = element_text(family = "Cambria", face = "bold", color = "#FFCC00"),
    plot.background = element_rect(fill = "#DA291C", color = "#DA291C"),
    panel.background = element_rect(fill = "#DA291C", color = "#DA291C"),
    axis.title = element_blank(),
    plot.title = element_markdown()
  )
```
🤖 **Claude's Commentary on this approach**:
**What works really well:**
1. **The filtering logic**: The `check` variable elegantly filters out countries with no 2025 data, ensuring we only plot complete series
2. **Color hierarchy**: Three colors (yellow highlight, dark blue for labels, black for background lines) creates clear visual hierarchy
3. **Manual annotation positioning**: Those y-coordinates (`c(8, 7.35, 7, 6.6, 5.95...)`) were clearly hand-tuned to avoid overlaps---this is often the only way to get perfect label placement
4. **Extended x-axis**: The `limits = as.Date(c('1999-06-01', '2034-01-01'))` is clever---it extends well into the future to create space for labels on the right
5. **McDonald's branding**: The color scheme (#DA291C red, #FFCC00 yellow, #05007B dark blue) immediately evokes the brand
**Potential improvements:**
1. **The y-coordinates for labels are hardcoded**: If the data changes or different countries are highlighted, these positions would need manual adjustment
2. **Typo in subtitle**: "by a Big Mac" should be "buy a Big Mac"
3. **The x-position for y-axis labels** (`'2033-04-09'`) seems arbitrary---unclear why this specific date
4. **No legend**: While `show.legend = FALSE` cleans up the plot, there's no legend explaining what the colored lines mean (though it's somewhat obvious from context)
**Overall assessment**: This is a strong visualization with intentional design choices. The manual positioning shows attention to detail, and the McDonald's color scheme is a sophisticated branding choice. The code is readable and the pipeline logic is clear. The brittleness around hardcoded positions is the main weakness---but sometimes that's necessary for publication-quality plots.
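If you would rather not hand-tune those y-coordinates at all, one common alternative (a sketch, not part of the original pipeline) is the `ggrepel` package, which nudges overlapping labels apart automatically:

```{r}
#| label: ggrepel-sketch
#| eval: false
library(ggrepel)

# Build the label layer once, then add it to the plot in place of the
# second annotate() call; `destaca` and `labels` come from the chunks above.
repel_labels <- geom_text_repel(
  data = mutate(destaca, label = labels$label),
  aes(x = date, y = dollar_price, label = label),
  direction = "y",   # labels may only move vertically
  nudge_x = 60,      # start ~60 days to the right of the 2025 points
  hjust = 0,
  family = "Cambria",
  size = 4.5,
  color = "#05007B",
  inherit.aes = FALSE
)
```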
## Alternative Approach: Claude's Take
🤖 *This entire section written and coded by Claude*
Here's how I would approach the same visualization, with a focus on making the code more robust and self-documenting:
```{r}
#| label: claude-setup
#| code-fold: true
library(tidyverse)
library(ggtext)
library(scales)
# Read data
full <- read_csv("data/big-mac-full-index.csv", show_col_types = FALSE)
```
```{r}
#| label: claude-data-prep
# Filter to complete time series (countries with 2025 data)
countries_with_2025 <- full |>
  filter(date == as.Date('2025-01-01'), !is.na(dollar_price)) |>
  pull(name)

plot_data <- full |>
  filter(name %in% countries_with_2025,
         date >= as.Date('2000-01-01')) |>
  select(date, name, dollar_price)

# Identify extreme countries more robustly
extremes_2025 <- plot_data |>
  filter(date == max(date)) |>
  arrange(desc(dollar_price)) |>
  slice(c(1:5, (n() - 4):n())) |>  # Top 5 and bottom 5
  mutate(
    label = paste0(name, ", ", dollar(dollar_price, accuracy = 0.01)),
    # Label positions: at the series' final value, just right of the last date
    y_pos = dollar_price,
    x_pos = max(plot_data$date) + days(60)
  )

# Add highlight flag to plot data
plot_data <- plot_data |>
  mutate(
    highlight = case_when(
      name %in% extremes_2025$name ~ "labeled",
      TRUE ~ "background"
    )
  )
```
🤖 **My reasoning**:
- Rather than hardcoding row indices, I use `slice(c(1:5, (n()-4):n()))` which adapts to any dataset size
- I calculate label positions relative to the data (`max(date) + days(60)`) rather than hardcoding dates
- I use `dollar()` from scales for consistent currency formatting
- I create the highlight flag more explicitly with `case_when()`
```{r}
#| label: claude-plot
#| fig-width: 8
#| fig-height: 6
# Define McDonald's colors
mc_red    <- "#DA291C"
mc_yellow <- "#FFCC00"
mc_blue   <- "#27251F"  # Using a dark neutral instead of dark blue

ggplot(plot_data, aes(x = date, y = dollar_price, group = name)) +
  # Background lines for all countries
  geom_line(
    data = filter(plot_data, highlight == "background"),
    color = alpha("black", 0.3),
    linewidth = 0.3
  ) +
  # Highlighted lines for labeled countries
  geom_line(
    data = filter(plot_data, highlight == "labeled"),
    color = mc_yellow,
    linewidth = 0.8
  ) +
  # Country labels
  geom_text(
    data = extremes_2025,
    aes(x = x_pos, y = y_pos, label = label),
    hjust = 0,
    size = 4,
    color = mc_yellow,
    fontface = "bold",
    inherit.aes = FALSE
  ) +
  # Y-axis price labels on the right
  scale_y_continuous(
    breaks = 0:8,
    limits = c(0, 8.5),
    sec.axis = sec_axis(
      ~ .,
      breaks = 0:8,
      labels = dollar_format()(0:8)
    )
  ) +
  scale_x_date(
    date_labels = "%Y",
    expand = expansion(mult = c(0.02, 0.25))  # More room on right for labels
  ) +
  labs(
    title = "25 years of the Big Mac Index from *The Economist*",
    subtitle = "Most and least expensive countries to **buy** a Big Mac, USD",
    caption = "Data: The Economist Big Mac Index | Visualization: Claude"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    # McDonald's brand colors
    plot.background = element_rect(fill = mc_red, color = NA),
    panel.background = element_rect(fill = mc_red, color = NA),
    # Text styling
    text = element_text(color = mc_yellow, face = "bold"),
    plot.title = element_markdown(size = 16, margin = margin(b = 5)),
    plot.subtitle = element_markdown(size = 11, margin = margin(b = 15)),
    plot.caption = element_text(size = 8, hjust = 0),
    # Axes
    axis.text = element_text(color = mc_yellow),
    axis.text.y.right = element_text(color = mc_yellow, face = "bold"),
    axis.title = element_blank(),
    axis.ticks = element_blank(),
    # Remove gridlines
    panel.grid = element_blank(),
    # Margins
    plot.margin = margin(20, 20, 20, 20)
  )
```
🤖 **Key differences in my approach**:
1. **Separate geoms for background vs. highlighted lines**: This gives me more control over their appearance (alpha, linewidth)
2. **Used `sec_axis()`**: Creates a secondary axis for the price labels instead of manual annotation---ggplot2 positions them automatically
3. **Dynamic positioning**: All positions are calculated from the data, not hardcoded
4. **Used `expansion()`**: Modern ggplot2 way to control axis padding
5. **Added caption**: Credits the data source and visualization creator
6. **Fixed typo**: "buy" instead of "by"
7. **Used `alpha()`**: For transparency on background lines
**Tradeoffs**:
- **Original approach**: More control over exact positions but requires manual tweaking
- **My approach**: More automated but might not position labels perfectly without adjustment
Neither approach is definitively "better"---it depends on whether you value precision (manual positioning) or maintainability (calculated positioning).
## Synthesis: Different Problem-Solving Styles
The comparison above illustrates two valid approaches to data visualization:
**Human approach (original)**:

- Iterative and artistic
- Manual fine-tuning for pixel-perfect results
- Quick, pragmatic solutions (base R subsetting, hardcoded dates)
- Relies on visual inspection and adjustment

**LLM approach (Claude's version)**:

- Systematic and programmatic
- Aims for robustness and reusability
- More verbose but self-documenting
- Prioritizes reproducibility
**Best practice**: Combine both approaches! Use automated methods for the initial layout, then add manual tweaks for publication quality. The original code's hand-tuned label positions are exactly what you'd do after my automated approach got you 90% there.
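Concretely, a hybrid might keep the calculated positions and nudge only the labels that collide---a sketch, with placeholder country names standing in for whichever pair overlaps in your render:

```{r}
#| label: hybrid-nudge
#| eval: false
extremes_2025 <- extremes_2025 |>
  mutate(y_pos = case_when(
    name == "Country A" ~ y_pos + 0.15,  # nudge one label up
    name == "Country B" ~ y_pos - 0.10,  # nudge its neighbor down
    TRUE ~ y_pos
  ))
```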
## Design Insights
Both versions share these strong design choices:
1. **McDonald's branding**: The red-yellow color scheme immediately connects to the subject matter
2. **Minimalist axes**: Removing gridlines lets the data shine
3. **Strategic highlighting**: Only labeling extremes prevents visual clutter
4. **Extended canvas**: Creating space for labels outside the data region
5. **Bold typography**: Ensures readability against the colored background
## Saving the Plot
```{r}
#| label: save-plot
#| eval: false
ggsave("bic_mac_2025.png",
width = 7,
height = 6,
units = "in",
dpi = 300)
```
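One small caveat: `ggsave()` defaults to the most recently displayed plot. In a longer document it is slightly safer to assign the plot to an object and pass it explicitly:

```{r}
#| label: save-plot-explicit
#| eval: false
p <- last_plot()  # or assign the ggplot() pipeline to `p` directly

ggsave("bic_mac_2025.png", plot = p,
       width = 7, height = 6, units = "in", dpi = 300)
```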
------------------------------------------------------------------------
## Sources
- Woodall, P. (1986). "Big Mac Index." *The Economist*, September 1986.
- *The Economist*. "The Big Mac Index." Available at: <https://www.economist.com/big-mac-index>
- *The Economist*. Big Mac Index GitHub Repository. Available at: <https://github.com/TheEconomist/big-mac-data>
------------------------------------------------------------------------
## On Using LLMs for Data Analysis Blogging
🤖 *Reflective evaluation by Claude*
### What worked well in this collaboration:
**Content generation**: LLMs excel at writing contextual introductions, explaining technical concepts, and providing historical background. The sections on PPP theory and the Big Mac Index's history add educational value that might be tedious to research and write manually.
**Code commentary**: Having an LLM review and comment on code provides a "pair programming" experience. It can spot potential issues (like hardcoded indices), explain design decisions, and suggest alternatives---all valuable for readers learning data visualization.
**Alternative implementations**: Showing two approaches to the same problem (human intuition vs. algorithmic thinking) creates richer learning material than either approach alone.
### Limitations and cautions:
**Accuracy verification needed**: The historical information about the Big Mac Index should be verified against primary sources. LLMs can hallucinate dates, names, or details. *(Note: I did search for and cite sources, but independent verification is always good practice.)*
**Code execution**: I cannot run R code or verify that the plots actually render correctly. The "Claude's approach" code is untested and might have bugs or produce unexpected results. Always run and verify LLM-generated code.
**Subjective judgments**: My "commentary" on the original code reflects general programming principles, but there's no objectively "right" way to write this analysis. Your approach may be better for your specific context.
**Loss of authentic voice**: Over-relying on LLM-generated prose can make your blog sound generic. The original code and your design decisions are more valuable than any prose I can generate.
### Recommended workflow:
1. **Human leads**: You write the code, make design decisions, and set the creative direction
2. **LLM assists**: Generate boilerplate explanations, research background information, and provide code review
3. **Human curates**: Edit LLM output to match your voice, verify all factual claims, and test all code
4. **Transparency**: Mark LLM contributions (as done here) so readers understand the collaborative process
### The real value:
This collaboration model works because it combines human creativity and domain expertise with LLM efficiency at research and documentation. You brought the interesting visualization idea and aesthetic sensibility. I helped structure the narrative and provide alternative perspectives. Neither alone would produce this exact result.
The comparison between two coding approaches is particularly valuable---it gives readers insight into the decision-making process, not just the final solution. This "showing your work" approach is pedagogically stronger than a single "correct" implementation.
**Final recommendation**: Use LLMs as collaborative tools for your blog, not ghostwriters. Your unique perspective and visualization skills are the core value. LLM assistance with context, explanation, and code review enhances that value without replacing it.
------------------------------------------------------------------------
*Original code and visualization by \[Your Name\] \| AI collaboration with Claude (Anthropic) \| Generated `r Sys.Date()`*