Tidy Tuesday: The Geography of Norwegian Fish Mortality

tidytuesday
R
aquaculture
norway
ggridges
Where are Norway’s farmed salmon dying? A regional deep-dive into six years of aquaculture mortality data from the Norwegian Veterinary Institute.
Author

Sean Thimons

Published

March 21, 2026

Preface

From the TidyTuesday repository.

This week’s data comes from the Norwegian Veterinary Institute’s salmonid mortality datasets. The Norwegian government aims to reduce fish farming mortality, making this data relevant to public health policy. Two datasets are provided: monthly loss counts (dead, discarded, escaped, other) and monthly mortality rate summaries (median, Q1, Q3) across regions and species.

Suggested questions:

  • How does monthly mortality vary over the available time period?
  • Which regions demonstrate the lowest mortality rates?
  • Beyond fish deaths, what other loss categories significantly impact operations?

Loading necessary packages

My handy booster pack installs (if needed) and loads my usual favorite packages, along with some helpful functions.

# Packages ----------------------------------------------------------------

{
  # Install pak if it's not already installed
  if (!requireNamespace("pak", quietly = TRUE)) {
    install.packages(
      "pak",
      repos = sprintf(
        "https://r-lib.github.io/p/pak/stable/%s/%s/%s",
        .Platform$pkgType,
        R.Version()$os,
        R.Version()$arch
      )
    )
  }

  # CRAN Packages ----
  install_booster_pack <- function(package, load = TRUE) {
    for (pkg in package) {
      if (!requireNamespace(pkg, quietly = TRUE)) {
        pak::pkg_install(pkg)
      }
      if (load) {
        library(pkg, character.only = TRUE)
      }
    }
  }

  booster_pack <- c(
    ### IO ----
    'fs',
    'here',
    'janitor',
    'rio',
    'tidyverse',

    ### EDA ----
    'skimr',

    ### Plot ----
    'paletteer',           # Color palette collection
    'patchwork',           # Multi-panel layouts
    'ggridges',            # Ridge plots
    'ggtext',              # Rich text in ggplot
    'ggrepel',             # Non-overlapping labels

    ### Misc ----
    'tidytuesdayR'
  )

  install_booster_pack(package = booster_pack, load = TRUE)
  rm(install_booster_pack, booster_pack)

  # Custom Functions ----

  `%ni%` <- Negate(`%in%`)

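  # Geometric mean of the strictly positive values only: zeros and negatives
  # are dropped before taking logs, so zero-heavy columns are summarized by
  # their non-zero observations.
  geometric_mean <- function(x) {
    exp(mean(log(x[x > 0]), na.rm = TRUE))
  }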

  my_skim <- skim_with(
    numeric = sfl(
      n = length,
      min = ~ min(.x, na.rm = TRUE),
      p25 = ~ stats::quantile(.x, probs = 0.25, na.rm = TRUE, names = FALSE),
      med = ~ median(.x, na.rm = TRUE),
      p75 = ~ stats::quantile(.x, probs = 0.75, na.rm = TRUE, names = FALSE),
      max = ~ max(.x, na.rm = TRUE),
      mean = ~ mean(.x, na.rm = TRUE),
      geo_mean = ~ geometric_mean(.x),
      sd = ~ stats::sd(.x, na.rm = TRUE),
      hist = ~ inline_hist(.x, 5)
    ),
    append = FALSE
  )
}

Load raw data from package

raw <- tidytuesdayR::tt_load('2026-03-17')

losses <- raw$monthly_losses_data
mortality <- raw$monthly_mortality_data

Exploratory Data Analysis

The my_skim() function is a modified version of skimr::skim(). For each numeric column it reports the number of missing values (NA cells) and the complete rate (the share of rows that are not NA), along with the count, minimum, 25th percentile, median, 75th percentile, maximum, mean, geometric mean, and standard deviation. It also draws a little inline histogram. Neat!
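One detail worth keeping in mind when reading the geo_mean column: the geometric_mean() helper from the setup chunk takes logs of the strictly positive values only, so zeros are dropped rather than averaged in. A minimal base-R sketch of that behavior, using a made-up vector (the numbers are illustrative, not taken from the dataset):

```r
# Same definition as the helper in the setup chunk
geometric_mean <- function(x) {
  exp(mean(log(x[x > 0]), na.rm = TRUE))
}

# Hypothetical zero-heavy column, loosely shaped like `escaped`
x <- c(0, 0, 0, 0, 10, 1000)

median(x)          # 0: most records are zero
geometric_mean(x)  # 100: computed from the two positive values only
mean(x)            # about 168.3: pulled up by the single large value
```

For a zero-heavy column, then, geo_mean describes the typical size of a non-zero record, which is how it can sit well above a median of zero.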

Monthly Losses

losses <- losses %>%
  mutate(date = as.Date(date))

cat(sprintf("monthly_losses_data: %d rows, %d cols\n", nrow(losses), ncol(losses)))
monthly_losses_data: 2808 rows, 9 cols
cat(sprintf("Date range: %s to %s\n", min(losses$date), max(losses$date)))
Date range: 2020-01-01 to 2025-12-01
cat(sprintf("Species: %s\n", paste(unique(losses$species), collapse = ", ")))
Species: salmon, rainbowtrout
cat(sprintf("Geo groups: %s\n", paste(unique(losses$geo_group), collapse = ", ")))
Geo groups: area, county, country
losses %>%
  select(losses, dead, discarded, escaped, other) %>%
  my_skim()
Data summary
Name Piped data
Number of rows 2808
Number of columns 5
_______________________
Column type frequency:
numeric 5
________________________
Group variables None

Variable type: numeric

skim_variable n_missing complete_rate n min p25 med p75 max mean geo_mean sd hist
losses 0 1 2808 0 2158.25 202551.0 520828.75 9253842 443774.15 120847.93 922958.83 ▇▁▁▁▁
dead 0 1 2808 0 1839.25 176452.0 434446.25 7947877 378946.48 104454.43 786700.43 ▇▁▁▁▁
discarded 0 1 2808 0 0.00 5936.5 20301.50 442950 21374.84 10833.97 47902.38 ▇▁▁▁▁
escaped 0 1 2808 0 0.00 0.0 0.00 38638 154.71 37.69 1922.66 ▇▁▁▁▁
other 0 1 2808 0 0.00 528.0 10749.25 3130475 43298.13 3606.53 176226.26 ▇▁▁▁▁

The losses data spans January 2020 through December 2025 — six full years of monthly records. Two species are tracked: Atlantic salmon and rainbow trout. Data is reported at three geographic levels: numbered production areas (1–13), named counties, and a country-level aggregate (“Norge”).

The escaped column is heavily right-skewed: its median and 75th percentile are both zero, yet the maximum reaches 38,638 fish, which points to occasional large escape events (such as those from damaged net pens) rather than a steady trickle. The dead column dominates the losses total, accounting for roughly 85% of all recorded losses.
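The share attributed to each loss category can be reproduced from the mean column of the skim output above. This is a quick base-R sketch with the means copied by hand from that table. Because the data mixes area, county, and country rows, these are shares of the mean per record, not of a deduplicated national total.

```r
# Means copied from the skim table above
mean_losses <- 443774.15
mean_by_category <- c(
  dead      = 378946.48,
  discarded = 21374.84,
  escaped   = 154.71,
  other     = 43298.13
)

# Share of each category in the mean monthly loss, in percent
share_pct <- round(100 * mean_by_category / mean_losses, 1)
share_pct

# The four categories should add back up to the `losses` column (within rounding)
all.equal(sum(mean_by_category), mean_losses, tolerance = 1e-6)
```

The same calculation on the raw data, filtered to a single geo_group, would give shares of actual totals rather than of per-record means.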

Monthly Mortality Rates

mortality <- mortality %>%
  mutate(date = as.Date(date))

cat(sprintf("monthly_mortality_data: %d rows, %d cols\n", nrow(mortality), ncol(mortality)))
monthly_mortality_data: 1788 rows, 7 cols
cat(sprintf("Species: %s\n", paste(unique(mortality$species), collapse = ", ")))
Species: rainbowtrout, salmon
mortality %>%
  select(median, q1, q3) %>%
  my_skim()
Data summary
Name Piped data
Number of rows 1788
Number of columns 3
_______________________
Column type frequency:
numeric 3
________________________
Group variables None

Variable type: numeric

skim_variable n_missing complete_rate n min p25 med p75 max mean geo_mean sd hist
median 0 1 1788 0.14 0.45 0.59 0.77 2.89 0.64 0.59 0.27 ▇▃▁▁▁
q1 0 1 1788 0.08 0.22 0.29 0.38 1.72 0.32 0.29 0.15 ▇▂▁▁▁
q3 0 1 1788 0.24 0.88 1.22 1.64 11.90 1.34 1.20 0.73 ▇▁▁▁▁

The mortality dataset reports distributional summaries (median, Q1, Q3) of monthly mortality rates across farms within each region. This captures the spread of farm-level performance, not just the aggregate — a region’s median might look acceptable while its Q3 reveals a long tail of struggling farms.

The Geography of Mortality

Norway’s aquaculture industry stretches along its western and northern coastline, from Agder in the south to Finnmark in the Arctic. But the fish don’t die equally everywhere.

Regional mortality rankings

# Focus on salmon at the county level
county_mortality <- mortality %>%
  filter(geo_group == "county", species == "salmon") %>%
  mutate(year = year(date))

# Verify county names
cat("Counties in mortality data:\n")
Counties in mortality data:
print(sort(unique(county_mortality$region)))
[1] "Agder & Rogaland" "Finnmark"         "Møre og Romsdal"  "Nordland"        
[5] "Troms"            "Trøndelag"        "Vestland"        
# Average mortality by county
county_avg <- county_mortality %>%
  group_by(region) %>%
  summarise(
    avg_median_mort = mean(median, na.rm = TRUE),
    avg_q1 = mean(q1, na.rm = TRUE),
    avg_q3 = mean(q3, na.rm = TRUE),
    spread = mean(q3 - q1, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  arrange(desc(avg_median_mort))

cat(sprintf("\nCounty mortality ranking: %d counties\n", nrow(county_avg)))

County mortality ranking: 7 counties
stopifnot("No county data found" = nrow(county_avg) > 0)

county_avg
# A tibble: 7 × 5
  region           avg_median_mort avg_q1 avg_q3 spread
  <chr>                      <dbl>  <dbl>  <dbl>  <dbl>
1 Vestland                   0.796  0.346  2.07   1.73 
2 Agder & Rogaland           0.716  0.312  1.66   1.35 
3 Møre og Romsdal            0.706  0.337  1.46   1.13 
4 Trøndelag                  0.661  0.309  1.32   1.01 
5 Finnmark                   0.550  0.286  1.09   0.804
6 Troms                      0.487  0.255  0.967  0.711
7 Nordland                   0.427  0.220  0.847  0.627
Important: Regional note

The mortality dataset uses combined regions for some counties (e.g., “Agder & Rogaland”) that differ slightly from the losses dataset’s individual county reporting. This is because mortality rates are computed per-farm within production zones, while absolute losses are tallied per administrative county.

Mortality distributions over time

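Let’s look at how monthly median mortality rates are distributed within each year, county by county. Each ridge is built from the twelve monthly medians for one county-year, so it shows the month-to-month spread of the typical farm’s mortality rather than the spread across farms. This is where ridgeline plots shine: they reveal whether a whole distribution is shifting or whether the change is concentrated in the tails.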

# Reshape mortality data: each row gives us median, q1, q3 for a county-month
# We'll use the median values for ridgelines across months, grouped by year and county

county_monthly <- mortality %>%
  filter(geo_group == "county", species == "salmon") %>%
  mutate(
    year = factor(year(date)),
    month = month(date)
  )

cat(sprintf("county_monthly: %d rows, %d cols\n", nrow(county_monthly), ncol(county_monthly)))
county_monthly: 504 rows, 9 cols
stopifnot("county_monthly is empty" = nrow(county_monthly) > 0)

# Order counties by overall median mortality (highest at top)
county_order <- county_monthly %>%
  group_by(region) %>%
  summarise(avg_mort = mean(median, na.rm = TRUE), .groups = "drop") %>%
  arrange(avg_mort) %>%
  pull(region)

county_monthly <- county_monthly %>%
  mutate(region = factor(region, levels = county_order))

# Palette check
palette_log <- read.csv(here::here("posts", "palette-log.csv"))
cat("Already used palettes:\n")
Already used palettes:
print(palette_log$palette)
 [1] "hardcoded (red/blue binary)"     "hardcoded (clinical_palette)"   
 [3] "default_jco"                     "hardcoded (outcome_colors)"     
 [5] "hardcoded (franchise colors)"    "hardcoded (palette_palms)"      
 [7] "hardcoded (Amazon brand colors)" "hardcoded (inline red/blue)"    
 [9] "hardcoded (Olympic gradient)"    "hardcoded (city colors)"        
[11] "Hiroshige"                       "Starfish"                       
[13] "vik"                             "Juarez"                         
[15] "Zissou1"                         "Vivid"                          
[17] "Alacena"                         "lajolla"                        
[19] "berlin"                          "Redon"                          
[21] "milkmaid"                        "Bold"                           
[23] "PonyoMedium"                     "VanGogh1"                       
[25] "Arches"                          "aurora"                         
[27] "bamako"                          "bright"                         
[29] "samarqand"                       "Hokusai3"                       
[31] "Klimt"                           "Austria"                        
[33] "MarnieMedium1"                   "Kandinsky"                      
[35] "lapaz"                           "Hokusai2"                       
[37] "vapoRwave"                       "Blue-Red 3"                     
[39] "PonyoLight"                      "J_M_W_Turner"                   
[41] "muted"                           "bamako"                         
[43] "Atentado"                        "Acadia"                         
p <- ggplot2::ggplot(
  county_monthly,
  ggplot2::aes(x = median, y = region, fill = region)
) +
  ggridges::geom_density_ridges(
    scale = 1.4,
    alpha = 0.85,
    bandwidth = 0.08,
    rel_min_height = 0.01,
    color = "white",
    linewidth = 0.3
  ) +
  paletteer::scale_fill_paletteer_d("PNWColors::Starfish") +
  ggplot2::facet_wrap(~ year, nrow = 1) +
  ggplot2::scale_x_continuous(
    limits = c(0, 3),
    breaks = seq(0, 3, 0.5),
    labels = function(x) paste0(x, "%")
  ) +
  ggplot2::labs(
    title = "Where Norway's Farmed Salmon Die",
    subtitle = "Monthly median mortality rate distributions by county, 2020–2025",
    x = "Monthly median mortality rate",
    y = NULL,
    caption = "Source: Norwegian Veterinary Institute (Laksetap) | TidyTuesday 2026-03-17"
  ) +
  ggplot2::theme_minimal(base_size = 13) +
  ggplot2::theme(
    legend.position = "none",
    plot.title = ggtext::element_markdown(face = "bold", size = 18, margin = ggplot2::margin(b = 4)),
    plot.subtitle = ggplot2::element_text(color = "grey40", size = 12, margin = ggplot2::margin(b = 12)),
    plot.caption = ggplot2::element_text(color = "grey60", size = 9, hjust = 0),
    strip.text = ggplot2::element_text(face = "bold", size = 12),
    panel.grid.major.y = ggplot2::element_blank(),
    panel.grid.minor = ggplot2::element_blank(),
    axis.text.y = ggplot2::element_text(size = 10),
    plot.margin = ggplot2::margin(15, 15, 10, 10)
  )

p

Absolute losses by county

The mortality rate tells one story, but absolute numbers tell another. A small county with a high rate may lose fewer fish than a massive production hub with a moderate rate.

county_losses <- losses %>%
  filter(geo_group == "county", species == "salmon") %>%
  mutate(
    year = year(date),
    # Combine Agder + Rogaland to match mortality data's grouping
    region = case_when(
      region %in% c("Agder", "Rogaland") ~ "Agder & Rogaland",
      TRUE ~ region
    )
  ) %>%
  # Drop Akershus (0 across all months)
  filter(region != "Akershus")

# Verify counties
cat("Counties in losses data (after combining):\n")
Counties in losses data (after combining):
print(sort(unique(county_losses$region)))
[1] "Agder & Rogaland" "Finnmark"         "Møre og Romsdal"  "Nordland"        
[5] "Troms"            "Trøndelag"        "Vestland"        
yearly_county <- county_losses %>%
  group_by(region, year) %>%
  summarise(
    total_dead = sum(dead, na.rm = TRUE),
    total_escaped = sum(escaped, na.rm = TRUE),
    total_discarded = sum(discarded, na.rm = TRUE),
    .groups = "drop"
  )

cat(sprintf("\nyearly_county: %d rows\n", nrow(yearly_county)))

yearly_county: 42 rows
stopifnot("yearly_county is empty" = nrow(yearly_county) > 0)

# Overall totals
yearly_county %>%
  group_by(region) %>%
  summarise(
    total_dead_millions = round(sum(total_dead) / 1e6, 1),
    .groups = "drop"
  ) %>%
  arrange(desc(total_dead_millions))
# A tibble: 7 × 2
  region           total_dead_millions
  <chr>                          <dbl>
1 Vestland                        83  
2 Trøndelag                       62.7
3 Nordland                        54.2
4 Troms                           40.9
5 Møre og Romsdal                 35.6
6 Agder & Rogaland                31.9
7 Finnmark                        30  
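To make the rate-versus-volume contrast concrete, the two county tables printed above can be placed side by side. The sketch below re-enters their values by hand in base R (so it runs without the dataset) and ranks each county both ways. The names county, rate_rank, and death_rank are introduced here for illustration only.

```r
# Values copied from the county_avg and total_dead_millions tables above
county <- data.frame(
  region = c("Vestland", "Agder & Rogaland", "Møre og Romsdal", "Trøndelag",
             "Finnmark", "Troms", "Nordland"),
  avg_median_mort     = c(0.796, 0.716, 0.706, 0.661, 0.550, 0.487, 0.427),
  total_dead_millions = c(83.0, 31.9, 35.6, 62.7, 30.0, 40.9, 54.2)
)

# Rank 1 = highest rate / most deaths
county$rate_rank  <- rank(-county$avg_median_mort)
county$death_rank <- rank(-county$total_dead_millions)

county[order(county$death_rank), ]

# Six-year salmon deaths summed across counties, in millions
sum(county$total_dead_millions)
```

Nordland has the lowest average rate of the seven counties but the third-highest death count, while Agder & Rogaland has the second-highest rate and the second-lowest count.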

The gap between best and worst

# Show the IQR (Q1 to Q3) spread by county over time
# This captures farm-level inequality within regions

county_spread <- mortality %>%
  filter(geo_group == "county", species == "salmon") %>%
  mutate(
    year = year(date),
    iqr = q3 - q1
  ) %>%
  group_by(region, year) %>%
  summarise(
    avg_median = mean(median, na.rm = TRUE),
    avg_q1 = mean(q1, na.rm = TRUE),
    avg_q3 = mean(q3, na.rm = TRUE),
    avg_iqr = mean(iqr, na.rm = TRUE),
    .groups = "drop"
  )

cat(sprintf("county_spread: %d rows\n", nrow(county_spread)))
county_spread: 42 rows
stopifnot("county_spread is empty" = nrow(county_spread) > 0)

# Sanity check
if (length(unique(county_spread$avg_iqr)) == 1) {
  warning("All IQR values are identical — check grouping logic")
}

p3 <- ggplot2::ggplot(
  county_spread,
  ggplot2::aes(x = year, y = avg_median, ymin = avg_q1, ymax = avg_q3,
               fill = region, color = region)
) +
  ggplot2::geom_ribbon(alpha = 0.2, color = NA) +
  ggplot2::geom_line(linewidth = 1) +
  ggplot2::geom_point(size = 1.5) +
  ggplot2::facet_wrap(~ region, ncol = 2) +
  paletteer::scale_fill_paletteer_d("PNWColors::Starfish") +
  paletteer::scale_color_paletteer_d("PNWColors::Starfish") +
  ggplot2::scale_y_continuous(labels = function(x) paste0(x, "%")) +
  ggplot2::scale_x_continuous(breaks = c(2020, 2022, 2024)) +
  ggplot2::labs(
    title = "Mortality Rate Spread Across Norwegian Counties",
    subtitle = "Ribbon shows Q1–Q3 range of farm-level mortality; line is the median.\nWider ribbons = more inequality between best and worst farms in that county.",
    x = NULL,
    y = "Monthly mortality rate",
    caption = "Source: Norwegian Veterinary Institute (Laksetap) | TidyTuesday 2026-03-17"
  ) +
  ggplot2::theme_minimal(base_size = 12) +
  ggplot2::theme(
    legend.position = "none",
    plot.title = ggtext::element_markdown(face = "bold", size = 16),
    plot.subtitle = ggplot2::element_text(color = "grey40", size = 10, margin = ggplot2::margin(b = 10)),
    plot.caption = ggplot2::element_text(color = "grey60", size = 9, hjust = 0),
    strip.text = ggplot2::element_text(face = "bold", size = 11),
    panel.grid.minor = ggplot2::element_blank()
  )

p3

Note: Reading the spread

A county where the IQR ribbon is narrow has farms performing similarly — good or bad, they’re consistent. A wide ribbon means some farms in that county are doing well while others are hemorrhaging fish. Policy interventions might focus differently: narrow-and-high counties need systemic change, while wide-ribbon counties need to bring their worst performers up to the level of their best.
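A rough way to sort counties into these groups is to split each measure at its cross-county median. The sketch below does that with the six-year county averages printed in the ranking table earlier, re-entered by hand. The median split is an arbitrary threshold chosen for illustration, not something the source data defines, and county_summary is a hypothetical name.

```r
# Six-year averages copied from the county ranking table
county_summary <- data.frame(
  region = c("Vestland", "Agder & Rogaland", "Møre og Romsdal", "Trøndelag",
             "Finnmark", "Troms", "Nordland"),
  avg_median_mort = c(0.796, 0.716, 0.706, 0.661, 0.550, 0.487, 0.427),
  spread          = c(1.73, 1.35, 1.13, 1.01, 0.804, 0.711, 0.627)
)

# Hypothetical cut-offs: above the cross-county median counts as "high" / "wide"
county_summary$high_rate <-
  county_summary$avg_median_mort > median(county_summary$avg_median_mort)
county_summary$wide_spread <-
  county_summary$spread > median(county_summary$spread)

# Cross-tabulate the two flags
table(high_rate = county_summary$high_rate,
      wide_spread = county_summary$wide_spread)

# Counties that are high-rate but narrow-spread under this split
county_summary$region[county_summary$high_rate & !county_summary$wide_spread]
```

With these averages and this split, no county lands in the narrow-and-high group: the three counties with above-median rates (Vestland, Agder & Rogaland, Møre og Romsdal) are also the three with above-median spreads. A year-by-year breakdown could look different.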

Salmon vs. rainbow trout

# Country-level comparison
species_compare <- mortality %>%
  filter(geo_group == "country") %>%
  mutate(year = year(date)) %>%
  group_by(species, year) %>%
  summarise(
    avg_median = round(mean(median, na.rm = TRUE), 3),
    .groups = "drop"
  ) %>%
  pivot_wider(names_from = species, values_from = avg_median)

cat("Annual median mortality rates (country-level):\n")
Annual median mortality rates (country-level):
species_compare
# A tibble: 6 × 3
   year rainbowtrout salmon
  <dbl>        <dbl>  <dbl>
1  2020        0.673  0.507
2  2021        0.716  0.57 
3  2022        0.781  0.59 
4  2023        0.607  0.652
5  2024        0.6    0.576
6  2025        0.696  0.517

Rainbow trout shows a higher median mortality rate than salmon in five of the six years; 2023 is the exception, when salmon (0.652) exceeded trout (0.607). This is despite trout being a much smaller share of Norwegian aquaculture production. The trout industry may face different biological pressures or have less operational investment in mortality reduction.
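How often trout exceeds salmon can be checked by differencing the two columns. A small base-R sketch with the values re-entered from the table above (species_tbl and gap are names introduced here for illustration):

```r
# Annual country-level averages copied from the table above
species_tbl <- data.frame(
  year         = 2020:2025,
  rainbowtrout = c(0.673, 0.716, 0.781, 0.607, 0.600, 0.696),
  salmon       = c(0.507, 0.570, 0.590, 0.652, 0.576, 0.517)
)

# Positive gap = trout higher than salmon that year
species_tbl$gap <- species_tbl$rainbowtrout - species_tbl$salmon

sum(species_tbl$gap > 0)                # 5 years with trout higher
species_tbl$year[species_tbl$gap <= 0]  # 2023: salmon equal or higher
```

The gap is widest in 2022 (about 0.19 percentage points) and narrowest in 2024 (about 0.02).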

Final thoughts and takeaways

Vestland is the epicenter. With 83 million salmon deaths over six years and the highest average mortality rate (0.80%), Vestland stands apart. It’s also Norway’s largest salmon-producing county, so some of this is scale — but the rate being highest suggests structural challenges beyond just volume.

2023 was the turning point — maybe. National salmon deaths peaked at 62.8 million in 2023 and declined in 2024–2025. Several counties show similar patterns. Whether this reflects genuine policy impact from Norway’s aquaculture reforms or just cyclical variation will take more years to confirm.

The IQR tells the real story. Average mortality rates can mask enormous variation between farms within the same county. Vestland’s wide Q1–Q3 spread suggests that some farms there have cracked the code while others haven’t — a more targeted intervention strategy could focus on knowledge transfer rather than blanket regulation.

The North does better. Nordland and Troms have the lowest six-year average mortality rates and the narrowest spreads, with Finnmark next in line. Colder waters, lower sea lice pressure, or simply more space between farms may all contribute. As the industry considers Arctic expansion, these mortality advantages are part of the calculus.

338 million dead salmon in six years is a staggering number by any measure. Norway’s ambition to grow its aquaculture industry while reducing mortality is one of the more concrete sustainability challenges in global food production — and this data makes it clear where the hardest work remains.