Exploring the growing conditions of 146 edible plant species from the GROW Observatory — which plants thrive where, and what do sunlight and temperature preferences reveal about cultivation strategies?
The Edible Plant Database (EPD) contains information on 146 edible plant species, derived from the GROW Observatory, a European Citizen Science initiative focused on food cultivation, soil monitoring, and land observation. The dataset provides growing conditions and harvest/germination timelines to address questions like “What can I plant now” and “What will yield crops in the future.”
Do plants requiring more sunlight also require higher temperatures?
Which cultivation classes demand the most water?
Loading necessary packages
My handy booster pack lets me install (if needed) and load my usual favorite packages, along with some helpful functions.
raw <- tidytuesdayR::tt_load('2026-02-03')
plants <- raw$edible_plants
Exploratory Data Analysis
The my_skim() function is a modified version of skimr::skim() that returns the number of missing data points (cells that are NA) and its complement (the number of rows that are not NA), along with the count, minimum, 25th percentile, median, 75th percentile, maximum, mean, geometric mean, and standard deviation. It also generates a little ASCII histogram. Neat!
Edible Plants
I’ll drop the free-text columns (description, requirements, nutritional_info, sensitivities) since they’re not useful for quantitative analysis, and focus on the structured growing condition fields.
Let’s also get a quick look at the categorical columns to understand the levels we’re working with:
plants %>% count(sunlight, sort = TRUE)
# A tibble: 6 × 2
sunlight n
<chr> <int>
1 Full sun 87
2 Full sun/partial shade 44
3 Partial shade 5
4 Full sun/partial shade/full shade 2
5 full sun/partial shade/ full shade 1
6 partial shade 1
plants %>% count(water, sort = TRUE)
# A tibble: 8 × 2
water n
<chr> <int>
1 Medium 93
2 High 24
3 Low 18
4 Very High 1
5 Very Low 1
6 Very low 1
7 high 1
8 very high 1
plants %>% count(temperature_class, sort = TRUE)
# A tibble: 6 × 2
temperature_class n
<chr> <int>
1 Hardy 67
2 Tender 37
3 Very hardy 21
4 Half hardy 12
5 Very tender 2
6 Very hard 1
plants %>% count(season, sort = TRUE)
# A tibble: 12 × 2
season n
<chr> <int>
1 <NA> 73
2 Perennial 37
3 Annual 12
4 biennial 8
5 biennial, grown as annual 2
6 perennial 2
7 Annual/perannial 1
8 Biennial, grown as an annual 1
9 Perrenial 1
10 Perrenial evergreen 1
11 Semi-evergreen perrenial 1
12 Shrub 1
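The counts above show the same category spelled several ways ("Very Low" / "Very low", "high" / "High", "Perrenial" / "Perennial"), so any grouping or lookup keyed on these labels will split what should be one level. Below is a minimal base R sketch of one way to standardise them. `clean_level()` is a hypothetical helper, not something from the original analysis, and treating "Very hard" as a typo for "Very hardy" is an assumption.

```r
# Hypothetical helper: standardise the messy category labels seen in count().
clean_level <- function(x) {
  x <- trimws(tolower(x))                            # "high" and "High" become one level
  x <- gsub("\\s*/\\s*", "/", x)                     # drop stray spaces around slashes
  x <- gsub("\\s+", " ", x)                          # collapse repeated spaces
  x <- gsub("perrenial|perannial", "perennial", x)   # misspellings seen in `season`
  x <- sub("^very hard$", "very hardy", x)           # ASSUMPTION: typo for "very hardy"
  sub("^([a-z])", "\\U\\1", x, perl = TRUE)          # sentence case; NA stays NA
}

# Small demonstration on labels copied from the output above
messy <- c("full sun/partial shade/ full shade", "Very Low", "Very low",
           "Perrenial evergreen", "Very hard", NA)
cleaned <- clean_level(messy)
unique(cleaned)
```

On the real data this could be applied with something like `plants %>% mutate(across(c(sunlight, water, temperature_class, season), clean_level))`. It deliberately leaves judgment calls alone, such as whether "Biennial, grown as an annual" and "biennial, grown as annual" should be merged.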
Growing Condition Analysis
The two questions from the TidyTuesday repo are closely related — both are about how growing requirements cluster together. Let’s tackle them systematically.
Sunlight vs. Temperature Preferences
Do sun-loving plants also prefer warmer conditions? Let’s look at the cross-tabulation of sunlight requirements and temperature class.
# A tibble: 12 × 5
common_name cultivation energy sunlight water
<chr> <chr> <dbl> <chr> <chr>
1 Beans (Broad) Legume 88 Full sun/partial shade/full shade Very…
2 Pea Legume 80 Full sun Very…
3 Kale Brassica 50 Full sun Low
4 Brussels Sprouts Brassica 35 Full sun Medi…
5 Broccoli Brassica 34 Full sun/partial shade Medi…
6 Cauliflower Brassica 31 Full sun Medi…
7 Bell Pepper Solanaceae 31 Full sun/partial shade Medi…
8 Beans (Runner) Legume 27 Full sun Medi…
9 Beans (French) Legume 27 Full sun Medi…
10 Cabbage (Spring) Legume 25 Full sun High
11 Beetroot Umbelliferae 0 Full sun/partial shade Medi…
12 Endive Miscellaneous 0 Full sun/partial shade Medi…
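For the sunlight versus temperature question, a cross-tabulation with row shares is often easier to read than raw counts, because the sunlight groups differ a lot in size (87 "Full sun" species versus 5 "Partial shade" in the counts earlier). The sketch below uses base R on a handful of made-up rows so that it runs on its own; the toy values are illustrative only and are not results from the database.

```r
# Toy rows for illustration only; these are NOT values from the Edible Plant Database.
toy <- data.frame(
  sunlight = c("Full sun", "Full sun", "Full sun",
               "Partial shade", "Partial shade", "Full sun/partial shade"),
  temperature_class = c("Tender", "Tender", "Hardy",
                        "Hardy", "Very hardy", "Hardy")
)

# Raw counts: one row per sunlight level, one column per temperature class
xtab <- table(toy$sunlight, toy$temperature_class)

# Row-wise shares: within each sunlight level, how are the temperature classes split?
row_share <- prop.table(xtab, margin = 1)
round(row_share, 2)
```

Swapping `toy` for `plants` gives the real table. If sun-loving plants do prefer warmer conditions, the "Full sun" row should carry a visibly larger share of "Tender" and "Very tender" than the shade rows.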
Visualizing Growing Conditions
The hero plot pairs the two suggested questions into a single multi-panel layout: sunlight × temperature association on the left, and water demand by cultivation class on the right.
# Define a botanical color palette
garden_cols <- c(
  "Full sun" = "#E8A838",
  "Full sun / partial shade" = "#C4A24D",
  "Partial shade" = "#7BA05B",
  "Full shade" = "#2D5F2D",
  "Partial shade / full shade" = "#4A7A4A"
)
water_cols <- c("Low" = "#D4A76A", "Medium" = "#5B8C5A", "High" = "#2E6B8A")

# Panel 1: Sunlight vs Temperature heatmap
p1_data <- plants %>%
  filter(!is.na(sunlight), !is.na(temperature_class)) %>%
  count(sunlight, temperature_class)

p1 <- ggplot(p1_data, aes(x = temperature_class, y = sunlight, fill = n)) +
  geom_tile(color = "white", linewidth = 1.5) +
  geom_text(aes(label = n), size = 5, fontface = "bold", color = "white") +
  scale_fill_gradient(low = "#A8D5A2", high = "#1B5E20", name = "Count") +
  labs(
    title = "Do sun-loving plants prefer warmer conditions?",
    x = "Temperature Class",
    y = "Sunlight Requirement"
  ) +
  theme_minimal(base_size = 13) +
  theme(
    plot.title = element_text(face = "bold", size = 14),
    panel.grid = element_blank(),
    legend.position = "bottom"
  )

# Panel 2: Water demand by cultivation class
p2_data <- plants %>%
  filter(!is.na(water), !is.na(cultivation)) %>%
  count(cultivation, water) %>%
  group_by(cultivation) %>%
  mutate(pct = n / sum(n)) %>%
  ungroup()

p2 <- ggplot(
  p2_data,
  aes(x = reorder(cultivation, -n, sum), y = pct, fill = water)
) +
  geom_col(position = "fill", width = 0.7) +
  scale_fill_manual(values = water_cols, name = "Water Need") +
  scale_y_continuous(labels = scales::percent_format()) +
  labs(
    title = "Which cultivation classes demand the most water?",
    x = "Cultivation Class",
    y = "Proportion of Plants"
  ) +
  theme_minimal(base_size = 13) +
  theme(
    plot.title = element_text(face = "bold", size = 14),
    axis.text.x = element_text(angle = 45, hjust = 1),
    panel.grid.minor = element_blank(),
    legend.position = "bottom"
  )

# Combine with patchwork
combined <- p1 / p2 +
  plot_annotation(
    title = "Growing Conditions of Edible Plants",
    subtitle = "146 species from the GROW Observatory Edible Plant Database",
    caption = "Source: TidyTuesday 2026-02-03 | University of Dundee Edible Plant Database",
    theme = theme(
      plot.title = element_text(size = 18, face = "bold", color = "#2D5F2D"),
      plot.subtitle = element_text(size = 13, color = "#555555"),
      plot.caption = element_text(size = 9, color = "#888888")
    )
  )

combined
Final thoughts and takeaways
The Edible Plant Database offers a compact but revealing snapshot of how 146 food-producing species relate to their growing environments. The heatmap of sunlight versus temperature preferences shows that most edible plants cluster in the sunnier categories: 87 species are listed as “Full sun” and another 44 as “Full sun/partial shade”, while “Hardy” is the most common temperature class (67 species). That makes intuitive sense, as most food crops have been bred for productive, sun-drenched conditions rather than shade tolerance.
The water demand breakdown by cultivation class tells a complementary story. Root vegetables and legumes tend toward moderate water needs, while leafy greens and some fruiting crops lean heavier. This kind of profiling is exactly what the GROW Observatory aimed to support: giving citizen scientists and home gardeners a data-driven way to plan what to grow based on their local conditions.
Tip
If you’re planning a garden, the pH preference data is particularly actionable — most edible plants cluster in the 5.5–7.5 range, but there’s meaningful variation. Testing your soil pH before planting season can save a lot of heartache.
One limitation: many of the numeric fields (germination days, harvest days, temperature ranges) are stored as character strings with range notation (e.g., “10-14”), which limits direct quantitative analysis without parsing. A natural extension would be to extract those ranges into min/max numeric columns for more granular modeling.
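As a rough illustration of that extension, the base R sketch below pulls the numbers out of range strings and returns min/max columns. `parse_range()` is a hypothetical helper, and it assumes non-negative values: a leading minus sign (as in a sub-zero temperature) is indistinguishable from the range hyphen under this simple pattern and would need separate handling.

```r
# Hypothetical helper: split strings like "10-14" into numeric min and max.
# ASSUMPTION: values are non-negative, so "-" is always a range separator.
parse_range <- function(x) {
  x <- as.character(x)
  x[is.na(x)] <- ""                                  # missing values yield no numbers
  # Extract every run of digits, with an optional decimal part
  hits <- regmatches(x, gregexpr("[0-9]+(\\.[0-9]+)?", x))
  nums <- lapply(hits, as.numeric)
  data.frame(
    min = vapply(nums, function(v) if (length(v)) min(v) else NA_real_, numeric(1)),
    max = vapply(nums, function(v) if (length(v)) max(v) else NA_real_, numeric(1))
  )
}

# Single values come back with min equal to max; NA stays NA
ranges <- parse_range(c("10-14", "7", "21 - 30", NA))
ranges
```

The two columns can then be bound onto the data frame for whichever range field is being parsed, which opens the door to midpoint or range-width comparisons across cultivation classes.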