This post explores Brazil’s open CNPJ (Cadastro Nacional da Pessoa Jurídica) records, published by the Brazilian Ministry of Finance. The raw company records were cleaned and enriched with lookup tables (legal nature, owner qualification, and company size), then filtered to retain firms above a share-capital threshold.
Which legal nature categories have the highest total and average capital stock?
How does company size relate to capital stock distribution?
Which owner qualification groups dominate high-capital companies?
Loading necessary packages
My handy booster pack lets me install (if needed) and load my usual favorite packages, along with some helpful functions.
raw <- tidytuesdayR::tt_load('2026-01-27')
companies <- raw$companies
Exploratory Data Analysis
The my_skim() function is a modified version of skimr::skim(). It returns the number of missing data points (cells that are NA) and its complement (the number of rows that are not NA), along with the count, minimum, 25th percentile, median, 75th percentile, maximum, mean, geometric mean, and standard deviation. It also generates a little ASCII histogram. Neat!
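The source of my_skim() is not shown here, so the following is only a rough sketch of how the same statistics could be computed by hand for a single numeric column. The helper name `skim_sketch` and the toy vector are hypothetical, and the ASCII histogram is left out.

```r
library(dplyr)

# Hypothetical helper: NOT the author's my_skim(), just the same
# summary statistics computed by hand for one numeric vector.
skim_sketch <- function(x) {
  x_ok <- x[!is.na(x)]
  tibble(
    n_missing  = sum(is.na(x)),
    n_complete = length(x_ok),
    min        = min(x_ok),
    p25        = unname(quantile(x_ok, 0.25)),
    median     = median(x_ok),
    p75        = unname(quantile(x_ok, 0.75)),
    max        = max(x_ok),
    mean       = mean(x_ok),
    # The geometric mean is only defined for strictly positive values
    geo_mean   = if (all(x_ok > 0)) exp(mean(log(x_ok))) else NA_real_,
    sd         = sd(x_ok)
  )
}

# Toy capital stock values (made up), with one missing entry
toy_capital <- c(1000, 10000, 100000, NA, 1e9)
skim_result <- skim_sketch(toy_capital)
skim_result
```

For skewed data like capital stock, the geometric mean sits much closer to the median than the arithmetic mean does, which is one reason it is handy to have in the summary.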
# A tibble: 13 × 2
owner_qualification n
<chr> <int>
1 Managing Partner / Partner-Administrator 107027
2 Administrator / Manager 15236
3 Entrepreneur / Business Owner 15201
4 Director / Officer 1634
5 President / Chair 1343
6 Beneficial Owner (individual) resident or domiciled in Brazil 442
7 Judicial Administrator (Court-appointed) 302
8 Sole Owner of an Individual Real Estate Company 50
9 Ostensible Partner (Managing partner in a silent partnership) 32
10 Liquidator 29
11 Executor / Estate Administrator 22
12 Attorney-in-fact / Legal Representative (Power of Attorney) 13
13 Intervenor / Court-appointed Administrator 1
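A frequency table like the one above can be produced with dplyr::count(). The sketch below uses a small made-up tibble in place of `companies`, and the `share` column is an addition for illustration rather than something in the original output.

```r
library(dplyr)

# Toy stand-in for `companies` (made-up rows); the real data comes from tt_load()
toy_companies <- tibble(
  owner_qualification = c(
    "Managing Partner / Partner-Administrator",
    "Managing Partner / Partner-Administrator",
    "Administrator / Manager",
    "Director / Officer",
    "Managing Partner / Partner-Administrator"
  )
)

# Count rows per category, largest first, and add each category's share
qualification_counts <- toy_companies %>%
  count(owner_qualification, sort = TRUE) %>%
  mutate(share = n / sum(n))

qualification_counts
```

Applied to the counts in the table above, the top category accounts for 107,027 of 141,332 rows, or roughly 76%.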
Capital Stock Analysis
Distribution by Company Size
The capital stock distribution is likely highly right-skewed — a few massive firms alongside many small ones. Let’s see how the size categories compare.
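One quick way to check for right skew within each size category is to compare the mean with the median: a mean far above the median points to a few very large values. This is a minimal sketch on made-up data; the size labels and capital values are placeholders, not values from the registry.

```r
library(dplyr)

# Toy stand-in for `companies`; labels and values are invented
toy_companies <- tibble(
  company_size  = c("micro", "micro", "micro", "small", "small",
                    "other", "other", "other"),
  capital_stock = c(1e3, 5e3, 2e4, 5e4, 2e5, 1e5, 1e6, 1e9)
)

size_summary <- toy_companies %>%
  filter(!is.na(company_size), !is.na(capital_stock), capital_stock > 0) %>%
  group_by(company_size) %>%
  summarize(
    n              = n(),
    median_capital = median(capital_stock),
    mean_capital   = mean(capital_stock),
    # A ratio well above 1 suggests a right-skewed distribution
    mean_to_median = mean_capital / median_capital,
    .groups = "drop"
  )

size_summary
```

In the toy data, the single 1e9 firm in the "other" group pulls that group's mean to hundreds of times its median, while the "small" group has a ratio of exactly 1.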
# A tibble: 10 × 2
owner_qualification n
<chr> <int>
1 Managing Partner / Partner-Administrator 6874
2 Administrator / Manager 4256
3 Entrepreneur / Business Owner 1289
4 Director / Officer 966
5 President / Chair 568
6 Judicial Administrator (Court-appointed) 128
7 Beneficial Owner (individual) resident or domiciled in Brazil 26
8 Liquidator 12
9 Ostensible Partner (Managing partner in a silent partnership) 6
10 Executor / Estate Administrator 5
Visualizing Capital Stock Distribution
The hero plot shows the log-scaled capital stock distribution across company sizes, with dashed lines and labels marking each size category’s median. The long right tail reflects the concentration of capital in a few large entities.
# Brazilian flag-inspired palette
brazil_cols <- c(
  "#009739", # green
  "#FEDD00", # yellow
  "#002776"  # blue
)

plot_data <- companies %>%
  filter(!is.na(company_size), !is.na(capital_stock), capital_stock > 0)

# Calculate medians for annotation
size_medians <- plot_data %>%
  group_by(company_size) %>%
  summarize(
    med = median(capital_stock),
    n = n(),
    .groups = "drop"
  )

ggplot(plot_data, aes(x = capital_stock, fill = company_size)) +
  geom_density(alpha = 0.6, color = NA) +
  geom_vline(
    data = size_medians,
    aes(xintercept = med, color = company_size),
    linetype = "dashed",
    linewidth = 0.8
  ) +
  geom_text(
    data = size_medians,
    aes(
      x = med,
      y = Inf,
      label = paste0("Median: R$", scales::comma(med)),
      color = company_size
    ),
    vjust = 1.5,
    hjust = -0.1,
    size = 3.5,
    fontface = "bold",
    show.legend = FALSE
  ) +
  scale_x_log10(
    labels = scales::label_dollar(prefix = "R$", big.mark = ","),
    breaks = 10^(seq(0, 12, by = 2))
  ) +
  scale_fill_manual(values = brazil_cols, name = "Company Size") +
  scale_color_manual(values = brazil_cols, name = "Company Size") +
  labs(
    title = "Capital Stock Distribution of Brazilian Companies",
    subtitle = "Log-scaled density by company size category: CNPJ open registry data",
    x = "Capital Stock (BRL, log scale)",
    y = "Density",
    caption = "Source: TidyTuesday 2026-01-27 | Brazilian Ministry of Finance via dados.gov.br"
  ) +
  theme_minimal(base_size = 13) +
  theme(
    plot.title = element_text(face = "bold", size = 18, color = "#002776"),
    plot.subtitle = element_text(size = 12, color = "#555555"),
    plot.caption = element_text(size = 9, color = "#888888"),
    legend.position = "bottom",
    panel.grid.minor = element_blank()
  )
Final thoughts and takeaways
Brazil’s open CNPJ registry is a remarkable transparency tool; few countries publish their corporate registrations this openly. The capital stock distributions show the expected heavy right skew: a large mass of micro and small enterprises with modest capitalization, and a thin tail of massive corporate entities that dominate total capital.
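To put a number on that concentration, one option is the share of total capital held by the top fraction of firms. The helper `top_share` below is hypothetical and the 100 toy firms are invented, so treat this as a sketch of the calculation rather than a result from the registry.

```r
# Hypothetical helper: share of total capital held by the largest firms.
# `top_frac` is the fraction of firms to include (0.01 = top 1%).
top_share <- function(x, top_frac = 0.01) {
  x <- sort(x[!is.na(x) & x > 0], decreasing = TRUE)
  k <- max(1, ceiling(length(x) * top_frac))
  sum(x[seq_len(k)]) / sum(x)
}

# Toy data: 95 small firms, 4 mid-sized firms, 1 very large firm
toy_capital <- c(rep(1e4, 95), rep(1e6, 4), 1e9)

share_top1 <- top_share(toy_capital, 0.01)
share_top1
```

In this toy example the single largest firm holds more than 99% of total capital, even though it is only 1% of the firms. Running the same helper on `companies$capital_stock` would show how concentrated the real data is.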
The legal nature breakdown is particularly interesting for understanding Brazil’s business landscape. Limited liability companies (Ltda.) vastly outnumber other forms, which tracks with their flexibility and lower compliance burden compared to corporations (S.A.). But when you look at total capital stock, the picture inverts — a small number of S.A. entities command disproportionate capital.
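The contrast between firm counts and total capital can be checked by summarizing both per legal nature. This sketch assumes a `legal_nature` column alongside `capital_stock`; the short labels "Ltda." and "S.A." and all values are made up for illustration, and the real category names in the data may differ.

```r
library(dplyr)

# Toy stand-in for `companies`; labels and values are invented
toy_companies <- tibble(
  legal_nature  = c(rep("Ltda.", 6), rep("S.A.", 2)),
  capital_stock = c(1e4, 2e4, 5e4, 1e5, 2e5, 5e5, 5e7, 2e8)
)

legal_summary <- toy_companies %>%
  group_by(legal_nature) %>%
  summarize(
    n_firms       = n(),
    total_capital = sum(capital_stock),
    mean_capital  = mean(capital_stock),
    .groups = "drop"
  ) %>%
  # Compare each category's share of firms with its share of capital
  mutate(
    share_of_firms   = n_firms / sum(n_firms),
    share_of_capital = total_capital / sum(total_capital)
  ) %>%
  arrange(desc(total_capital))

legal_summary
```

Here "Ltda." makes up 75% of the toy firms but under 1% of the capital, which is the kind of inversion described above.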
Note
Capital stock (capital social) in Brazil’s registry represents the declared investment by owners at incorporation or amendment. It’s a useful proxy for firm size but doesn’t capture retained earnings, debt, or market value — so the largest firms by capital stock aren’t necessarily the largest by revenue.
The owner qualification data adds another dimension: among the highest-capitalized firms, we see a concentration of specific professional qualifications that reflect Brazil’s regulatory requirements for certain industries.