# A tibble: 24 × 4
# Groups: bigregion, race [12]
bigregion race sex n
<fct> <fct> <fct> <int>
1 Northeast White Male 165
2 Northeast White Female 217
3 Northeast Black Male 24
4 Northeast Black Female 36
5 Northeast Other Male 15
6 Northeast Other Female 31
7 Midwest White Male 274
8 Midwest White Female 285
9 Midwest Black Male 43
10 Midwest Black Female 46
# ℹ 14 more rows
Equivalently, but the result is not grouped:
gss_sm |> count(bigregion, race, sex)
# A tibble: 24 × 4
bigregion race sex n
<fct> <fct> <fct> <int>
1 Northeast White Male 165
2 Northeast White Female 217
3 Northeast Black Male 24
4 Northeast Black Female 36
5 Northeast Other Male 15
6 Northeast Other Female 31
7 Midwest White Male 274
8 Midwest White Female 285
9 Midwest Black Male 43
10 Midwest Black Female 46
# ℹ 14 more rows
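The two printouts above differ only in the `# Groups:` header line. A minimal sketch of how to confirm that programmatically, assuming `dplyr` and `socviz` are installed; the object names `grouped` and `ungrouped` are illustrative and not part of the original example:

```r
library(dplyr)
library(socviz)

# group_by() + tally() drops the last grouping variable but keeps the rest
grouped <- gss_sm |>
  group_by(bigregion, race, sex) |>
  tally()

# count() on an ungrouped data frame returns an ungrouped result
ungrouped <- gss_sm |>
  count(bigregion, race, sex)

group_vars(grouped)    # grouping variables still attached to the result
group_vars(ungrouped)  # empty: no grouping variables
```

Whether the grouping is kept matters for any `mutate()` or `summarize()` that follows, because those verbs work within groups when groups are present.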
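Add a frequency column:
gss_sm |>
  group_by(bigregion, race, sex) |>
  tally() |>
  mutate(prop = n/sum(n))
# A tibble: 24 × 5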
# Groups: bigregion, race [12]
bigregion race sex n prop
<fct> <fct> <fct> <int> <dbl>
1 Northeast White Male 165 0.432
2 Northeast White Female 217 0.568
3 Northeast Black Male 24 0.4
4 Northeast Black Female 36 0.6
5 Northeast Other Male 15 0.326
6 Northeast Other Female 31 0.674
7 Midwest White Male 274 0.490
8 Midwest White Female 285 0.510
9 Midwest Black Male 43 0.483
10 Midwest Black Female 46 0.517
# ℹ 14 more rows
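Because this table is still grouped by `bigregion` and `race`, `sum(n)` is taken within each region-race pair, so each Male/Female pair of proportions sums to 1. As a small worked check, here is the arithmetic for the first two rows, using the counts copied from the table above (base R only; the vector name `n_ne_white` is illustrative):

```r
# Counts for Northeast, White, taken from rows 1 and 2 of the table above
n_ne_white <- c(Male = 165, Female = 217)

# Proportion within the group: each count divided by the group total (382)
prop <- n_ne_white / sum(n_ne_white)

round(prop, 3)  # Male 0.432, Female 0.568, matching the prop column
sum(prop)       # 1
```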
When the result is not grouped, what do you get as the proportions?
gss_sm |>
  count(bigregion, race, sex) |>
  mutate(prop = n/sum(n))
# A tibble: 24 × 5
bigregion race sex n prop
<fct> <fct> <fct> <int> <dbl>
1 Northeast White Male 165 0.0576
2 Northeast White Female 217 0.0757
3 Northeast Black Male 24 0.00837
4 Northeast Black Female 36 0.0126
5 Northeast Other Male 15 0.00523
6 Northeast Other Female 31 0.0108
7 Midwest White Male 274 0.0956
8 Midwest White Female 285 0.0994
9 Midwest Black Male 43 0.0150
10 Midwest Black Female 46 0.0160
# ℹ 14 more rows
Check your work by summing the rows.
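One way to run that check is sketched below, assuming `dplyr` and `socviz` are installed. The names `grouped_check`, `ungrouped_check`, and `total` are illustrative and not part of the original example.

```r
library(dplyr)
library(socviz)

# Grouped case: prop is computed within each bigregion-race pair,
# so summing prop within those groups should give 1 for every pair
grouped_check <- gss_sm |>
  group_by(bigregion, race, sex) |>
  tally() |>
  mutate(prop = n / sum(n)) |>
  summarize(total = sum(prop))

# Ungrouped case: prop is a share of the whole table,
# so the entire column should sum to 1
ungrouped_check <- gss_sm |>
  count(bigregion, race, sex) |>
  mutate(prop = n / sum(n)) |>
  summarize(total = sum(prop))

grouped_check
ungrouped_check
```

If the totals are not (approximately) 1 at the level you expected, the denominator was computed over a different grouping than you intended.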
Source Code
---
title: "Example 05: Tables and dplyr"
engine: knitr
---

```{r}
#| echo: false
knitr::opts_chunk$set(engine.opts = list(zsh = "-l"))
```

# Crosstabs

```{r}
library(tidyverse)
library(socviz)
gss_sm
```

Count up one variable:

```{r}
gss_sm |>
  group_by(bigregion) |>
  tally()
```

Cross-tabulate:

```{r}
gss_sm |>
  group_by(bigregion, religion) |>
  tally()
```

Notice the difference:

```{r}
gss_sm |>
  group_by(religion, bigregion) |>
  tally()
```

Seems similar, but now try:

```{r}
# Religion within bigregion;
gss_sm |>
  group_by(bigregion, religion) |>
  tally() |>
  summarize(group_total = sum(n)) |>
  mutate(prop = group_total/sum(group_total))
```

```{r}
# Bigregion within religion
gss_sm |>
  group_by(religion, bigregion) |>
  tally() |>
  summarize(group_total = sum(n)) |>
  mutate(prop = group_total/sum(group_total))
```

To get our table to look like a conventional nxm crosstab, pivot it:

```{r}
gss_sm |>
  group_by(bigregion, religion) |>
  tally() |>
  pivot_wider(names_from = religion, values_from = n)
```

```{r}
# Bigregion within religion
gss_sm |>
  group_by(religion, bigregion) |>
  tally() |>
  summarize(group_total = sum(n)) |>
  mutate(prop = group_total/sum(group_total))
```

Grouped and counted by region within religion:

```{r}
gss_sm |>
  group_by(religion, bigregion) |>
  tally() |>
  pivot_wider(names_from = bigregion, values_from = n)
```

As many dimensions as we wish:

```{r}
gss_sm |>
  group_by(bigregion, race, sex) |>
  tally()
```

Equivalently, but the result is not grouped:

```{r}
gss_sm |>
  count(bigregion, race, sex)
```

Add a frequency column:

```{r}
gss_sm |>
  group_by(bigregion, race, sex) |>
  tally() |>
  mutate(prop = n/sum(n))
```

When the result is not grouped, what do you get as the proportions?

```{r}
gss_sm |>
  count(bigregion, race, sex) |>
  mutate(prop = n/sum(n))
```

Check your work by summing the rows.