# A tibble: 650 × 1
constituency
<chr>
1 Aberavon
2 Aberconwy
3 Aberdeen North
4 Aberdeen South
5 Aberdeenshire West & Kincardine
6 Airdrie & Shotts
7 Aldershot
8 Aldridge-Brownhills
9 Altrincham & Sale West
10 Alyn & Deeside
# ℹ 640 more rows
Tally them up:
ukvote2019 |> distinct(constituency) |> tally()
# A tibble: 1 × 1
n
<int>
1 650
That is, there are 650 electoral constituencies in Great Britain and Northern Ireland.
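A quicker way of establishing how many constituencies there are is count(). It returns one row per constituency, with n giving the number of candidates standing there, so the number of rows in the result is the number of constituencies:
ukvote2019 |> count(constituency)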
# A tibble: 650 × 2
constituency n
<chr> <int>
1 Aberavon 7
2 Aberconwy 4
3 Aberdeen North 6
4 Aberdeen South 4
5 Aberdeenshire West & Kincardine 4
6 Airdrie & Shotts 5
7 Aldershot 4
8 Aldridge-Brownhills 5
9 Altrincham & Sale West 6
10 Alyn & Deeside 5
# ℹ 640 more rows
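The relationship between the two approaches above can be sketched on a small made-up tibble. The `toy` data and the object names here are hypothetical stand-ins for `ukvote2019`, and the `n_distinct()` variant is an extra option that the original task sheet does not use.

```r
library(dplyr)

# Hypothetical toy data: one row per candidate, as in ukvote2019
toy <- tibble(
  constituency = c("Aberavon", "Aberavon", "Aberconwy", "Aldershot", "Aldershot"),
  party_name   = c("Labour", "Green", "Conservative", "Conservative", "Green")
)

# distinct() then tally(): a one-row tibble with the number of unique constituencies
n_via_distinct <- toy |> distinct(constituency) |> tally()

# count(): one row per constituency; n is the number of candidates in each
per_constituency <- toy |> count(constituency)

# n_distinct() inside summarize(): the same total in a single step
n_via_summarize <- toy |> summarize(n = n_distinct(constituency))

n_via_distinct
per_constituency
n_via_summarize
```

With `count()`, the answer to "how many constituencies?" is the number of rows, not the `n` column.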
Which parties fielded the most candidates?
ukvote2019 |> count(party_name) |> arrange(desc(n))
# A tibble: 69 × 2
party_name n
<chr> <int>
1 Conservative 636
2 Labour 631
3 Liberal Democrat 611
4 Green 497
5 The Brexit Party 275
6 Independent 224
7 Scottish National Party 59
8 UKIP 44
9 Plaid Cymru 36
10 Christian Peoples Alliance 29
# ℹ 59 more rows
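As a side note, `count()` has a `sort` argument that should give the same ordering as the explicit `arrange(desc(n))` step. The sketch below uses a hypothetical `toy` tibble rather than the election data.

```r
library(dplyr)

# Hypothetical toy data: one row per candidate
toy <- tibble(
  party_name = c("Green", "Labour", "Labour", "Conservative", "Labour", "Green")
)

# Two-step version: count, then sort in descending order of n
by_arrange <- toy |> count(party_name) |> arrange(desc(n))

# One-step version: let count() do the sorting
by_sort <- toy |> count(party_name, sort = TRUE)

by_arrange
by_sort
```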
What are the top 5 parties by number of candidates?
ukvote2019 |> count(party_name) |> slice_max(order_by = n, n = 5)
# A tibble: 5 × 2
party_name n
<chr> <int>
1 Conservative 636
2 Labour 631
3 Liberal Democrat 611
4 Green 497
5 The Brexit Party 275
Bottom 5? Does this make sense?
ukvote2019 |> count(party_name) |> slice_min(order_by = n, n = 5)
# A tibble: 25 × 2
party_name n
<chr> <int>
1 Ashfield Independents 1
2 Best for Luton 1
3 Birkenhead Social Justice Party 1
4 British National Party 1
5 Burnley & Padiham Independent Party 1
6 Church of the Militant Elvis Party 1
7 Citizens Movement Party UK 1
8 CumbriaFirst 1
9 Heavy Woollen District Independents 1
10 Independent Network 1
# ℹ 15 more rows
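The result above has 25 rows even though five were requested. A likely explanation is that `slice_min()` keeps tied rows by default (`with_ties = TRUE`), and many parties fielded exactly one candidate. The sketch below illustrates the behavior on a hypothetical `party_counts` tibble, not the real data.

```r
library(dplyr)

# Hypothetical party counts: three parties are tied on a single candidate
party_counts <- tibble(
  party_name = c("Big", "Mid", "Tiny1", "Tiny2", "Tiny3"),
  n          = c(10L, 4L, 1L, 1L, 1L)
)

# Default behavior: all rows tied for the lowest values are kept,
# so asking for 2 rows can return more than 2
with_ties <- party_counts |> slice_min(order_by = n, n = 2)

# with_ties = FALSE returns exactly the requested number of rows,
# breaking ties by row order
no_ties <- party_counts |> slice_min(order_by = n, n = 2, with_ties = FALSE)

with_ties
no_ties
```

Which behavior is preferable depends on the question: dropping ties gives a fixed-size table, but the choice of which tied parties appear is then arbitrary.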
4. Filtering
Filtering is subsetting the rows according to a condition on one or more of the columns.
Show me all and only the Green party candidates.
ukvote2019 |> filter(party_name == "Green")
# A tibble: 497 × 13
cid constituency electorate party_name candidate votes vote_share_percent
<chr> <chr> <int> <chr> <chr> <int> <dbl>
1 W07000… Aberavon 50747 Green Giorgia … 450 1.4
2 S14000… Aberdeen No… 62489 Green Guy Inge… 880 2.4
3 S14000… Airdrie & S… 64008 Green Rosemary… 685 1.7
4 E14000… Aldershot 72617 Green Donna Wa… 1750 3.7
5 E14000… Aldridge-Br… 60138 Green Bill McC… 771 2
6 E14000… Altrincham … 73096 Green Geraldin… 1566 2.9
7 E14000… Amber Valley 69976 Green Lian Piz… 1388 3
8 E14000… Arundel & S… 81726 Green Isabel T… 2519 4.1
9 E14000… Ashfield 78204 Green Rose Woo… 674 1.4
10 E14000… Ashford 89550 Green Mandy Ro… 2638 4.4
# ℹ 487 more rows
# ℹ 6 more variables: vote_share_change <dbl>, total_votes_cast <int>,
# vrank <int>, turnout <dbl>, fname <chr>, lname <chr>
Show me all candidates named “Michael”.
ukvote2019 |> filter(fname == "Michael")
# A tibble: 25 × 13
cid constituency electorate party_name candidate votes vote_share_percent
<chr> <chr> <int> <chr> <chr> <int> <dbl>
1 E14000… Basildon So… 74441 Liberal D… Michael … 1957 4.3
2 N06000… Belfast Sou… 69984 Ulster Un… Michael … 1259 2.7
3 E14000… Blaydon 67853 The Brexi… Michael … 5833 12.8
4 E14000… Bosworth 81537 Liberal D… Michael … 9096 16.1
5 E14000… Bury South 75152 Independe… Michael … 277 0.6
6 E14000… Canterbury 80203 Independe… Michael … 505 0.8
7 W07000… Cardiff Nor… 68438 Green Michael … 820 1.6
8 E14000… Dorset Mid … 65426 Conservat… Michael … 29548 60.4
9 S14000… Dundee East 66210 Liberal D… Michael … 3573 7.9
10 E14000… Durham Nort… 72166 Liberal D… Michael … 2831 5.9
# ℹ 15 more rows
# ℹ 6 more variables: vote_share_change <dbl>, total_votes_cast <int>,
# vrank <int>, turnout <dbl>, fname <chr>, lname <chr>
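Each of the two filters above tests a single column. Conditions on several columns can be combined in one `filter()` call, as this minimal sketch shows. The `toy` tibble is hypothetical and only borrows the column names `party_name` and `fname` from `ukvote2019`.

```r
library(dplyr)

# Hypothetical toy data using two of the ukvote2019 column names
toy <- tibble(
  party_name = c("Green", "Green", "Labour", "Labour"),
  fname      = c("Michael", "Sarah", "Michael", "Priya")
)

# Both conditions must hold: combine them with &
both_amp <- toy |> filter(party_name == "Green" & fname == "Michael")

# Comma-separated conditions are also combined with "and"
both_comma <- toy |> filter(party_name == "Green", fname == "Michael")

# Either condition may hold: combine them with |
either <- toy |> filter(party_name == "Green" | fname == "Michael")

both_amp
both_comma
either
```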
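Show me all Green party candidates named “Michael”.
ukvote2019 |> filter(party_name == "Green" & fname == "Michael")
5. Grouping
Who won in each constituency?
ukvote2019 |> group_by(constituency) |> slice_max(votes)
What happens if you leave out group_by() in the chunk of code above?
How do I count the number of seats each party won?
ukvote2019 |> group_by(constituency) |> slice_max(votes) |> group_by(party_name) |> tally() |> arrange(desc(n))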
# A tibble: 10 × 2
party_name n
<chr> <int>
1 Conservative 366
2 Labour 202
3 Scottish National Party 48
4 Liberal Democrat 11
5 Democratic Unionist Party 8
6 Sinn Féin 7
7 Plaid Cymru 4
8 Social Democratic & Labour Party 2
9 Alliance Party 1
10 Green 1
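Group and Summarize
ukvote2019 |> group_by(constituency) |> slice_max(votes) |> ungroup() |> summarize(mean_winner_share = mean(vote_share_percent))
What happens if you leave out ungroup() in the chunk above?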
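One way to explore the effect of `ungroup()` before `summarize()` is to try both versions on a tiny made-up tibble. The `toy` data below is hypothetical. It reuses the column names `constituency`, `votes`, and `vote_share_percent` from `ukvote2019`, but the values are invented.

```r
library(dplyr)

# Hypothetical toy data: two constituencies, five candidates
toy <- tibble(
  constituency       = c("North", "North", "South", "South", "South"),
  candidate          = c("A", "B", "C", "D", "E"),
  votes              = c(600L, 400L, 500L, 300L, 200L),
  vote_share_percent = c(60, 40, 50, 30, 20)
)

# Winner in each constituency; the result is still grouped by constituency
winners <- toy |> group_by(constituency) |> slice_max(votes)

# With ungroup(): a single row summarizing all the winners together
with_ungroup <- winners |>
  ungroup() |>
  summarize(mean_winner_share = mean(vote_share_percent))

# Without ungroup(): summarize() runs once per constituency,
# so each "mean" is just that one winner's own share
without_ungroup <- winners |>
  summarize(mean_winner_share = mean(vote_share_percent))

with_ungroup
without_ungroup
```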
6. Have a go
Can you find …
The candidate who won the most votes in the country?
The candidate with the largest vote share in the country?
The median vote share of winning candidates?
The largest vote share swing from the previous election?
Overall turnout for the whole country?
Median turnout across constituencies?
Source Code

---
title: "Tasks for Week 05"
date: last-modified
---

## UK Election Data

```{r}
library(tidyverse)
```

### 1. Install the UK Election Data package

It's not on CRAN, it's on my GitHub.

```{r}
#| eval: false
# You only need to do this once
remotes::install_github("kjhealy/ukelection2019")
```

### 2. Load the package

```{r}
#| label: "03b-dplyr-basics-16"
library(ukelection2019)
ukvote2019
```

Each row is a candidate standing in a particular constituency (in US speak, a district) for a particular party or as an independent candidate.

### 3. Get familiar with the data

Use [**`sample_n()`**]{.fg-green} to sample `n` rows of your tibble.

```{r}
#| label: "03b-dplyr-basics-17"
ukvote2019 |>
  sample_n(10)
```

A vector of unique constituency names:

```{r}
#| label: "03b-dplyr-basics-18"
ukvote2019 |>
  distinct(constituency)
```

Tally them up:

```{r}
#| label: "03b-dplyr-basics-19"
ukvote2019 |>
  distinct(constituency) |>
  tally()
```

That is, there are 650 electoral constituencies in Great Britain and Northern Ireland.

A quicker way of establishing how many constituencies there are:

```{r}
#| label: "03b-dplyr-basics-24"
ukvote2019 |>
  count(constituency)
```

Which parties fielded the most candidates?

```{r}
#| label: "03b-dplyr-basics-21"
ukvote2019 |>
  count(party_name) |>
  arrange(desc(n))
```

What are the top 5 parties by number of candidates?

```{r}
#| label: "03b-dplyr-basics-22"
ukvote2019 |>
  count(party_name) |>
  slice_max(order_by = n, n = 5)
```

Bottom 5? Does this make sense?

```{r}
#| label: "03b-dplyr-basics-23"
ukvote2019 |>
  count(party_name) |>
  slice_min(order_by = n, n = 5)
```

### 4. Filtering

Filtering is subsetting the rows according to a condition on one or more of the columns.

Show me all and only the Green party candidates.

```{r}
ukvote2019 |>
  filter(party_name == "Green")
```

Show me all candidates named "Michael".

```{r}
ukvote2019 |>
  filter(fname == "Michael")
```

Show me all Green party candidates named "Michael".

```{r}
ukvote2019 |>
  filter(party_name == "Green" & fname == "Michael")
```

### 5. Grouping

Who won in each constituency?

```{r}
ukvote2019 |>
  group_by(constituency) |>
  slice_max(votes)
```

What happens if you leave out `group_by()` in the chunk of code above?

How do I count the number of seats each party won?

```{r}
ukvote2019 |>
  group_by(constituency) |>
  slice_max(votes) |>
  group_by(party_name) |>
  tally() |>
  arrange(desc(n))
```

### Group and Summarize

```{r}
ukvote2019 |>
  group_by(constituency) |>
  slice_max(votes) |>
  ungroup() |>
  summarize(mean_winner_share = mean(vote_share_percent))
```

What happens if you leave out `ungroup()` in the chunk above?

### 6. Have a go

Can you find ...

- The candidate who won the most votes in the country?
- The candidate with the largest vote share in the country?
- The median vote share of winning candidates?
- The largest vote share swing from the previous election?
- Overall turnout for the whole country?
- Median turnout across constituencies?