dyplyr

  • dplyr is a package tailed towards data manipulation.
  • filter() picks cases based on their values.
  • select() picks variables based on their names.
  • mutate() adds and creates new variables
  • summarise() reduces multiple values down to a single summary.
  • group_by() allows to perform any operation by grouping variables.
  • arrange() changes the ordering of the rows.

filter() picks cases based on their values

filter(.data, ...)

# Without %>%:
filter(mtcars, disp > 350 & mpg < 11)

# with %>%
mtcars %>%
  filter(disp > 350 & mpg < 11)

Note: You don’t have to provide the name of a data object within the dplyr function!

not: filter(mtcars, mtcars$disp > 350 & mtcars$mpg < 11)

but: filter(mtcars, disp > 350 & mpg < 11)

Task

  • Take the starwars database.
  • Filter all cases with brown hair color and female gender (Hint: variables hair_color and sex)

:-)

library(tidyverse)
starwars %>%
  filter(hair_color == "brown" & sex == "female")
name height mass hair_color skin_color eye_color birth_year sex gender homeworld species films vehicles starships
Leia Organa 150 49 brown light brown 19 female feminine Alderaan Human The Empire Strikes Back, Revenge of the Sith , Return of the Jedi , A New Hope , The Force Awakens Imperial Speeder Bike
Beru Whitesun lars 165 75 brown light blue 47 female feminine Tatooine Human Attack of the Clones, Revenge of the Sith , A New Hope
Cordé 157 NA brown light brown NA female feminine Naboo Human Attack of the Clones
Dormé 165 NA brown light brown NA female feminine Naboo Human Attack of the Clones
Rey NA NA brown light hazel NA female feminine NA Human The Force Awakens
Padmé Amidala 165 45 brown light brown 46 female feminine Naboo Human Attack of the Clones, The Phantom Menace , Revenge of the Sith H-type Nubian yacht, Naboo star skiff , Naboo fighter

select() picks variables based on their names

select(.data, ...)

# Either column numbers or column names
mtcars %>% select(1, 3, carb)

# Use the `:` symbol for ranges
mtcars %>% select(1:3, am:carb)

# use `-` to drop variables and keep the others
mtcars %>% select(-am, -(disp:drat))

# use `!` to select all variables not the ones defined
mtcars %>% select(!(1:3))

Task

  • Take the starwars database.
  • Filter for Human and select the variables name, height, and mass (Hint: variable species)

:-)

starwars %>% 
  filter(species == "Human") %>% 
  select(name:mass)
name height mass
Luke Skywalker 172 77.0
Darth Vader 202 136.0
Leia Organa 150 49.0
Owen Lars 178 120.0
Beru Whitesun lars 165 75.0
Biggs Darklighter 183 84.0
Obi-Wan Kenobi 182 77.0
Anakin Skywalker 188 84.0
Wilhuff Tarkin 180 NA
Han Solo 180 80.0
Wedge Antilles 170 77.0
Jek Tono Porkins 180 110.0
Palpatine 170 75.0
Boba Fett 183 78.2
Lando Calrissian 177 79.0
Lobot 175 79.0
Mon Mothma 150 NA
Arvel Crynyd NA NA
Qui-Gon Jinn 193 89.0
Finis Valorum 170 NA
Shmi Skywalker 163 NA
Mace Windu 188 84.0
Gregar Typho 185 85.0
Cordé 157 NA
Cliegg Lars 183 NA
Dormé 165 NA
Dooku 193 80.0
Bail Prestor Organa 191 NA
Jango Fett 183 79.0
Jocasta Nu 167 NA
Raymus Antilles 188 79.0
Finn NA NA
Rey NA NA
Poe Dameron NA NA
Padmé Amidala 165 45.0

Help functions to select variables

  • contains(): selects all variables which names contain a certain string.
  • starts_with(): selects all variables which start with a certain string.
  • ends_with(): selects all variables which end with a certain string.
  • num_range(): selects variables with a certain prefix within a certain range
mtcars %>% select(starts_with("d")) %>% slice(1:2)
mtcars %>% select(!ends_with("b")) %>% slice(1:2)
mtcars %>% select(contains("ra")) %>% slice(1:2)
billboard %>% select(num_range("wk", 10:15))

relocate() to rearrange columns

The relocate() function helps to rearrange the columns of a data frame

starwars %>% 
  relocate(name, birth_year, sex)
name birth_year sex height mass hair_color skin_color eye_color gender homeworld species films vehicles starships
Luke Skywalker 19.0 male 172 77.0 blond fair blue masculine Tatooine Human The Empire Strikes Back, Revenge of the Sith , Return of the Jedi , A New Hope , The Force Awakens Snowspeeder , Imperial Speeder Bike X-wing , Imperial shuttle
C-3PO 112.0 none 167 75.0 NA gold yellow masculine Tatooine Droid The Empire Strikes Back, Attack of the Clones , The Phantom Menace , Revenge of the Sith , Return of the Jedi , A New Hope
R2-D2 33.0 none 96 32.0 NA white, blue red masculine Naboo Droid The Empire Strikes Back, Attack of the Clones , The Phantom Menace , Revenge of the Sith , Return of the Jedi , A New Hope , The Force Awakens
Darth Vader 41.9 male 202 136.0 none white yellow masculine Tatooine Human The Empire Strikes Back, Revenge of the Sith , Return of the Jedi , A New Hope TIE Advanced x1
Leia Organa 19.0 female 150 49.0 brown light brown feminine Alderaan Human The Empire Strikes Back, Revenge of the Sith , Return of the Jedi , A New Hope , The Force Awakens Imperial Speeder Bike
Owen Lars 52.0 male 178 120.0 brown, grey light blue masculine Tatooine Human Attack of the Clones, Revenge of the Sith , A New Hope
Beru Whitesun lars 47.0 female 165 75.0 brown light blue feminine Tatooine Human Attack of the Clones, Revenge of the Sith , A New Hope
R5-D4 NA none 97 32.0 NA white, red red masculine Tatooine Droid A New Hope
Biggs Darklighter 24.0 male 183 84.0 black light brown masculine Tatooine Human A New Hope X-wing
Obi-Wan Kenobi 57.0 male 182 77.0 auburn, white fair blue-gray masculine Stewjon Human The Empire Strikes Back, Attack of the Clones , The Phantom Menace , Revenge of the Sith , Return of the Jedi , A New Hope Tribubble bongo Jedi starfighter , Trade Federation cruiser, Naboo star skiff , Jedi Interceptor , Belbullab-22 starfighter
Anakin Skywalker 41.9 male 188 84.0 blond fair blue masculine Tatooine Human Attack of the Clones, The Phantom Menace , Revenge of the Sith Zephyr-G swoop bike, XJ-6 airspeeder Trade Federation cruiser, Jedi Interceptor , Naboo fighter
Wilhuff Tarkin 64.0 male 180 NA auburn, grey fair blue masculine Eriadu Human Revenge of the Sith, A New Hope
Chewbacca 200.0 male 228 112.0 brown unknown blue masculine Kashyyyk Wookiee The Empire Strikes Back, Revenge of the Sith , Return of the Jedi , A New Hope , The Force Awakens AT-ST Millennium Falcon, Imperial shuttle
Han Solo 29.0 male 180 80.0 brown fair brown masculine Corellia Human The Empire Strikes Back, Return of the Jedi , A New Hope , The Force Awakens Millennium Falcon, Imperial shuttle
Greedo 44.0 male 173 74.0 NA green black masculine Rodia Rodian A New Hope
Jabba Desilijic Tiure 600.0 hermaphroditic 175 1358.0 NA green-tan, brown orange masculine Nal Hutta Hutt The Phantom Menace, Return of the Jedi, A New Hope
Wedge Antilles 21.0 male 170 77.0 brown fair hazel masculine Corellia Human The Empire Strikes Back, Return of the Jedi , A New Hope Snowspeeder X-wing
Jek Tono Porkins NA male 180 110.0 brown fair blue masculine Bestine IV Human A New Hope X-wing
Yoda 896.0 male 66 17.0 white green brown masculine NA Yoda’s species The Empire Strikes Back, Attack of the Clones , The Phantom Menace , Revenge of the Sith , Return of the Jedi
Palpatine 82.0 male 170 75.0 grey pale yellow masculine Naboo Human The Empire Strikes Back, Attack of the Clones , The Phantom Menace , Revenge of the Sith , Return of the Jedi
Boba Fett 31.5 male 183 78.2 black fair brown masculine Kamino Human The Empire Strikes Back, Attack of the Clones , Return of the Jedi Slave 1
IG-88 15.0 none 200 140.0 none metal red masculine NA Droid The Empire Strikes Back
Bossk 53.0 male 190 113.0 none green red masculine Trandosha Trandoshan The Empire Strikes Back
Lando Calrissian 31.0 male 177 79.0 black dark brown masculine Socorro Human The Empire Strikes Back, Return of the Jedi Millennium Falcon
Lobot 37.0 male 175 79.0 none light blue masculine Bespin Human The Empire Strikes Back
Ackbar 41.0 male 180 83.0 none brown mottle orange masculine Mon Cala Mon Calamari Return of the Jedi, The Force Awakens
Mon Mothma 48.0 female 150 NA auburn fair blue feminine Chandrila Human Return of the Jedi
Arvel Crynyd NA male NA NA brown fair brown masculine NA Human Return of the Jedi A-wing
Wicket Systri Warrick 8.0 male 88 20.0 brown brown brown masculine Endor Ewok Return of the Jedi
Nien Nunb NA male 160 68.0 none grey black masculine Sullust Sullustan Return of the Jedi Millennium Falcon
Qui-Gon Jinn 92.0 male 193 89.0 brown fair blue masculine NA Human The Phantom Menace Tribubble bongo
Nute Gunray NA male 191 90.0 none mottled green red masculine Cato Neimoidia Neimodian Attack of the Clones, The Phantom Menace , Revenge of the Sith
Finis Valorum 91.0 male 170 NA blond fair blue masculine Coruscant Human The Phantom Menace
Jar Jar Binks 52.0 male 196 66.0 none orange orange masculine Naboo Gungan Attack of the Clones, The Phantom Menace
Roos Tarpals NA male 224 82.0 none grey orange masculine Naboo Gungan The Phantom Menace
Rugor Nass NA male 206 NA none green orange masculine Naboo Gungan The Phantom Menace
Ric Olié NA NA 183 NA brown fair blue NA Naboo NA The Phantom Menace Naboo Royal Starship
Watto NA male 137 NA black blue, grey yellow masculine Toydaria Toydarian Attack of the Clones, The Phantom Menace
Sebulba NA male 112 40.0 none grey, red orange masculine Malastare Dug The Phantom Menace
Quarsh Panaka 62.0 NA 183 NA black dark brown NA Naboo NA The Phantom Menace
Shmi Skywalker 72.0 female 163 NA black fair brown feminine Tatooine Human Attack of the Clones, The Phantom Menace
Darth Maul 54.0 male 175 80.0 none red yellow masculine Dathomir Zabrak The Phantom Menace Sith speeder Scimitar
Bib Fortuna NA male 180 NA none pale pink masculine Ryloth Twi’lek Return of the Jedi
Ayla Secura 48.0 female 178 55.0 none blue hazel feminine Ryloth Twi’lek Attack of the Clones, The Phantom Menace , Revenge of the Sith
Dud Bolt NA male 94 45.0 none blue, grey yellow masculine Vulpter Vulptereen The Phantom Menace
Gasgano NA male 122 NA none white, blue black masculine Troiken Xexto The Phantom Menace
Ben Quadinaros NA male 163 65.0 none grey, green, yellow orange masculine Tund Toong The Phantom Menace
Mace Windu 72.0 male 188 84.0 none dark brown masculine Haruun Kal Human Attack of the Clones, The Phantom Menace , Revenge of the Sith
Ki-Adi-Mundi 92.0 male 198 82.0 white pale yellow masculine Cerea Cerean Attack of the Clones, The Phantom Menace , Revenge of the Sith
Kit Fisto NA male 196 87.0 none green black masculine Glee Anselm Nautolan Attack of the Clones, The Phantom Menace , Revenge of the Sith
Eeth Koth NA male 171 NA black brown brown masculine Iridonia Zabrak The Phantom Menace , Revenge of the Sith
Adi Gallia NA female 184 50.0 none dark blue feminine Coruscant Tholothian The Phantom Menace , Revenge of the Sith
Saesee Tiin NA male 188 NA none pale orange masculine Iktotch Iktotchi The Phantom Menace , Revenge of the Sith
Yarael Poof NA male 264 NA none white yellow masculine Quermia Quermian The Phantom Menace
Plo Koon 22.0 male 188 80.0 none orange black masculine Dorin Kel Dor Attack of the Clones, The Phantom Menace , Revenge of the Sith Jedi starfighter
Mas Amedda NA male 196 NA none blue blue masculine Champala Chagrian Attack of the Clones, The Phantom Menace
Gregar Typho NA male 185 85.0 black dark brown masculine Naboo Human Attack of the Clones Naboo fighter
Cordé NA female 157 NA brown light brown feminine Naboo Human Attack of the Clones
Cliegg Lars 82.0 male 183 NA brown fair blue masculine Tatooine Human Attack of the Clones
Poggle the Lesser NA male 183 80.0 none green yellow masculine Geonosis Geonosian Attack of the Clones, Revenge of the Sith
Luminara Unduli 58.0 female 170 56.2 black yellow blue feminine Mirial Mirialan Attack of the Clones, Revenge of the Sith
Barriss Offee 40.0 female 166 50.0 black yellow blue feminine Mirial Mirialan Attack of the Clones
Dormé NA female 165 NA brown light brown feminine Naboo Human Attack of the Clones
Dooku 102.0 male 193 80.0 white fair brown masculine Serenno Human Attack of the Clones, Revenge of the Sith Flitknot speeder
Bail Prestor Organa 67.0 male 191 NA black tan brown masculine Alderaan Human Attack of the Clones, Revenge of the Sith
Jango Fett 66.0 male 183 79.0 black tan brown masculine Concord Dawn Human Attack of the Clones
Zam Wesell NA female 168 55.0 blonde fair, green, yellow yellow feminine Zolan Clawdite Attack of the Clones Koro-2 Exodrive airspeeder
Dexter Jettster NA male 198 102.0 none brown yellow masculine Ojom Besalisk Attack of the Clones
Lama Su NA male 229 88.0 none grey black masculine Kamino Kaminoan Attack of the Clones
Taun We NA female 213 NA none grey black feminine Kamino Kaminoan Attack of the Clones
Jocasta Nu NA female 167 NA white fair blue feminine Coruscant Human Attack of the Clones
Ratts Tyerell NA male 79 15.0 none grey, blue unknown masculine Aleen Minor Aleena The Phantom Menace
R4-P17 NA none 96 NA none silver, red red, blue feminine NA Droid Attack of the Clones, Revenge of the Sith
Wat Tambor NA male 193 48.0 none green, grey unknown masculine Skako Skakoan Attack of the Clones
San Hill NA male 191 NA none grey gold masculine Muunilinst Muun Attack of the Clones
Shaak Ti NA female 178 57.0 none red, blue, white black feminine Shili Togruta Attack of the Clones, Revenge of the Sith
Grievous NA male 216 159.0 none brown, white green, yellow masculine Kalee Kaleesh Revenge of the Sith Tsmeu-6 personal wheel bike Belbullab-22 starfighter
Tarfful NA male 234 136.0 brown brown blue masculine Kashyyyk Wookiee Revenge of the Sith
Raymus Antilles NA male 188 79.0 brown light brown masculine Alderaan Human Revenge of the Sith, A New Hope
Sly Moore NA NA 178 48.0 none pale white NA Umbara NA Attack of the Clones, Revenge of the Sith
Tion Medon NA male 206 80.0 none grey black masculine Utapau Pau’an Revenge of the Sith
Finn NA male NA NA black dark dark masculine NA Human The Force Awakens
Rey NA female NA NA brown light hazel feminine NA Human The Force Awakens
Poe Dameron NA male NA NA brown light brown masculine NA Human The Force Awakens T-70 X-wing fighter
BB8 NA none NA NA none none black masculine NA Droid The Force Awakens
Captain Phasma NA NA NA NA unknown unknown unknown NA NA NA The Force Awakens
Padmé Amidala 46.0 female 165 45.0 brown light brown feminine Naboo Human Attack of the Clones, The Phantom Menace , Revenge of the Sith H-type Nubian yacht, Naboo star skiff , Naboo fighter

mutate() adds new variables that are functions of existing variables

mutate(.data, ...)

mtcars %>%
  mutate(
    efficiency = mpg / disp,
    effect = round(hp / mpg * 100)
  )
mpg cyl disp hp drat wt qsec vs am gear carb efficiency effect
Mazda RX4 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4 0.1312500 524
Mazda RX4 Wag 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4 0.1312500 524
Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1 0.2111111 408
Hornet 4 Drive 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1 0.0829457 514
Hornet Sportabout 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2 0.0519444 936
Valiant 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1 0.0804444 580
Duster 360 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4 0.0397222 1713
Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2 0.1663258 254
Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2 0.1619318 417
Merc 280 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4 0.1145585 641
Merc 280C 17.8 6 167.6 123 3.92 3.440 18.90 1 0 4 4 0.1062053 691
Merc 450SE 16.4 8 275.8 180 3.07 4.070 17.40 0 0 3 3 0.0594634 1098
Merc 450SL 17.3 8 275.8 180 3.07 3.730 17.60 0 0 3 3 0.0627266 1040
Merc 450SLC 15.2 8 275.8 180 3.07 3.780 18.00 0 0 3 3 0.0551124 1184
Cadillac Fleetwood 10.4 8 472.0 205 2.93 5.250 17.98 0 0 3 4 0.0220339 1971
Lincoln Continental 10.4 8 460.0 215 3.00 5.424 17.82 0 0 3 4 0.0226087 2067
Chrysler Imperial 14.7 8 440.0 230 3.23 5.345 17.42 0 0 3 4 0.0334091 1565
Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1 0.4116900 204
Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2 0.4015852 171
Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1 0.4767932 192
Toyota Corona 21.5 4 120.1 97 3.70 2.465 20.01 1 0 3 1 0.1790175 451
Dodge Challenger 15.5 8 318.0 150 2.76 3.520 16.87 0 0 3 2 0.0487421 968
AMC Javelin 15.2 8 304.0 150 3.15 3.435 17.30 0 0 3 2 0.0500000 987
Camaro Z28 13.3 8 350.0 245 3.73 3.840 15.41 0 0 3 4 0.0380000 1842
Pontiac Firebird 19.2 8 400.0 175 3.08 3.845 17.05 0 0 3 2 0.0480000 911
Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1 0.3455696 242
Porsche 914-2 26.0 4 120.3 91 4.43 2.140 16.70 0 1 5 2 0.2161264 350
Lotus Europa 30.4 4 95.1 113 3.77 1.513 16.90 1 1 5 2 0.3196635 372
Ford Pantera L 15.8 8 351.0 264 4.22 3.170 14.50 0 1 5 4 0.0450142 1671
Ferrari Dino 19.7 6 145.0 175 3.62 2.770 15.50 0 1 5 6 0.1358621 888
Maserati Bora 15.0 8 301.0 335 3.54 3.570 14.60 0 1 5 8 0.0498339 2233
Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2 0.1768595 509

Task

  • Take the starwars database.
  • Create a new variable with the body mass index bmi. Hint: bmi = kg / m²
  • Filter all humans with an bmi >= 26.
  • Select name, bmi, height, and mass

:-)

starwars %>%
  mutate(bmi = mass/ (height/100)^2) %>%
  filter(bmi >= 26 & species == "Human") %>%
  select(name, bmi, height, mass)
name bmi height mass
Luke Skywalker 26.02758 172 77
Darth Vader 33.33007 202 136
Owen Lars 37.87401 178 120
Beru Whitesun lars 27.54821 165 75
Wedge Antilles 26.64360 170 77
Jek Tono Porkins 33.95062 180 110

summarise() reduces multiple values down to a single summary.

mtcars %>%
  summarise(
    mean_mpg = mean(mpg),
    sd_mpg = sd(mpg),
    n = n()
  )
mean_mpg sd_mpg n
20.09062 6.026948 32

Task

  • Take the starwars database.
  • Filter Humans
  • Calculate the mean and the median for height and mass (Note: don’t forget to remove NA values: na.rm = TRUE)

:-)

starwars %>%
  filter(species == "Human") %>%
  summarise(
    mean_height = mean(height, na.rm = TRUE),
    median_height = median(height, na.rm = TRUE),
    mean_mass = mean(mass, na.rm = TRUE),
    median_mass = median(mass, na.rm = TRUE)
  )
mean_height median_height mean_mass median_mass
176.6452 180 82.78182 79

group_by() allows to perform any operation by grouping variables.

mtcars %>% group_by(cyl) %>%
  summarise(
    mean_mpg = mean(mpg),
    sd_mpg = sd(mpg),
    n = n()
  )
cyl mean_mpg sd_mpg n
4 26.66364 4.509828 11
6 19.74286 1.453567 7
8 15.10000 2.560048 14
mtcars %>% group_by(cyl, am) %>%
  summarise(
    mean_mpg = mean(mpg),
    sd_mpg = sd(mpg),
    n = n()
  )
cyl am mean_mpg sd_mpg n
4 0 22.90000 1.4525839 3
4 1 28.07500 4.4838599 8
6 0 19.12500 1.6317169 4
6 1 20.56667 0.7505553 3
8 0 15.05000 2.7743959 12
8 1 15.40000 0.5656854 2

Task

  • Take the starwars database.
  • Group by sex
  • Calculate the median for height and mass
  • And the number of cases per group (Hint: n())

:-)

starwars %>%
  group_by(sex) %>%
  summarise(
    median_height = median(height, na.rm = TRUE),
    median_mass = median(mass, na.rm = TRUE),
    n = n()
  )
sex median_height median_mass n
female 166 55.0 16
hermaphroditic 175 1358.0 1
male 183 80.0 60
none 97 53.5 6
NA 183 48.0 4

arrange() changes the ordering of the rows

arrange(.data, ...)

dat <- data.frame(age = c(5,5,6,7,6), sen = c(0, 1, 1, 0, 0), points = c(34, 55, 22, 11, 9))
dat %>%
  arrange(sen, age) 
age sen points
5 0 34
6 0 9
7 0 11
5 1 55
6 1 22

desc() specifies a variable to be arranged descending

dat %>%
  arrange(sen, desc(points))
age sen points
5 0 34
7 0 11
6 0 9
5 1 55
6 1 22

Task

  • Take the starwars database.
  • Calculate the bmi.
  • Summarize the median of the bmi grouped by species.
  • round the bmi to one decimal.
  • Also calculate the n for each group.
  • Only show cases with n > 2.
  • Arrange the resulting table. Sort by n in descending order.

:-)

starwars %>% 
  mutate(bmi = mass / (height / 100)^2) %>% 
  group_by(species) %>% 
  summarise(
    mean_bmi = median(bmi, na.rm = TRUE) %>% round(1), 
    n = n()
  ) %>%
  filter(n >2) %>%
  arrange(desc(n))
species mean_bmi n
Human 24.8 35
Droid 34.4 6
NA 15.1 4
Gungan 16.8 3