A package tailed towards the manipulation of data frames
University of Münster
2026-01-29
filter() picks cases based on their values.select() picks variables based on their names.mutate() adds and creates new variablessummarise() reduces multiple values down to a single summary.group_by() allows to perform any operation by grouping variables.arrange() changes the ordering of the rows.rellocate() changes the ordering of the columns.filter() picks cases based on their valuesfilter(.data, ...)
Note: You don’t have to provide the name of a data object within the dplyr function!
not: filter(mtcars, mtcars$disp > 350 & mtcars$mpg < 11)
but: filter(mtcars, disp > 350 & mpg < 11)
filter is an extension of logical subsetting in base R:
starwars database.brown hair color and female gender (Hint: variables hair_color and sex)| name | height | mass | hair_color | skin_color | eye_color | birth_year | sex | gender | homeworld | species | films | vehicles | starships |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Leia Organa | 150 | 49 | brown | light | brown | 19 | female | feminine | Alderaan | Human | A New Hope, The Empire Strikes Back, Return of the Jedi, Revenge of the Sith, The Force Awakens | Imperial Speeder Bike | |
| Beru Whitesun Lars | 165 | 75 | brown | light | blue | 47 | female | feminine | Tatooine | Human | A New Hope, Attack of the Clones, Revenge of the Sith | ||
| Padmé Amidala | 185 | 45 | brown | light | brown | 46 | female | feminine | Naboo | Human | The Phantom Menace, Attack of the Clones, Revenge of the Sith | Naboo fighter, H-type Nubian yacht, Naboo star skiff | |
| Dormé | 165 | NA | brown | light | brown | NA | female | feminine | Naboo | Human | Attack of the Clones | ||
| Rey | NA | NA | brown | light | hazel | NA | female | feminine | NA | Human | The Force Awakens |
select() picks variables based on their namesselect(.data, ...)
Note: selectis an extension of variable subsetting in base R:
starwars database.Human and select the variables name, height, and massspecies| name | height | mass |
|---|---|---|
| Luke Skywalker | 172 | 77.0 |
| Darth Vader | 202 | 136.0 |
| Leia Organa | 150 | 49.0 |
| Owen Lars | 178 | 120.0 |
| Beru Whitesun Lars | 165 | 75.0 |
| Biggs Darklighter | 183 | 84.0 |
| Obi-Wan Kenobi | 182 | 77.0 |
| Anakin Skywalker | 188 | 84.0 |
| Wilhuff Tarkin | 180 | NA |
| Han Solo | 180 | 80.0 |
| Wedge Antilles | 170 | 77.0 |
| Palpatine | 170 | 75.0 |
| Boba Fett | 183 | 78.2 |
| Lando Calrissian | 177 | 79.0 |
| Lobot | 175 | 79.0 |
| Mon Mothma | 150 | NA |
| Arvel Crynyd | NA | NA |
| Qui-Gon Jinn | 193 | 89.0 |
| Finis Valorum | 170 | NA |
| Padmé Amidala | 185 | 45.0 |
| Ric Olié | 183 | NA |
| Quarsh Panaka | 183 | NA |
| Shmi Skywalker | 163 | NA |
| Mace Windu | 188 | 84.0 |
| Cliegg Lars | 183 | NA |
| Dormé | 165 | NA |
| Dooku | 193 | 80.0 |
| Bail Prestor Organa | 191 | NA |
| Jango Fett | 183 | 79.0 |
| Jocasta Nu | 167 | NA |
| Raymus Antilles | 188 | 79.0 |
| Finn | NA | NA |
| Rey | NA | NA |
| Poe Dameron | NA | NA |
| Captain Phasma | NA | NA |
contains(): selects all variables which names contain a certain string.starts_with(): selects all variables which start with a certain string.ends_with(): selects all variables which end with a certain string.num_range(): selects variables with a certain prefix within a certain rangerelocate() to rearrange columnsThe relocate() function helps to rearrange the columns of a data frame
| name | birth_year | sex | height | mass | hair_color | skin_color | eye_color |
|---|---|---|---|---|---|---|---|
| Luke Skywalker | 19.0 | male | 172 | 77 | blond | fair | blue |
| C-3PO | 112.0 | none | 167 | 75 | NA | gold | yellow |
| R2-D2 | 33.0 | none | 96 | 32 | NA | white, blue | red |
| Darth Vader | 41.9 | male | 202 | 136 | none | white | yellow |
| Leia Organa | 19.0 | female | 150 | 49 | brown | light | brown |
| Owen Lars | 52.0 | male | 178 | 120 | brown, grey | light | blue |
| Beru Whitesun Lars | 47.0 | female | 165 | 75 | brown | light | blue |
| R5-D4 | NA | none | 97 | 32 | NA | white, red | red |
| Biggs Darklighter | 24.0 | male | 183 | 84 | black | light | brown |
| Obi-Wan Kenobi | 57.0 | male | 182 | 77 | auburn, white | fair | blue-gray |
mutate()adds new variables that are functions of existing variables
mutate(.data, ...)
mtcars %>%
mutate(
efficiency = mpg / disp,
effect = round(hp / mpg * 100)
) %>%
relocate(efficiency, effect)| efficiency | effect | mpg | cyl | disp | hp | drat | wt | qsec | vs | am | gear | carb | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Mazda RX4 | 0.1312500 | 524 | 21.0 | 6 | 160.0 | 110 | 3.90 | 2.620 | 16.46 | 0 | 1 | 4 | 4 |
| Mazda RX4 Wag | 0.1312500 | 524 | 21.0 | 6 | 160.0 | 110 | 3.90 | 2.875 | 17.02 | 0 | 1 | 4 | 4 |
| Datsun 710 | 0.2111111 | 408 | 22.8 | 4 | 108.0 | 93 | 3.85 | 2.320 | 18.61 | 1 | 1 | 4 | 1 |
| Hornet 4 Drive | 0.0829457 | 514 | 21.4 | 6 | 258.0 | 110 | 3.08 | 3.215 | 19.44 | 1 | 0 | 3 | 1 |
| Hornet Sportabout | 0.0519444 | 936 | 18.7 | 8 | 360.0 | 175 | 3.15 | 3.440 | 17.02 | 0 | 0 | 3 | 2 |
| Valiant | 0.0804444 | 580 | 18.1 | 6 | 225.0 | 105 | 2.76 | 3.460 | 20.22 | 1 | 0 | 3 | 1 |
| Duster 360 | 0.0397222 | 1713 | 14.3 | 8 | 360.0 | 245 | 3.21 | 3.570 | 15.84 | 0 | 0 | 3 | 4 |
| Merc 240D | 0.1663258 | 254 | 24.4 | 4 | 146.7 | 62 | 3.69 | 3.190 | 20.00 | 1 | 0 | 4 | 2 |
| Merc 230 | 0.1619318 | 417 | 22.8 | 4 | 140.8 | 95 | 3.92 | 3.150 | 22.90 | 1 | 0 | 4 | 2 |
| Merc 280 | 0.1145585 | 641 | 19.2 | 6 | 167.6 | 123 | 3.92 | 3.440 | 18.30 | 1 | 0 | 4 | 4 |
| Merc 280C | 0.1062053 | 691 | 17.8 | 6 | 167.6 | 123 | 3.92 | 3.440 | 18.90 | 1 | 0 | 4 | 4 |
| Merc 450SE | 0.0594634 | 1098 | 16.4 | 8 | 275.8 | 180 | 3.07 | 4.070 | 17.40 | 0 | 0 | 3 | 3 |
| Merc 450SL | 0.0627266 | 1040 | 17.3 | 8 | 275.8 | 180 | 3.07 | 3.730 | 17.60 | 0 | 0 | 3 | 3 |
| Merc 450SLC | 0.0551124 | 1184 | 15.2 | 8 | 275.8 | 180 | 3.07 | 3.780 | 18.00 | 0 | 0 | 3 | 3 |
| Cadillac Fleetwood | 0.0220339 | 1971 | 10.4 | 8 | 472.0 | 205 | 2.93 | 5.250 | 17.98 | 0 | 0 | 3 | 4 |
| Lincoln Continental | 0.0226087 | 2067 | 10.4 | 8 | 460.0 | 215 | 3.00 | 5.424 | 17.82 | 0 | 0 | 3 | 4 |
| Chrysler Imperial | 0.0334091 | 1565 | 14.7 | 8 | 440.0 | 230 | 3.23 | 5.345 | 17.42 | 0 | 0 | 3 | 4 |
| Fiat 128 | 0.4116900 | 204 | 32.4 | 4 | 78.7 | 66 | 4.08 | 2.200 | 19.47 | 1 | 1 | 4 | 1 |
| Honda Civic | 0.4015852 | 171 | 30.4 | 4 | 75.7 | 52 | 4.93 | 1.615 | 18.52 | 1 | 1 | 4 | 2 |
| Toyota Corolla | 0.4767932 | 192 | 33.9 | 4 | 71.1 | 65 | 4.22 | 1.835 | 19.90 | 1 | 1 | 4 | 1 |
| Toyota Corona | 0.1790175 | 451 | 21.5 | 4 | 120.1 | 97 | 3.70 | 2.465 | 20.01 | 1 | 0 | 3 | 1 |
| Dodge Challenger | 0.0487421 | 968 | 15.5 | 8 | 318.0 | 150 | 2.76 | 3.520 | 16.87 | 0 | 0 | 3 | 2 |
| AMC Javelin | 0.0500000 | 987 | 15.2 | 8 | 304.0 | 150 | 3.15 | 3.435 | 17.30 | 0 | 0 | 3 | 2 |
| Camaro Z28 | 0.0380000 | 1842 | 13.3 | 8 | 350.0 | 245 | 3.73 | 3.840 | 15.41 | 0 | 0 | 3 | 4 |
| Pontiac Firebird | 0.0480000 | 911 | 19.2 | 8 | 400.0 | 175 | 3.08 | 3.845 | 17.05 | 0 | 0 | 3 | 2 |
| Fiat X1-9 | 0.3455696 | 242 | 27.3 | 4 | 79.0 | 66 | 4.08 | 1.935 | 18.90 | 1 | 1 | 4 | 1 |
| Porsche 914-2 | 0.2161264 | 350 | 26.0 | 4 | 120.3 | 91 | 4.43 | 2.140 | 16.70 | 0 | 1 | 5 | 2 |
| Lotus Europa | 0.3196635 | 372 | 30.4 | 4 | 95.1 | 113 | 3.77 | 1.513 | 16.90 | 1 | 1 | 5 | 2 |
| Ford Pantera L | 0.0450142 | 1671 | 15.8 | 8 | 351.0 | 264 | 4.22 | 3.170 | 14.50 | 0 | 1 | 5 | 4 |
| Ferrari Dino | 0.1358621 | 888 | 19.7 | 6 | 145.0 | 175 | 3.62 | 2.770 | 15.50 | 0 | 1 | 5 | 6 |
| Maserati Bora | 0.0498339 | 2233 | 15.0 | 8 | 301.0 | 335 | 3.54 | 3.570 | 14.60 | 0 | 1 | 5 | 8 |
| Volvo 142E | 0.1768595 | 509 | 21.4 | 4 | 121.0 | 109 | 4.11 | 2.780 | 18.60 | 1 | 1 | 4 | 2 |
starwars database.bmi. Hint: bmi = kg / m²name, bmi, height, and massstarwars %>%
mutate(bmi = mass/ (height/100)^2) %>%
filter(bmi >= 26 & species == "Human") %>%
select(name, bmi, height, mass)| name | bmi | height | mass |
|---|---|---|---|
| Luke Skywalker | 26.02758 | 172 | 77 |
| Darth Vader | 33.33007 | 202 | 136 |
| Owen Lars | 37.87401 | 178 | 120 |
| Beru Whitesun Lars | 27.54821 | 165 | 75 |
| Wedge Antilles | 26.64360 | 170 | 77 |
summarise()Reduces multiple values down to a single summary.
starwars database.Humansna.rm = TRUE)group_by() group a datasetgroup_by() allows to perform a summarise() operation by grouping variables.
starwars database.n())arrange()Orders the rows by the values of selected columns.
arrange(.data, ...)
desc()Specifies a variable to be arranged descending.
starwars database.n for each group.n > 2.n in descending order.Jürgen Wilbert - Introduction to R