1. Vector subsetting with []
- Create a numeric vector x <- c(10, 20, 30, 40, 50) and a named version x_named with names c("a","b","c","d","e").
- Select the 2nd and 4th elements using numeric indexing.
- Reorder the vector x by indexing it with c(5,1,5,3).
- Select elements "b" and "e" from x_named using character indexing.
- Select all elements of x greater than 25 using a logical condition. How many elements were selected? Use a function to verify.
- Use negative indices to drop the first and last elements of x.
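As a warm-up for the tasks above, here is a minimal sketch of the same indexing styles on a different vector. The vector v and its names are hypothetical and are not part of the exercise data.

```r
# Hypothetical demo vector (assumption: not the exercise data)
v <- c(first = 3, second = 6, third = 9, fourth = 12)

v[c(1, 3)]                 # numeric indexing: positions 1 and 3
v[c(4, 4, 1)]              # indices may repeat and come in any order
v[c("second", "fourth")]   # character indexing: select by name
v[v > 5]                   # logical indexing: keep elements where the condition is TRUE
sum(v > 5)                 # TRUE counts as 1, so this counts the selected elements
v[-c(1, length(v))]        # negative indices drop the given positions
```

Positive and negative indices cannot be mixed in a single call to [].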
2. Subsetting data frames: columns
- Create a data frame dat <- data.frame(id = 1:5, group = factor(c("A","A","B","B","B")), score = c(12,18,22,19,25)).
- Select column score using a numeric index and then by name with (single) square brackets. Confirm the result is a data frame (not a vector) and explain why.
- Provide a vector of indices to select two columns (id and score) in one operation.
- Rename the columns to c("ID","Group","Score") using names() and show the names of the renamed data frame.
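The column operations above can be sketched on a separate toy data frame. The object df_demo and its column names are hypothetical, chosen only for illustration.

```r
# Hypothetical demo data frame (assumption: not the exercise data)
df_demo <- data.frame(key = 1:3,
                      label = c("x", "y", "z"),
                      value = c(0.5, 1.5, 2.5))

one_col <- df_demo["value"]      # single brackets with one index return a data frame
is.data.frame(one_col)
is.data.frame(df_demo[3])        # the same holds for a numeric index
is.numeric(df_demo[["value"]])   # double brackets extract the column itself

two_cols <- df_demo[c(1, 3)]     # a vector of indices selects several columns at once

names(df_demo) <- c("Key", "Label", "Value")   # replace all column names
names(df_demo)
```

Comparing the results of [ ] and [[ ]] with class() or str() is a quick way to see which kind of object came back.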
3. Subsetting data frames: rows & conditions
- Select all rows where Score is greater than or equal to 20 using a logical condition inside [].
- Use which() to obtain the row indices for Group == "B" and subset dat with these indices.
- Use with(dat, ...) to express a condition of your choice without repeatedly writing dat$.
- Create a new column pass with ifelse() that holds logical values indicating whether Score >= 20.
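The row-selection patterns above might look as follows on a separate toy data frame. Again, df_demo, its columns, and the new column big are hypothetical names used only for this sketch.

```r
# Hypothetical demo data frame (assumption: not the exercise data)
df_demo <- data.frame(Key = 1:4,
                      Label = c("x", "y", "y", "x"),
                      Value = c(5, 15, 25, 35))

# Rows go before the comma, columns after; an empty column slot keeps all columns
high <- df_demo[df_demo$Value >= 15, ]

# which() turns a logical vector into the integer positions of its TRUE values
idx <- which(df_demo$Label == "y")
by_idx <- df_demo[idx, ]

# with() evaluates the expression inside the data frame, so columns need no df_demo$ prefix
both <- with(df_demo, df_demo[Value > 10 & Label == "x", ])

# ifelse() works element by element and returns one value per row
df_demo$big <- ifelse(df_demo$Value >= 20, TRUE, FALSE)
```

Forgetting the comma in df_demo[condition, ] makes R treat the condition as a column selector, which is a common source of errors.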
4. Using subset()
- Recreate Task 3.1 using subset().
- Recreate Task 2.3 by selecting only columns ID and Score via subset().
- Create a one-line expression that returns the IDs of the rows with the two highest scores in dat (ties allowed). Hint: order() can help.
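A minimal sketch of subset() and order() on a separate toy data frame follows. The object df_demo and its columns are hypothetical, and the last line is only one of several ways to pick the largest values.

```r
# Hypothetical demo data frame (assumption: not the exercise data)
df_demo <- data.frame(Key = 1:4,
                      Label = c("x", "y", "y", "x"),
                      Value = c(5, 35, 15, 25))

rows_only <- subset(df_demo, Value >= 15)              # filter rows; columns are named without df_demo$
cols_only <- subset(df_demo, select = c(Key, Value))   # keep columns by unquoted name
both <- subset(df_demo, Value >= 15, select = c(Key, Value))

# order() returns the row positions that would sort Value; taking the first two gives the largest
top <- df_demo$Key[order(df_demo$Value, decreasing = TRUE)][1:2]
```

Taking [1:2] returns exactly two keys, so rows tied with the second-largest value would need extra handling if they should all be reported.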