Goals

  • Basic concepts of the R language
  • Object types: numbers, logic values, characters
  • Object structures: vectors, factors, data frames, lists

Functions

  • With a function you command the computer to do something.
  • Functions have a function name (e.g., mean, sqrt).
  • Functions take arguments that specify what to do.
  • Arguments have argument names as well.
  • A function call always consists of the function name followed by round brackets.

function_name(argument_name_1 = value, argument_name_2 = value, ...)

Examples

sqrt() calculates the square root

sqrt(x = 16)
[1] 4

You can omit argument names if you pass the arguments in the order in which the function defines them:

sqrt(16)
[1] 4

Even without arguments you still need the brackets:

date()
[1] "Thu Oct 30 15:24:22 2025"
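As a small sketch of how argument names and argument order interact, the following calls all build the same sequence. seq() is used here only for illustration; the object names s1, s2 and s3 are made up for this example.

```r
# seq() creates a sequence; its first three arguments are from, to and by.
s1 <- seq(from = 1, to = 10, by = 3)   # all arguments named
s2 <- seq(1, 10, 3)                    # no names: values are matched by position
s3 <- seq(by = 3, to = 10, from = 1)   # named arguments may come in any order

s1   # 1 4 7 10
identical(s1, s2)   # TRUE
identical(s1, s3)   # TRUE
```

Naming arguments makes a call longer but easier to read, and it protects you from mixing up the order.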

Comments

It is good practice to add comments and notes to your code.
Everything written after a # on the same line is not executed as code.
If you want a comment to span several lines, you have to begin each line with a # symbol.

# This is a comment
# across multiple lines
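A comment can also follow code on the same line, and a # in front of a line of code switches that line off. A minimal sketch (radius and area are hypothetical object names chosen for this example):

```r
# Compute the area of a circle
radius <- 2              # radius in cm (comment after code on the same line)
area <- pi * radius^2
# area <- 0              # this line is commented out and is not executed
area
```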

Within RStudio, you can use comments to create headers by ending a comment line with at least four - signs:

# section 2 ----

You can jump through your code with the list at the bottom of your source panel or, after activating the outline panel, at the right of your source panel.

Help files

# Function
help("sqrt")

# Short cut
?sqrt

… or use the bottom-right help panel in RStudio

help("mean")
?mean
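If you do not remember the exact name of a function, or only want to see its arguments, base R offers two further helpers. This is a small sketch that goes slightly beyond the slides above; hits is a hypothetical object name.

```r
# List all objects on the search path whose name contains "mean"
hits <- apropos("mean")
head(hits)

# Show the argument names of a function without opening its help page
args(sd)
```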

Operations

Operators are a special kind of function that have a shortcut notation.

# function `assign` and the short cut
assign(x = "y", value = 10)

y <- 10

# function `+`
"+"(e1 = 10, e2 = 10)

10 + 10

# function `print`
print(x = y)

y
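The same idea holds for other operators: put the operator between backticks (or quotes) and you can call it like any other function. A brief sketch, where w and product are hypothetical object names:

```r
product <- `*`(3, 4)   # same as 3 * 4
product                # 12

`<-`(w, 5)             # same as w <- 5
w                      # 5

`>`(w, 3)              # same as w > 3, returns TRUE
```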

Objects

Objects have an object name and contain data.
The data are assigned to an object with the <- or = operator.

x <- 10

You can see the value(s) of an object with the print() function, or by just typing the object name:

print(x)
x

Objects can be used with operators and as arguments in functions:

x <- 16
y <- 13

x * y
sqrt(x)

You can write the return values of a function into a new object:

z <- sqrt(x)
z

And you can combine these:

exp(z) + sqrt(y)
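One point that often surprises beginners: an object keeps its value until you assign a new one, and objects computed from it are not updated automatically. A minimal sketch with the hypothetical objects a and b:

```r
a <- 5
b <- a * 2   # b is computed once, from the current value of a
a <- 100     # assigning again overwrites the old value of a

a            # 100
b            # still 10
```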

Data types

The data of objects can be numbers, text, or TRUE/FALSE values. These are called data types:

  • Numeric: integer or decimal numbers, e.g. 1, 1.35
  • Character: always enclosed in " " or ' ' signs: "Doctor", 'House'
  • Logical: TRUE, FALSE

x <- 10
y <- "Hello world!"
z <- FALSE
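To find out which data type an object has, you can ask R with the class() function. A short sketch using the three objects from above:

```r
x <- 10
y <- "Hello world!"
z <- FALSE

class(x)        # "numeric"
class(y)        # "character"
class(z)        # "logical"

is.numeric(x)   # TRUE
is.character(x) # FALSE
```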

Data structures

Data are organized in structures:

  • Vector: A series of elements of the same data type.
  • List: A series of elements; each element can be of any data type or data structure.
  • Data frame: A list with one vector for each element, where all vectors have the same length.
  • Matrix: A two-dimensional table with values of the same data type.
  • Array: Like a matrix but with more than two dimensions.
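As a first overview, here is one way to create each of these structures. This is only a sketch; the object names v, l, df, m and a are made up for this example, and matrix() and array() are not covered further in this section.

```r
v  <- c(1.5, 2, 3.5)                    # vector: all elements are numeric
l  <- list(1:3, "text", TRUE)           # list: elements of different types
df <- data.frame(id = 1:3, score = v)   # data frame: columns of equal length
m  <- matrix(1:6, nrow = 2)             # matrix: 2 rows and 3 columns
a  <- array(1:24, dim = c(2, 3, 4))     # array: three dimensions

dim(m)        # 2 3
dim(a)        # 2 3 4
is.list(df)   # TRUE: a data frame is a special kind of list
```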

How to build a vector

You create a vector with the c() function:

c(2, 4, 6, 3, 7)
[1] 2 4 6 3 7
y <- c(2, 4, 6, 3, 7)
y
[1] 2 4 6 3 7

The colon : operator creates a numerical sequence:

1:10
 [1]  1  2  3  4  5  6  7  8  9 10

This is a shortcut for seq(1, 10).
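seq() can do more than the colon operator, for example step sizes other than 1. The sketch below also uses rep(), which is not part of the slides above, to repeat values; the object names are hypothetical.

```r
odd   <- seq(1, 10, by = 2)           # step size 2
parts <- seq(0, 1, length.out = 5)    # 5 evenly spaced values
twice <- rep(c("a", "b"), times = 2)  # repeat the whole vector twice

odd     # 1 3 5 7 9
parts   # 0.00 0.25 0.50 0.75 1.00
twice   # "a" "b" "a" "b"
```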

You can build a vector of any data type:

firstname <- c("Dustin", "Mike", "Will")
curly <- c(TRUE, FALSE, FALSE)
age <- c(9, 11, 10)

But do not mix data types in a vector. R does not raise an error; instead, all elements are silently converted to one common data type:

age <- c("quite young", 10, 12, "very old")
age
[1] "quite young" "10"          "12"          "very old"   

Here 10 and 12 are changed to a character data type "10" and "12".
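The conversion follows a fixed order: logical values become numbers, and everything becomes character as soon as one element is a character. A short sketch (mix1 to mix3 are hypothetical object names):

```r
mix1 <- c(TRUE, 2)      # logical + numeric   -> numeric
mix2 <- c(1.5, "a")     # numeric + character -> character
mix3 <- c(TRUE, "a")    # logical + character -> character

mix1          # 1 2
mix2          # "1.5" "a"
mix3          # "TRUE" "a"
class(mix1)   # "numeric"
```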

Combining vectors to new vectors

When an object is a vector it can be reused within the c() function to build a new vector:

x <- c(3, 5, 7)
c(x, 5, 8, 9)
[1] 3 5 7 5 8 9

Be careful not to confuse an object name with a character:

x <- c("A", "B", "C")
c("x", "D", "E", "F")
[1] "x" "D" "E" "F"
c(x, "D", "E", "F")
[1] "A" "B" "C" "D" "E" "F"
c(A, B, C)
Error: object 'A' not found

Missing values

A missing value is represented with NA (Not Available).

age <- c(9, NA, 11)
name <- c("Tick", "Trick", NA)
age
[1]  9 NA 11
name
[1] "Tick"  "Trick" NA     
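Missing values affect calculations: most functions return NA as soon as one value is missing. A minimal sketch of how to detect and handle them with is.na() and the na.rm argument of mean():

```r
age <- c(9, NA, 11)

is.na(age)                # FALSE TRUE FALSE
sum(is.na(age))           # 1 missing value

mean(age)                 # NA, because one value is unknown
mean(age, na.rm = TRUE)   # 10, missing values are removed first
```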

Named vectors

A named vector is a vector with a name for each element:

age <- c(James = 34, Hella = 30, Armin = 43)
age
James Hella Armin 
   34    30    43 
glasses <- c(James = TRUE, Hella = FALSE, Armin = TRUE)
glasses
James Hella Armin 
 TRUE FALSE  TRUE 

You get and set the names of a named object with the names() function:

names(age)
[1] "James" "Hella" "Armin"
names(age) <- c("Judith", "Jerom", "Klaus")
age
Judith  Jerom  Klaus 
    34     30     43 
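The names are also useful for picking out single elements. A short sketch, assuming the original names James, Hella and Armin; hella and two are hypothetical object names:

```r
age <- c(James = 34, Hella = 30, Armin = 43)

hella <- age[["Hella"]]           # the value only: 30
two   <- age[c("James", "Armin")] # a named vector with two elements

age["Hella"] <- 31                # change one element by its name
age
```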

Converting vectors

The as.*() functions convert vectors between data types:

as.character(1:5)
[1] "1" "2" "3" "4" "5"
as.numeric(c(FALSE, TRUE, FALSE))
[1] 0 1 0
as.logical(c(0,1,0,1,1))
[1] FALSE  TRUE FALSE  TRUE  TRUE
as.numeric(c("4711", "0814", "007"))
[1] 4711  814    7

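But unexpected results may occur. Here "3,2" becomes NA (with a warning) because R expects a point as the decimal separator, and every number other than 0, such as 3, becomes TRUE: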

as.numeric(c("1", "2", "3.1", "3,2"))
[1] 1.0 2.0 3.1  NA
as.logical(c(1,0,1,0,1,3))
[1]  TRUE FALSE  TRUE FALSE  TRUE  TRUE
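One possible way to deal with decimal commas is to replace them with points before converting. This is a sketch, not the only solution; raw, cleaned and numbers are hypothetical object names.

```r
raw <- c("1", "2", "3.1", "3,2")

# Replace the comma with a point; fixed = TRUE treats "," and "." literally
cleaned <- sub(",", ".", raw, fixed = TRUE)
numbers <- as.numeric(cleaned)

numbers   # 1.0 2.0 3.1 3.2
```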

A factor

A factor is a vector for categorical data: it has a fixed set of possible values (levels), and each level has a label.
A factor is created with the factor() function.
The levels argument defines the possible factor levels.
The labels argument defines the corresponding labels.

Example:

sen <- factor(
  c(1, 0, 1, 0, 0, 0), 
  levels =  c(0, 1, 2), 
  labels = c("Without_SEN", "With_SEN", "unclear")
)
sen
[1] With_SEN    Without_SEN With_SEN    Without_SEN Without_SEN Without_SEN
Levels: Without_SEN With_SEN unclear
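A few functions help to inspect a factor. The sketch below rebuilds the factor from the example; counts and codes are hypothetical object names. Note that the level "unclear" is kept although it never occurs in the data.

```r
sen <- factor(
  c(1, 0, 1, 0, 0, 0),
  levels = c(0, 1, 2),
  labels = c("Without_SEN", "With_SEN", "unclear")
)

levels(sen)              # "Without_SEN" "With_SEN" "unclear"
counts <- table(sen)     # frequency of each level: 4, 2, 0
counts
codes <- as.integer(sen) # internal position of each level: 2 1 2 1 1 1
codes
```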

How to build a data frame

Data frames are the standard object for storing research data. They contain variables (columns) and cases (rows). A data frame is created with the data.frame() function.

# For better readability I have inserted additional line breaks and spaces
study <- data.frame(
  sen    = c(0, 1, 0, 1, 0, 1),
  gender = c("M", "M", "F", "M", "F", "F"),
  age    = c(12, 13, 11, 10, 11, 14),
  IQ     = c(90, 85, 90, 87, 99, 89)
)
study
  sen gender age IQ
1   0      M  12 90
2   1      M  13 85
3   0      F  11 90
4   1      M  10 87
5   0      F  11 99
6   1      F  14 89
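Once a data frame exists, a few functions give a quick overview of its size and content. A short sketch using the study data from above:

```r
study <- data.frame(
  sen    = c(0, 1, 0, 1, 0, 1),
  gender = c("M", "M", "F", "M", "F", "F"),
  age    = c(12, 13, 11, 10, 11, 14),
  IQ     = c(90, 85, 90, 87, 99, 89)
)

nrow(study)      # 6 cases
ncol(study)      # 4 variables
names(study)     # "sen" "gender" "age" "IQ"
head(study, 2)   # the first two rows
str(study)       # data type of each variable
```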

Extracting a variable from a data frame

Variables within a data frame are extracted with double square brackets.

study[["sen"]]
[1] 0 1 0 1 0 1
study[["IQ"]]
[1] 90 85 90 87 99 89

An alternative approach is to use the $ sign:

study$sen
[1] 0 1 0 1 0 1
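Single square brackets look similar but behave differently: they return a smaller data frame instead of the vector itself. A minimal sketch with a shortened version of the study data (as_frame and as_vector are hypothetical object names):

```r
study <- data.frame(
  sen = c(0, 1, 0, 1, 0, 1),
  IQ  = c(90, 85, 90, 87, 99, 89)
)

as_frame  <- study["IQ"]     # single brackets: a data frame with one column
as_vector <- study[["IQ"]]   # double brackets: the numeric vector

is.data.frame(as_frame)      # TRUE
is.numeric(as_vector)        # TRUE
mean(study$IQ)               # 90
```

Most functions that calculate something, such as mean(), expect the vector, so [[ or $ is usually what you want.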

Extracting a variable with a variable

You can also store the name or the position of a variable in an object and use that object to extract the variable:

variable <- "gender"
index <- 2
study[[variable]]
[1] "M" "M" "F" "M" "F" "F"
study[[index]]
[1] "M" "M" "F" "M" "F" "F"
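This only works with double square brackets. The $ sign takes the text after it literally and does not look up the content of an object. A short sketch with a reduced version of the study data; the loop is just one possible use of this feature.

```r
study <- data.frame(
  age = c(12, 13, 11, 10, 11, 14),
  IQ  = c(90, 85, 90, 87, 99, 89)
)

variable <- "age"
study$variable      # NULL: there is no column that is literally called "variable"
study[[variable]]   # 12 13 11 10 11 14

# Apply the same function to several variables
for (v in c("age", "IQ")) {
  print(mean(study[[v]]))
}
```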

How to construct a list

Lists are the most versatile data structures in R and are very important for understanding R.

A list is a series of elements with arbitrary data types and structures. A list is constructed with the list() function:

list(1:3, "Hallo!", TRUE)
[[1]]
[1] 1 2 3

[[2]]
[1] "Hallo!"

[[3]]
[1] TRUE

It is best to name list elements:

my_list <- list(
  numbers = 1:3,  
  string = "Hallo!", 
  logical = TRUE
)
my_list
$numbers
[1] 1 2 3

$string
[1] "Hallo!"

$logical
[1] TRUE

Extracting list elements

You can extract a list element with [[ or $ signs:

my_list[["numbers"]]
[1] 1 2 3
my_list[[1]]
[1] 1 2 3
my_list$numbers
[1] 1 2 3
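As with data frames, single square brackets return a (shorter) list rather than the element itself. The sketch below also shows one way to add a new element; sub_list, element and the element name note are hypothetical.

```r
my_list <- list(
  numbers = 1:3,
  string = "Hallo!",
  logical = TRUE
)

sub_list <- my_list["numbers"]     # still a list, with one element
element  <- my_list[["numbers"]]   # the integer vector 1 2 3

is.list(sub_list)   # TRUE
is.list(element)    # FALSE

my_list$note <- "added later"      # assigning to a new name adds an element
length(my_list)                    # 4
```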

Lists can be very complex, with lists nested in lists:

complex_list <- list(
  list_in_list = list(A = 1, B = 1:3),
  list_in_list_of_a_list = list(C = list(D = 4), E = 5)
)
complex_list
$list_in_list
$list_in_list$A
[1] 1

$list_in_list$B
[1] 1 2 3


$list_in_list_of_a_list
$list_in_list_of_a_list$C
$list_in_list_of_a_list$C$D
[1] 4


$list_in_list_of_a_list$E
[1] 5

The str() function displays the structure of an R object:

str(complex_list)
List of 2
 $ list_in_list          :List of 2
  ..$ A: num 1
  ..$ B: int [1:3] 1 2 3
 $ list_in_list_of_a_list:List of 2
  ..$ C:List of 1
  .. ..$ D: num 4
  ..$ E: num 5
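To reach an element deep inside a nested list, you chain the extraction operators from the outside to the inside. A minimal sketch using the list from above (d and b2 are hypothetical object names):

```r
complex_list <- list(
  list_in_list = list(A = 1, B = 1:3),
  list_in_list_of_a_list = list(C = list(D = 4), E = 5)
)

# Step by step with $: outer list -> C -> D
d <- complex_list$list_in_list_of_a_list$C$D
d    # 4

# The same idea with [[ ]], followed by [ ] to pick one value of a vector
b2 <- complex_list[["list_in_list"]][["B"]][2]
b2   # 2
```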