University of Münster
2025-10-30
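A ggplot graphic has at least three key components:
data : A data frame
aesthetic mappings : Links between variables in the data and visual properties
layers : At least one geom that renders the observations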
Example: displ (displacement) by hwy (highway miles per gallon)
displ : displacement
hwy : highway miles per gallon
drv : f = front-wheel drive, r = rear-wheel drive, 4 = 4wd
cyl : number of cylinders
ggplot() function
The main function is ggplot(). It takes two arguments:
data : A data frame
mapping : Aesthetic mappings provided with the aes() function.
Additional layers are added with a + sign.
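A minimal sketch of such a call, assuming the mpg data frame that ships with ggplot2 and the variables displ and hwy (the object name p is only illustrative):

```r
library(ggplot2)

# data: the mpg data frame shipped with ggplot2
# mapping: displ on the x-axis, hwy on the y-axis
p <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point()  # a layer is added with the + sign

p  # printing the object draws the plot
```

Without the geom_point() layer the call would only draw an empty coordinate system, because no layer tells ggplot how to render the observations.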
Use the mpg data frame.
Create a plot with cty and hwy displayed on the axes,
class mapped to a further aesthetic, and
shape mapped to the variable drv (shape = drv).
Add a geom_point() layer.
Each layer is created with a geom function.
geom_point() : Dots for each data point
geom_line() : Lines connecting each x-axis data point
geom_bar() : Bars
geom_text() : Text at x and y positions
geom_smooth() : Smoothed conditional means
Use the economics data frame.
Plot date and unemployment as a line (geom_line()).
Add the data points (geom_point()).
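One way the line and point layers could be combined, sketched under the assumption that "unemployment" refers to the unemploy column of the economics data frame from ggplot2:

```r
library(ggplot2)

# Assumption: unemployment is stored in the column unemploy
p <- ggplot(economics, aes(x = date, y = unemploy)) +
  geom_line() +   # first layer: a line connecting the observations
  geom_point()    # second layer: a dot for each observation

p
```

Both layers inherit the mapping given in ggplot(), so aes() has to be written only once.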
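geom_bar()
geom_bar() draws bars whose heights count the cases at each value of the x variable.
Use the mpg data frame.
Plot the drv variable as bars.
Colour the bars red with the fill argument.
Set width = 0.8 to resize the bar width.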
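A possible sketch of these steps. The fill and width values are set as constants outside aes(), since they do not depend on the data:

```r
library(ggplot2)

# Only x is mapped; geom_bar() counts the cases per drv category itself
p <- ggplot(mpg, aes(x = drv)) +
  geom_bar(fill = "red", width = 0.8)

p
```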
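geom_col()
With the geom_col() function, bar heights and bar categories are taken from the x and y variables.
Use the starwars database.
Keep only the cases with bmi < 100 (starwars has no bmi column, so it has to be computed first).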
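A hedged sketch of how the geom_col() exercise might be approached. It assumes the starwars data from dplyr, with height in centimetres and mass in kilograms, and the usual BMI formula. The choice of name as the category variable is an assumption made for illustration:

```r
library(ggplot2)
library(dplyr)

# Assumption: bmi = mass (kg) / height (m)^2; height is stored in cm
sw <- starwars %>%
  mutate(bmi = mass / (height / 100)^2) %>%
  filter(bmi < 100)  # filter() also drops rows where bmi is NA

# Bar length is taken from bmi, bar category from name
p <- ggplot(sw, aes(x = bmi, y = name)) +
  geom_col()

p
```

Unlike geom_bar(), nothing is counted here: each bar shows the value stored in the data.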
geom_smooth()
geom_smooth() is used to add smoothed conditional means in scatterplots.
Use the economics data frame.
Plot unemploy by population pop.
Add a geom_smooth layer.
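A sketch of a scatterplot with a smoothed trend. Putting pop on the x-axis and unemploy on the y-axis is an assumption, as is leaving the smoothing method at its default:

```r
library(ggplot2)

# Assumption: pop on the x-axis, unemploy on the y-axis
p <- ggplot(economics, aes(x = pop, y = unemploy)) +
  geom_point() +
  geom_smooth()  # default smoother; ggplot2 reports the method it chose

p
```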
Load the dslabs package.
Use the gapminder data frame.
Group the data by year and continent.
Use the summarize() function to calculate the mean of infant_mortality.
Create a plot with year on x-axis, mean of infant_mortality on y-axis, and continent as line/dot colours.
Add a smooth layer.
Hint: group_by(year, continent)
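One way the gapminder steps could fit together, assuming the dslabs and dplyr packages are installed. The names mortality and mean_mortality are illustrative, and na.rm = TRUE is an assumption added because infant_mortality contains missing values:

```r
library(ggplot2)
library(dplyr)
library(dslabs)

# One row per year and continent with the mean infant mortality
mortality <- gapminder %>%
  group_by(year, continent) %>%
  summarize(
    mean_mortality = mean(infant_mortality, na.rm = TRUE),  # assumption: ignore NAs
    .groups = "drop"
  )

p <- ggplot(mortality, aes(x = year, y = mean_mortality, colour = continent)) +
  geom_point() +
  geom_line() +
  geom_smooth()

p
```

Because colour is mapped inside aes(), the dots, the lines, and the smooth layer are all drawn separately for each continent.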
When many data points share the same level of a categorical variable, simple plots become messy because the points overlap.
Solutions
geom_jitter() : Adds a little random jitter to each data point
geom_boxplot() : Draws a boxplot
geom_violin() : Draws a violin plot
Try each of these with the mpg dataset.
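A short sketch comparing the three geoms on the mpg data. The choice of drv as the categorical variable and hwy as the numeric variable is an assumption made for illustration:

```r
library(ggplot2)

# Shared base: a categorical x variable and a numeric y variable
base <- ggplot(mpg, aes(x = drv, y = hwy))

p_jitter <- base + geom_jitter()   # every point, shifted by a little random noise
p_box    <- base + geom_boxplot()  # one box per drv category
p_violin <- base + geom_violin()   # one density shape per drv category

p_jitter
p_box
p_violin
```

Storing the shared part in base keeps the three variants comparable, since only the geom differs.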
Facets are another basic building block. The plot is drawn multiple times, once for each level of a categorical variable.
Create a variable year by extracting the year from the date variable.
Recode psavert into a categorical variable saving_rate with three levels.
Create a plot with facets by saving_rate.
Hint1: year = format(date, format = "%Y")
Hint2: saving_rate = cut(psavert, breaks = 3, labels = c("Low","Medium","High"))
Hint3: unemployment rate: unemploy / pop * 100
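A sketch that puts the three hints together, assuming the economics data frame from ggplot2. The conversion of year to a number, the column name unemployment_rate, and the use of facet_wrap() rather than facet_grid() are assumptions, not part of the hints:

```r
library(ggplot2)
library(dplyr)

eco <- economics %>%
  mutate(
    # Hint1, wrapped in as.numeric() (assumption) so year is a continuous axis
    year = as.numeric(format(date, format = "%Y")),
    # Hint2: three equal-width intervals of psavert
    saving_rate = cut(psavert, breaks = 3, labels = c("Low", "Medium", "High")),
    # Hint3: unemployment rate in percent
    unemployment_rate = unemploy / pop * 100
  )

p <- ggplot(eco, aes(x = year, y = unemployment_rate)) +
  geom_point() +
  facet_wrap(~ saving_rate)  # one panel per saving_rate level

p
```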
Jürgen Wilbert - Introduction to R