7 Function list

7.1 Week 1

| Function | Explanation | Section |
|---|---|---|
| install.packages() | Installs a package. Typically run only once for each new package. | 0.4.5 |
| library() | Loads an installed package. Load packages at the start of every new R session and inside your RMarkdown documents so that they knit properly. | 0.4.5 |
| +, -, *, /, ^ | Basic operators used for arithmetic in R. The order of operations is BEDMAS (brackets, exponents, division, multiplication, addition, subtraction). | 1.5.1.1 |
| sqrt() | Computes the square root of a number. | 1.5.2 |
| abs() | Calculates the absolute value of a number. | 1.5.2 |
| round() | Rounds a number to a specified number of decimal places. | 1.5.2.1 |
| log() | Computes the logarithm of a number (the natural logarithm by default). | 1.5.2.4 |
| rep() | Replicates elements (numbers, vectors, lists) a specified number of times. | 1.5.2.4 |
| print() | Prints the value of a variable to the console. | 1.5.4 |
| seq() | Generates a sequence of numbers from a starting value to an end value by a set increment. | 1.5.3.3 |
| # | Indicates the start of a comment in R code. Everything after # on the same line is ignored by R. Comments let you document your R code with human-readable text. | 1.5.4 |
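
As a quick illustration of the Week 1 functions, the following sketch uses only base R. The variable names and the numbers are made up for this example and do not come from the course material.

```r
# Arithmetic follows BEDMAS: the exponent is evaluated first,
# then the multiplication, then the addition.
result <- 2 + 3 * 4^2                      # 2 + 3 * 16 = 50

root <- sqrt(16)                           # 4
magnitude <- abs(-7.5)                     # 7.5
rounded <- round(3.14159, digits = 2)      # 3.14
natural_log <- log(exp(1))                 # natural logarithm of e, i.e. 1

repeated <- rep(c(1, 2), times = 3)        # 1 2 1 2 1 2
evens <- seq(from = 2, to = 10, by = 2)    # 2 4 6 8 10

print(evens)  # everything after the # on this line is ignored by R
```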

7.2 Week 2

| Function | Explanation | Section |
|---|---|---|
| General functions | | |
| read_csv() | Load a .csv file into R as a data frame. | 2.2 |
| head() | Print the first rows of a data frame to the console (6 by default). | 2.2.1 |
| tail() | Print the last rows of a data frame to the console (6 by default). | 2.2.1 |
| glimpse() | Print a brief summary of a data frame to the console. | 2.2.1 |
| length() | Return the number of elements in a vector. | 2.3.1 and 2.5.1.2 |
| sum() | Calculate the sum of all values in a vector. | 2.5.1.1 |
| c() | Create a vector (combine elements). A vector is a variable that holds multiple elements. | 2.4.2.3 |
| data.frame() | Create a data frame from variables. | 2.5.4 |
| Data wrangling | | |
| $ | Extract a single column/variable from a data frame. | 2.2.1 |
| %>% | The pipe operator allows you to chain multiple operations on a data frame (without saving intermediate results). The pipe operator can be read as “and then.” | 2.3.1 |
| filter() | Keep only those rows of a data frame that satisfy a certain condition, usually based on values in a specific column. | 2.4.2.2 |
| group_by() | Divide a data frame into groups based on the values in a given column. If given two column names, groups based on the first column are divided into subgroups based on the second column. Operations on grouped data frames are performed by (sub)group. | 2.3.1 and 2.5.4.1 |
| summarise() | Create a data frame with summary statistics. Combine with group_by() to calculate summary statistics for groups and subgroups. | 2.3.1 and 2.5.4.1 |
| %in% | Test whether each value of one vector occurs in another vector. Can be used to filter a data frame for multiple values. | 2.4.2.3 |
| Visualization | | |
| ggplot() | Create a plot of the data in a data frame. You can create various kinds of plots with a wide range of aesthetics based on the specification of ggplot layers. Layers are added by using the + operator. | 2.3.2 and 2.3.3 |
| knitr::kable() | Apply to a data frame to print a nicely formatted table in your knitted document. | 2.5.4.1 |
| Descriptive statistics | | |
| mean() | Calculate the mean of a vector. | 2.5.2.1 |
| median() | Calculate the median of a vector. | 2.5.2.2 |
| range() | Calculate the minimum and maximum values of a vector. | 2.5.3.1 |
| var() | Calculate the variance of a vector. Variance is the sum of squared deviations from the mean divided by \(n-1\). | 2.5.3.2 |
| sd() | Calculate the standard deviation of a vector. This is equal to the square root of the variance, i.e. sqrt(var(x)). | 2.5.3.3 |
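
The sketch below combines several of the Week 2 functions in one short pipeline. It assumes the dplyr package is installed; the data frame `scores` and its values are invented for illustration only.

```r
library(dplyr)

# A small, made-up data frame (not course data)
scores <- data.frame(
  group = c("a", "a", "b", "b", "c", "c"),
  score = c(2, 4, 6, 8, 10, 12)
)

# $ extracts a single column as a vector
score_vector <- scores$score

# Variance by hand: sum of squared deviations from the mean, divided by n - 1
manual_var <- sum((score_vector - mean(score_vector))^2) / (length(score_vector) - 1)

# Keep groups "a" and "b", and then summarise each group separately
summary_table <- scores %>%
  filter(group %in% c("a", "b")) %>%
  group_by(group) %>%
  summarise(mean_score = mean(score), sd_score = sd(score))

print(summary_table)
```

Here `manual_var` should agree with `var(score_vector)`, and `sd(score_vector)` with `sqrt(var(score_vector))`, which is a convenient way to check the definitions in the table.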

7.3 Week 3

| Function | Explanation | Section |
|---|---|---|
| cor() | Calculate the correlation between two vectors (the Pearson correlation by default). | 3.3.1 |
| annotate() | ggplot layer to annotate a plot with text and/or symbols. | 3.3.1.1 |
| rnorm() | Generate a vector of random numbers drawn from a normal distribution. Part of the *norm family of functions for the normal distribution. See the textbook for an explanation of the different distribution functions. | 3.3.1.2 |
| facet_wrap() | Divide your plot into multiple facets based on a condition/variable. | 3.3.1.2 |
| select() | Reduce a data frame to only the given columns. | 3.4.3 |
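
To make cor() and the *norm family more concrete, here is a minimal base R sketch. The seed, sample size, and the way `y` is built from `x` are arbitrary choices for this example, not values from the course.

```r
set.seed(10)  # arbitrary seed so the random draws are reproducible

# Two related variables: y is x scaled by 2 plus some random noise
x <- rnorm(100, mean = 0, sd = 1)
y <- 2 * x + rnorm(100, mean = 0, sd = 0.5)

# Pearson correlation (the default method of cor())
r <- cor(x, y)
print(r)

# Other members of the *norm family for the standard normal distribution
density_at_zero <- dnorm(0)   # height of the density curve at 0
prob_below_zero <- pnorm(0)   # P(X <= 0) = 0.5
median_value <- qnorm(0.5)    # value with half the probability below it = 0
```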

7.4 Week 4

| Function | Explanation | Section |
|---|---|---|
| for (i in x:y) { } | Programmatically loop over the code between { }, i.e. evaluate the code once for each value of i from a start index (x) to an end index (y). | 4.2.1 |
| sample() | Take a random sample of the elements in a vector. | 4.3.1 |
| rbinom() | Generate a vector of random numbers drawn from a binomial distribution. Part of the *binom family of functions for the binomial distribution. See the textbook for an explanation of the different distribution functions. | 4.3.2 |
| set.seed() | Set the state of the random number generator in R. Typically used to make random sampling reproducible, so that you get the same results across R sessions. | 4.4.4 |
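
The following base R sketch shows the Week 4 functions working together. The seed value 42 and all variable names are arbitrary and only serve this illustration.

```r
# Loop from a start index (1) to an end index (5), adding up the indices
total <- 0
for (i in 1:5) {
  total <- total + i
}
# total is now 1 + 2 + 3 + 4 + 5 = 15

# Setting the same seed before sampling gives the same random sample
set.seed(42)
first_draw <- sample(1:10, size = 3)
set.seed(42)
second_draw <- sample(1:10, size = 3)
same_sample <- identical(first_draw, second_draw)  # TRUE

# Ten simulated coin flips: each value is 0 or 1 with probability 0.5
coin_flips <- rbinom(n = 10, size = 1, prob = 0.5)
print(coin_flips)
```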

7.5 Week 5

| Function | Explanation | Section |
|---|---|---|
| replicate() | Evaluate an expression multiple times. Analogous to rep() but for functions/expressions instead of vectors. | 5.3.7 |
| pivot_longer() | Pivot data from wide format to long format. Many tidyverse functions (such as ggplot()) expect long format data. | 5.4 |
| pivot_wider() | Pivot data from long format to wide format. | 5.4 |
| mutate() | Add a new variable to a data frame, or modify an existing one, typically computed from existing variables. | 5.4 |
| n() | Count the number of observations in the current group. Used inside functions such as summarise() and mutate(). | 5.4 |
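
As a rough sketch of how the Week 5 functions fit together, the example below reshapes a tiny made-up data frame. It assumes the dplyr and tidyr packages are installed; the column names `pre`, `post`, `time`, and `score` are hypothetical.

```r
library(dplyr)
library(tidyr)

# replicate() re-evaluates the expression each time, so every mean
# comes from a fresh random sample (rep() would repeat a single value)
set.seed(1)
sample_means <- replicate(1000, mean(rnorm(10)))

# Made-up wide data: one row per participant, one column per time point
wide <- data.frame(id = 1:3, pre = c(5, 6, 7), post = c(7, 9, 8))

# Wide to long: one row per participant and time point
long <- wide %>%
  pivot_longer(cols = c(pre, post), names_to = "time", values_to = "score")

# n() counts the observations within each group
counts <- long %>%
  group_by(time) %>%
  summarise(n_obs = n(), mean_score = mean(score))

# Long back to wide, then add a new variable computed from existing ones
wide_again <- long %>%
  pivot_wider(names_from = time, values_from = score) %>%
  mutate(change = post - pre)
```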

7.6 Week 6

| Function | Explanation | Section |
|---|---|---|
| t.test() | Performs one-sample and two-sample t-tests on vectors of data. | 6.3 |
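
A minimal sketch of both uses of t.test() follows. The data are simulated with arbitrary means and standard deviations chosen for this example, so the numbers carry no meaning beyond illustration.

```r
set.seed(2024)  # arbitrary seed for reproducible simulated data

# Two simulated groups of 30 observations each
group_a <- rnorm(30, mean = 100, sd = 15)
group_b <- rnorm(30, mean = 110, sd = 15)

# One-sample t-test: does the mean of group_a differ from 100?
one_sample <- t.test(group_a, mu = 100)

# Two-sample t-test comparing the group means
# (by default R does not assume equal variances, i.e. a Welch test)
two_sample <- t.test(group_a, group_b)

print(one_sample$p.value)
print(two_sample$statistic)
```

Both calls return an object whose components (such as `statistic`, `parameter`, and `p.value`) can be extracted with `$`.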