08/26/2020
The following topics will be covered in this lecture:
This course will lean heavily on programming;
This course does not assume that you are already familiar with programming;
Students are encouraged to use the lessons in Software Carpentry as a free reference for scientific programming in R.
In the following we will go through a tour of RStudio.
To follow along with this video tutorial, you need to download both of
As a prerequisite to this course, you need to have access to a computer where you can use R and RStudio, as well as download data and install packages.
An essential element of this class is to use modern statistical software to solve real-world problems.
In our activities in class, and in the project assignments, you will be expected to exercise basic skills in R for statistical analysis, including documenting your work.
We will now begin the tutorial in RStudio.
install.packages('faraway')
The “install.packages()” function downloads and installs the named package using R's package manager.
When a package has already been installed, but we want to use it in our current session, we can simply call
require(faraway)
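As a hedged aside, “require()” and “library()” both attach an installed package, but they behave differently when the package is missing. The sketch below illustrates this with a made-up package name; `aPackageThatDoesNotExist` is an assumption chosen for illustration and is presumed not to be installed.

```r
# Hypothetical package name, assumed NOT to be installed on your system.
pkg_name <- "aPackageThatDoesNotExist"

# require() returns FALSE when the package cannot be attached ...
require_ok <- suppressWarnings(
  require(pkg_name, character.only = TRUE, quietly = TRUE)
)

# ... whereas library() stops with an error, which we catch here.
library_result <- tryCatch(
  {
    library(pkg_name, character.only = TRUE)
    "attached"
  },
  error = function(e) "error: package not found"
)

require_ok       # FALSE
library_result   # "error: package not found"
```

Because a missing package fails loudly with “library()”, it is often the safer choice at the top of a script; once the package is installed, either call works.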
?install.packages
??install.packages
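The two lines above use R's help operators. As a brief sketch, the function-call forms below are what these shorthands stand for; the variable names `help_page` and `search_results` are illustrative only.

```r
# ?install.packages is shorthand for help("install.packages"):
# it looks up the documentation page for an exact topic name.
help_page <- help("install.packages")

# ??install.packages is shorthand for help.search("install.packages"):
# it searches all installed documentation for matching text.
search_results <- help.search("install.packages")

# In an interactive session, printing either object opens the result.
# print(help_page)
# print(search_results)
```

Use “?” when you know the exact name of a function, and “??” when you only know a keyword.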
R accepts a set of human-readable instructions and converts these into machine language.
R can be used simply as a powerful calculator, for example:
1 + 1
[1] 2
R uses standard mathematical notations for its operations, and follows the standard mathematical order of precedence:
Parentheses
(1 + 1)
[1] 2
(1 + 1)^2
[1] 4
(1 + 1)^2 / 4
[1] 1
(1 + 1)^2 / 4 * 3
[1] 3
(1 + 1)^2 / 4 * 3 + 1
[1] 4
(1 + 1)^2 / 4 * 3 + 1 - 2
[1] 2
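To make the order of precedence above more concrete, here is a small sketch of cases where parentheses change the result. The variable names are illustrative and not part of the lecture.

```r
# Multiplication is evaluated before addition.
no_parens   <- 2 + 3 * 4      # 2 + 12 = 14
with_parens <- (2 + 3) * 4    # 5 * 4  = 20

# Exponentiation is evaluated before the unary minus sign.
neg_square  <- -2^2           # -(2^2) = -4
square_neg  <- (-2)^2         # 4

# Repeated exponentiation groups from right to left.
tower       <- 2^3^2          # 2^(3^2) = 2^9 = 512

c(no_parens, with_parens, neg_square, square_neg, tower)
```

When in doubt, adding explicit parentheses costs nothing and makes the intended order clear to the reader.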
log(1)
[1] 0
cos(pi)
[1] -1
sin(pi)
[1] 1.224647e-16
The notation “ae-16” refers to the mathematical expression \( a \times 10^{-16} \), where \( a \) is the leading coefficient.
Notice that R does not evaluate \( \sin(\pi) \) to exactly zero, its true mathematical value, but instead returns an extremely small number.
This has to do with the way in which numbers are encoded into programming languages – this will be discussed further shortly.
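The same effect appears in ordinary decimal arithmetic. The following is a minimal sketch showing that some decimal fractions cannot be stored exactly in double precision, and that the error in sin(pi) is below the machine's rounding unit.

```r
# 0.1, 0.2 and 0.3 have no exact binary representation,
# so the sum differs from 0.3 in the last stored digits.
sum_is_exact <- (0.1 + 0.2 == 0.3)      # FALSE
sum_digits   <- sprintf("%.17f", 0.1 + 0.2)

# .Machine$double.eps is the gap between 1 and the next larger double.
machine_eps  <- .Machine$double.eps     # about 2.22e-16

# The computed sin(pi) is smaller in magnitude than this gap.
sin_is_tiny  <- abs(sin(pi)) < machine_eps

sum_is_exact
sum_digits      # "0.30000000000000004"
sin_is_tiny     # TRUE
```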
typeof(sin(pi))
[1] "double"
Not all values in the computing language are numeric, and not all numerical values are built the same.
Consider the comparison operator “==” for evaluating if two inputs are the same,
sin(pi) == 0
[1] FALSE
0 == 0
[1] TRUE
This shows one of the dangers of trusting computer arithmetic to be exact: because sin(pi) is a double-precision floating-point approximation, the comparison operator does not recognize it as equal to zero.
To compare two numeric values in R up to a small numerical tolerance, a better approach is to use
all.equal(sin(pi), 0)
[1] TRUE
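One caveat worth sketching: when the two values differ, “all.equal()” does not return FALSE but a character string describing the difference. Wrapping the call in “isTRUE()” gives a plain logical value that is safe to use in conditions. The variable names below are illustrative.

```r
# Within the default tolerance, all.equal() returns TRUE.
close_enough <- all.equal(1, 1 + 1e-10)

# Outside the tolerance, it returns a description, not FALSE.
difference   <- all.equal(1, 1.1)    # "Mean relative difference: 0.1"

# isTRUE() turns either outcome into a single TRUE or FALSE.
safe_false   <- isTRUE(all.equal(1, 1.1))
safe_true    <- isTRUE(all.equal(sin(pi), 0))

close_enough
difference
safe_false    # FALSE
safe_true     # TRUE
```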
1 != 2
[1] TRUE
sin(pi) != 0
[1] TRUE
typeof(TRUE)
[1] "logical"
1 > 2
[1] FALSE
2 >= 2
[1] TRUE
-1 <= 0
[1] TRUE
my_variable <- 2 + 2
my_variable
[1] 4
ls()
[1] "my_variable"
my_variable = my_variable + my_variable
my_variable
[1] 8
Notice that the right-hand side of the assignment operator (“<-”, or equivalently “=” as in the last example) is always evaluated first, and the result is then assigned to the variable on the left.
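As a small sketch of this evaluation order, the example below updates a variable using its own previous value, and shows that assignment stores a value rather than a link to another variable. The names `counter`, `first_value`, and `second_value` are illustrative.

```r
# The right-hand side is evaluated with the OLD value before reassignment.
counter <- 1
counter <- counter * 10
counter                # 10

# Assignment stores a value, not a link to another variable.
first_value  <- 5
second_value <- first_value
first_value  <- 100
second_value           # still 5
```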
Key to writing “good” code is to use good variable naming (and commenting).
mean_sea_surface_temp <- 10
For longer names as above, we can also separate words with periods or use camel case, e.g.,
mean.sea.surface.temp <- 10
meanSeaSurfaceTemp <- 10
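To illustrate why naming and commenting matter, here is a hedged sketch of the same calculation written twice. The temperature conversion and the variable names with units are assumptions added for illustration.

```r
# Hard to read: what do x and y represent, and in which units?
x <- 10
y <- x * 9 / 5 + 32

# Easier to read: names state the quantity and the unit,
# and the comment explains the intent of the calculation.
mean_sea_surface_temp_celsius <- 10
# Convert from degrees Celsius to degrees Fahrenheit.
mean_sea_surface_temp_fahrenheit <- mean_sea_surface_temp_celsius * 9 / 5 + 32

mean_sea_surface_temp_fahrenheit   # 50
```

Whichever naming convention you choose, using it consistently throughout a script matters more than the particular style.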
R is a vectorized language, meaning that variables and functions can have vectors as values.
A vector in R is an ordered collection of values that all share the same data type.
A simple way to construct a vector is with the constructor function “c()”:
c(1, 3, 6)
[1] 1 3 6
my_variable <- c(TRUE, pi)
my_variable
[1] 1.000000 3.141593
typeof(my_variable)
[1] "double"
In the last example, we saw that a logical value “TRUE” was forced into a numeric value by the constructor function.
This variable “coercion” occurs in various situations, and we need to be careful with the results.
Q: what do you expect the result of the following to be?
1 == TRUE
[1] TRUE
typeof(1)
[1] "double"
typeof(TRUE)
[1] "logical"
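The coercion seen above follows a fixed direction: logical values become numbers, and numbers become character strings, whenever types are mixed. A minimal sketch, with illustrative variable names:

```r
# Mixing types in c() coerces everything to the most general type.
type_logical_integer <- typeof(c(TRUE, 2L))     # "integer"
type_integer_double  <- typeof(c(2L, 3.5))      # "double"
type_double_char     <- typeof(c(3.5, "a"))     # "character"

# TRUE counts as 1 and FALSE as 0 in arithmetic.
number_of_trues <- sum(c(TRUE, FALSE, TRUE))    # 2

# "==" coerces before comparing; identical() also checks the type.
equal_after_coercion <- (1 == TRUE)             # TRUE
strictly_identical   <- identical(1, TRUE)      # FALSE
```

When both the value and the type need to match, “identical()” is the stricter check.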
my_variable[1]
[1] 1
my_variable[2]
[1] 3.141593
sin(my_variable)
[1] 8.414710e-01 1.224647e-16
my_variable <- 1:5
my_variable
[1] 1 2 3 4 5
10:5
[1] 10 9 8 7 6 5
4:10
[1] 4 5 6 7 8 9 10
2^my_variable
[1] 2 4 8 16 32
my_variable[2:3]
[1] 2 3
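The examples above apply one operation to every element of a vector. As a further sketch, arithmetic between two vectors of the same length is also carried out element by element, and index vectors can select or drop several elements at once. The data values below are made up for illustration.

```r
# Hypothetical measurements, for illustration only.
heights_cm <- c(150, 160, 170)
offsets_cm <- c(1, 2, 3)

# Element-wise addition of two vectors of equal length.
adjusted_cm <- heights_cm + offsets_cm   # 151 162 173

# A single number is applied to every element.
heights_m <- heights_cm / 100            # 1.5 1.6 1.7

# A vector of positions selects several elements.
first_and_third <- heights_cm[c(1, 3)]   # 150 170

# A negative position drops that element.
without_first <- heights_cm[-1]          # 160 170
```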
Element-wise evaluation likewise applies to comparison operators, which return one logical value per element.
Q: what do you expect to be the output of the following line?
1:10 > 5
[1] FALSE FALSE FALSE FALSE FALSE TRUE TRUE TRUE TRUE TRUE
Note that logical vectors are also useful for extracting subsets of data.
my_variable <- 1:10
my_index <- my_variable>5
my_variable[my_index]
[1] 6 7 8 9 10
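Logical indices can also be combined. The sketch below uses “&” (and) and “|” (or) to build compound conditions, and shows two related helpers, “which()” and “sum()”. The variable name `values` is illustrative.

```r
values <- 1:10

# Keep elements satisfying BOTH conditions.
middle_values <- values[values > 3 & values <= 7]   # 4 5 6 7

# Keep elements satisfying EITHER condition.
outer_values  <- values[values < 3 | values > 8]    # 1 2 9 10

# which() returns the positions where the condition is TRUE.
positions     <- which(values > 5)                  # 6 7 8 9 10

# sum() of a logical vector counts the TRUE entries.
count_above_5 <- sum(values > 5)                    # 5
```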
my_variable <- c('red', 'blue', 'green')
my_variable
[1] "red" "blue" "green"
my_index <- my_variable == 'red'
my_variable[my_index]
[1] "red"
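The same idea extends to matching against several values at once. As a minimal sketch, the “%in%” operator tests each element for membership in a set; the name `color_names` is illustrative and chosen to avoid masking any built-in function.

```r
color_names <- c('red', 'blue', 'green')

# TRUE for each element that appears in the set on the right.
is_warm_or_green <- color_names %in% c('red', 'green')   # TRUE FALSE TRUE

# Use the logical vector to extract the matching elements.
selected_colors <- color_names[is_warm_or_green]         # "red" "green"

# "!=" selects everything except a given value.
not_red <- color_names[color_names != 'red']             # "blue" "green"
```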