When coding, we often want to control the flow of our actions, particularly when we write a script with detailed instructions.
This can be done by setting actions to occur only if a condition or a set of conditions is met.
There are several ways you can control flow in R. For conditional statements, the basic building blocks are if and else:
# if
if (condition is true) {
  perform action
}
# if ... else
if (condition is true) {
  perform action
} else { # that is, if the condition is false,
  perform alternative action
}
This kind of binary logic is at the heart of classical (non-quantum) computing and, used effectively, can create rich sets of commands.
if and else can be chained together to handle a wide variety of cases, and even to handle errors when we encounter them.
For example, suppose we want to print a message only when x has a particular value:
x <- 8
if (x >= 10) {
  print("x is greater than or equal to 10")
}
x
[1] 8
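The message is not printed because x is less than 10. To print a different message in that case, we can add an else statement:
x <- 8
if (x >= 10) {
  print("x is greater than or equal to 10")
} else {
  print("x is less than 10")
}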
[1] "x is less than 10"
We can also test multiple conditions by using else if:
x <- 8
if (x >= 10) {
  print("x is greater than or equal to 10")
} else if (x > 5) {
  print("x is greater than 5, but less than 10")
} else {
  print("x is less than or equal to 5")
}
[1] "x is greater than 5, but less than 10"
By combining TRUE or FALSE statements, if, else if and else will equip us to handle complex cases.
When R evaluates the condition inside if() statements, it is looking for a logical element, i.e., TRUE or FALSE.
x <- 4 == 3
if (x) {
  "4 equals 3"
} else {
  "4 does not equal 3"
}
[1] "4 does not equal 3"
The second message is returned because x is FALSE:
x <- 4 == 3
x
[1] FALSE
Note: in R 4.2.0 and later, the if() function only accepts inputs of length 1, and therefore returns an error when you give it a vector of length greater than 1.
In earlier versions, the if() function will still run (with a warning), but will only evaluate the condition in the first element of the vector. To use the if() function, you need to make sure your input is of length 1.
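One way to meet the length-1 requirement is to collapse a vector of conditions into a single value with any() or all(). The following is a minimal sketch; the vector `values` and the message variables are made up for illustration.

```r
# `values` is a hypothetical example vector, not part of the lesson data.
values <- c(3, 12, 7)

# values >= 10 is a logical vector of length 3: FALSE TRUE FALSE.
# any() collapses it to a single TRUE/FALSE that if() can accept.
if (any(values >= 10)) {
  msg_any <- "at least one value is greater than or equal to 10"
} else {
  msg_any <- "no value is greater than or equal to 10"
}

# all() is TRUE only when every element satisfies the condition.
if (all(values >= 10)) {
  msg_all <- "every value is greater than or equal to 10"
} else {
  msg_all <- "not every value is greater than or equal to 10"
}

print(msg_any)
print(msg_all)
```

Which of the two to use depends on whether the action should run when some elements meet the condition or only when all of them do.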
The built-in ifelse() function in R handles both the if and the else case in a single call, structured like the previous example.
This function accepts both singular and vector inputs and is structured as follows:
# ifelse function
ifelse(condition is true, perform action, perform alternative action)
The first argument is the condition or set of conditions to be met; the second argument is the expression that is evaluated when the condition is TRUE; and the third argument is the expression that is evaluated when the condition is FALSE.
Consider the following example of the ifelse() function.
Q: can you hypothesize what will be the output of this statement?
y <- -3
ifelse(y < 0, "y is a negative number", "y is either positive or zero")
[1] "y is a negative number"
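Because ifelse() also accepts vector inputs, the same call can label several values at once. A small sketch, using a hypothetical vector `y_values` introduced only for this example:

```r
# `y_values` is a hypothetical example vector.
y_values <- c(-3, 0, 5)

# ifelse() tests each element and returns a vector of the same length.
y_labels <- ifelse(y_values < 0, "negative", "positive or zero")
print(y_labels)
```

The result has one label per element of `y_values`, which is what makes ifelse() convenient for creating new columns in a data frame.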
In many instances in data analysis, we will need to repeatedly perform some operation.
This could be as simple as repeatedly opening a long list of files, taking out a line of data that is needed from each, and compiling all the data into a dataframe.
More complex analysis also often requires complex instructions to be delivered to software in R.
If you want to iterate over a set of values and perform the same operation on each, and the order of iteration is important, a for() loop will do the job.
However, for performance reasons, for() loops should be avoided unless the order of iteration is important. If the order does not matter, vectorized alternatives, or functional tools such as the purrr package, should be used whenever possible.
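As a rough illustration of what a vectorized alternative looks like, the sketch below squares the numbers 1 to 10 three ways. It uses only base R (vapply() rather than purrr) to stay dependency-free, and the variable names are made up for this example.

```r
values <- 1:10

# Loop version: square each value one at a time into a pre-sized vector.
squares_loop <- numeric(length(values))
for (i in seq_along(values)) {
  squares_loop[i] <- values[i]^2
}

# Vectorized version: arithmetic operators act on whole vectors at once.
squares_vec <- values^2

# vapply() applies a function to each element and checks the result type.
squares_apply <- vapply(values, function(v) v^2, numeric(1))

# All three approaches give the same result.
identical(squares_loop, squares_vec)
identical(squares_vec, squares_apply)
```

For an operation this simple the vectorized form is both the shortest and typically the fastest; the loop earns its place when each step depends on the previous one.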
for loops
The basic structure of a for() loop is:
for (iterator in set of values) {
  do a thing
}
For example:
for (i in 1:10) {
  print(i)
}
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] 8
[1] 9
[1] 10
The 1:10 bit creates a vector on the fly; you can iterate over any other vector as well.
We can use a for() loop nested within another for() loop to iterate over two things at once.
for (i in 1:5) {
  for (j in c('a', 'b', 'c', 'd', 'e')) {
    print(paste(i, j))
  }
}
[1] "1 a"
[1] "1 b"
[1] "1 c"
[1] "1 d"
[1] "1 e"
[1] "2 a"
[1] "2 b"
[1] "2 c"
[1] "2 d"
[1] "2 e"
[1] "3 a"
[1] "3 b"
[1] "3 c"
[1] "3 d"
[1] "3 e"
[1] "4 a"
[1] "4 b"
[1] "4 c"
[1] "4 d"
[1] "4 e"
[1] "5 a"
[1] "5 b"
[1] "5 c"
[1] "5 d"
[1] "5 e"
We notice in the output that when the first index (i) is set to 1, the second index (j) iterates through its full set of indices.
Once the indices of j have been iterated through, i is incremented. This process continues until the last index has been used for each for() loop.
Rather than printing the results, we could write the loop output to a new object.
output_vector <- c()
for (i in 1:5) {
  for (j in c('a', 'b', 'c', 'd', 'e')) {
    temp_output <- paste(i, j)
    output_vector <- c(output_vector, temp_output)
  }
}
The output_vector from the last slide thus prints as:
output_vector
[1] "1 a" "1 b" "1 c" "1 d" "1 e" "2 a" "2 b" "2 c" "2 d" "2 e" "3 a" "3 b"
[13] "3 c" "3 d" "3 e" "4 a" "4 b" "4 c" "4 d" "4 e" "5 a" "5 b" "5 c" "5 d"
[25] "5 e"
The last approach can be useful, but 'growing your results' (building the result object incrementally) is computationally inefficient.
Computers handle this badly, so your calculations can very quickly slow to a crawl.
It is much better to define an empty results object of the appropriate dimensions beforehand, rather than initializing an empty object without dimensions, and then fill in the values.
If you know the end result will be stored in a matrix, create an empty matrix with 5 rows and 5 columns, then at each iteration store the result in the appropriate location.
For this example it looks more involved, but it is still more efficient.
output_matrix <- matrix(nrow = 5, ncol = 5)
j_vector <- c('a', 'b', 'c', 'd', 'e')
for (i in 1:5) {
  for (j in 1:5) {
    temp_j_value <- j_vector[j]
    temp_output <- paste(i, temp_j_value)
    output_matrix[i, j] <- temp_output
  }
}
output_vector2 <- as.vector(output_matrix)
output_vector2
[1] "1 a" "2 a" "3 a" "4 a" "5 a" "1 b" "2 b" "3 b" "4 b" "5 b" "1 c" "2 c"
[13] "3 c" "4 c" "5 c" "1 d" "2 d" "3 d" "4 d" "5 d" "1 e" "2 e" "3 e" "4 e"
[25] "5 e"
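Note that output_vector2 is ordered differently from output_vector, because as.vector() reads a matrix column by column. If the original order matters, two possible options are sketched below: transpose the matrix before flattening it, or pre-allocate a vector and track the position with a counter. The names `row_order`, `output_vector3` and `k` are introduced here for illustration only.

```r
j_vector <- c('a', 'b', 'c', 'd', 'e')

# Rebuild the matrix from the example above so this block runs on its own.
output_matrix <- matrix(nrow = 5, ncol = 5)
for (i in 1:5) {
  for (j in 1:5) {
    output_matrix[i, j] <- paste(i, j_vector[j])
  }
}

# Option 1: transpose first, so that reading column by column follows
# the rows of the original matrix.
row_order <- as.vector(t(output_matrix))

# Option 2: pre-allocate a character vector of the final length and fill
# it in place, using a counter k for the current position.
output_vector3 <- character(25)
k <- 1
for (i in 1:5) {
  for (j in j_vector) {
    output_vector3[k] <- paste(i, j)
    k <- k + 1
  }
}

# Both options give the "1 a", "1 b", ... ordering.
identical(row_order, output_vector3)
head(output_vector3)
```

Either way the result object has its final size before the loop starts, so nothing is grown incrementally.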
Sometimes you will need to repeat an operation for as long as a certain condition is met, without knowing in advance how many iterations that will take.
You can do this with a while() loop.
while (this condition is true) {
  do a thing
}
R will interpret a condition being met as “TRUE”.
A loop can also be written to perform an action while a condition is “FALSE”, until it becomes “TRUE”, by using the logical inverse !.
while (!(this condition is true)) {
  do a thing
}
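As a concrete sketch of the negated form, the loop below keeps counting upward while a flag is FALSE and stops once the flag becomes TRUE. The names `found` and `candidate`, and the divisibility test, are hypothetical and chosen only to give the loop something to check.

```r
# `found` is a hypothetical flag; `candidate` counts upward from 1.
found <- FALSE
candidate <- 0

# Keep looping while `found` is FALSE, i.e. until it becomes TRUE.
while (!found) {
  candidate <- candidate + 1
  # The flag flips at the first number divisible by both 3 and 7.
  found <- (candidate %% 3 == 0) && (candidate %% 7 == 0)
}

candidate
```

The loop ends with `candidate` equal to 21, the first number that satisfies both tests.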
As an example, here's a while() loop that generates random numbers from a uniform distribution (the runif() function) between 0 and 1 until it gets one that's less than 0.1.
We will set a random seed for reproducibility of the analysis.
set.seed(10)
z <- 1
while (z > 0.1) {
  z <- runif(1)
  cat(z, "\n")
}
0.5074782
0.3067685
0.4269077
0.6931021
0.08513597
The loop starts with z <- 1 so that the condition (z > 0.1) is initially true; the seed means that each time we re-run the code we will get the same (pseudo-)random result.
Now that we have learned a bit about while() loops, let's consider the following question.
Q: what do you think will be the output of the below chunk of code?
z <- 1
while (z >= 1) {
  z <- z + 1
  cat(z, "\n")
}
A: the loop will never terminate based on its logical condition, as z \( \geq 1 \) for every iteration.
In practice it keeps running until it is interrupted manually; z is a double-precision number, so it does not outgrow memory. This shows how while() loops will not always be appropriate.
You have to be particularly careful that you don't end up stuck in an infinite loop because your condition is always met and hence the while statement never terminates.
In this case we have an obvious error with this loop, but these conditions can be much more subtle and therefore we need to think about how they are satisfied.
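One common safeguard, sketched here under the assumption that a fixed cap is acceptable, is to count iterations and leave the loop with break once a limit is reached. The names `iterations` and `max_iterations`, and the limit of 1000, are illustrative choices rather than anything prescribed by R.

```r
z <- 1
iterations <- 0
max_iterations <- 1000  # hypothetical safety limit chosen for illustration

while (z >= 1) {
  z <- z + 1
  iterations <- iterations + 1
  # Leave the loop once the cap is reached, even though z >= 1 still holds.
  if (iterations >= max_iterations) {
    message("stopping: iteration limit reached")
    break
  }
}

iterations
```

Hitting the cap usually signals that the loop condition needs rethinking, so it is worth reporting it rather than stopping silently.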