R as a calculator and data types


  • The following topics will be covered in this lecture:

    • How to use R as a calculator
    • Variables and data types
    • Vectors and vectorization

R as a calculator

  • R accepts a set of human-readable instructions and converts these into machine language.

  • R can be used simply as a powerful calculator, for example:

    • if we enter a mathematical expression into an R console, R evaluates it and prints the result,
1 + 1
[1] 2

R as a calculator – continued

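  • R uses standard mathematical notation for its operations and follows the standard mathematical order of precedence, listed below from highest to lowest; multiplication and division share one level and are evaluated left to right, as do addition and subtraction: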

  • Parentheses

(1 + 1)
[1] 2
  • Exponents
(1 + 1)^2
[1] 4
  • Division
(1 + 1)^2 / 4
[1] 1

R as a calculator – continued

  • Multiplication
(1 + 1)^2 / 4 * 3
[1] 3
  • Addition
(1 + 1)^2 / 4 * 3 + 1
[1] 4
  • Subtraction
(1 + 1)^2 / 4 * 3 + 1 - 2
[1] 2
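  • The precedence rules can be checked on a few cases that are easy to get wrong. The following is a minimal sketch; the variable names and values are chosen only for illustration and are not part of the lecture:

```r
# Operators on the same level are evaluated left to right.
left_to_right <- 8 / 4 * 2      # (8 / 4) * 2 = 4, not 8 / (4 * 2) = 1

# Exponentiation is applied before a leading minus sign.
neg_square <- -2^2              # -(2^2) = -4
paren_square <- (-2)^2          # 4

# Repeated exponents are grouped from the right.
tower <- 2^3^2                  # 2^(3^2) = 512

print(c(left_to_right, neg_square, paren_square, tower))
```

  • When in doubt, adding parentheses makes the intended order explicit and costs nothing.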

R as a calculator – continued

  • R also has many standard built-in mathematical functions and variables, e.g.,
[1] 0
[1] -1
sin(pi)
[1] 1.224647e-16
  • The notation “ae-16” refers to the mathematical expression \( a \times 10^{-16} \), where \( a \) is the leading coefficient.

  • Notice that R does not evaluate \( \sin(\pi) \) to exactly zero, as it is mathematically, but to an extremely small number.

  • This has to do with the way in which numbers are represented in floating-point arithmetic on a computer; this will be discussed further shortly.

[1] "double"
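  • The storage type of a value can be inspected with “typeof()”. The sketch below uses illustrative values (not taken from the lecture) to show that plain numbers are stored as double-precision floating point unless an integer is requested explicitly:

```r
typeof(1)          # "double": plain numeric literals are doubles
typeof(1L)         # "integer": the L suffix requests an integer
typeof(sin(pi))    # "double"

# Doubles carry roughly 15 to 16 significant digits, so a result such as
# sin(pi) can differ from the exact answer by a tiny rounding error.
print(.Machine$double.eps)   # gap between 1 and the next larger double
```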

Comparing things

  • Not all values in the computing language are numeric, and not all numerical values are built the same.

  • Consider the comparison operator “==” for evaluating if two inputs are the same,

sin(pi) == 0
[1] FALSE
0 == 0
[1] TRUE
  • We can also compare if two inputs are not the same,
1 != 2
[1] TRUE
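  • Because of rounding error, exact equality is often the wrong test for floating-point results. One common alternative, sketched here with illustrative variable names, is to compare within a small tolerance using “all.equal()”:

```r
# Exact comparison fails because sin(pi) carries a tiny rounding error.
exact <- sin(pi) == 0                      # FALSE

# all.equal() compares within a small tolerance; it is wrapped in isTRUE()
# because it returns a descriptive string, not FALSE, on a mismatch.
approx <- isTRUE(all.equal(sin(pi), 0))    # TRUE

# The same issue appears with simple decimal fractions.
decimal_exact <- (0.1 + 0.2) == 0.3                   # FALSE
decimal_approx <- isTRUE(all.equal(0.1 + 0.2, 0.3))   # TRUE

print(c(exact, approx, decimal_exact, decimal_approx))
```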

Comparing things – continued

  • Notice that the outputs of the earlier comparisons are either “TRUE” or “FALSE” – these are examples of logical values, which are the output of logical expressions.
[1] "logical"
  • We can also compare the relative size of different values
1 > 2
[1] FALSE
2 >= 2
[1] TRUE
-1 <= 0
[1] TRUE

Variables and assignment

  • Values such as the output of different expressions can be assigned a variable name,
my_variable <- 2 + 2
  • In the above expression, the operator “<-” tells R to associate the output of the expression \( 2 + 2 \) with the name “my_variable”.
my_variable
[1] 4
  • We can show the current variables in the environment using the command “ls()”
ls()
[1] "my_variable"

Variables and assignment – continued

  • We can re-assign a value to “my_variable”, which will be stored in the environment and memory in place of the old value,
my_variable <- my_variable + my_variable
my_variable
[1] 8
  • Notice that the right-hand side of the assignment operator “<-” is always evaluated first, and only then is the assignment made.

    • This is why, as above, we can define the new value of a variable in terms of its current value.
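  • A typical use of this pattern is a running counter. The following is a small sketch; the name “step_count” is hypothetical:

```r
step_count <- 0                 # hypothetical counter, starts at zero
step_count <- step_count + 1    # right-hand side uses the old value, 0
step_count <- step_count + 1    # right-hand side now uses 1
print(step_count)               # 2
```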

Variables and assignment – continued

  • Key to writing “good” code is to use good variable naming (and commenting).

    • Generally, it is preferable to name variables with something descriptive, e.g.,
mean_sea_surface_temp <- 10
  • For longer names as above, we can use e.g.,

    • underscores;
    • periods; or
mean.sea.surface.temp <- 10
    • capital letters.
meanSeaSurfaceTemp <- 10
  • All the above are commonly used conventions and all are acceptable — the key is to be clear and consistent in your code.

Variables and assignment – continued

  • Q: which of the following do you think are acceptable names for R variables?
  • A: the only ones that are not acceptable are
  • This is because R will not accept a leading underscore, a leading number or a dash in the name.

    • Note, however, that a leading period, as in “.mass”, creates a “hidden” variable, which you typically will not want.
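  • The naming rules can be tested programmatically. The sketch below uses a hypothetical set of candidate names (not the list from the lecture) and relies on “make.names()”, which rewrites a string into a syntactically valid name; a string that comes back unchanged was already valid:

```r
# Hypothetical candidate names, chosen only to illustrate the rules.
candidates <- c("mass", ".mass", "mass_2", "_mass", "2mass", "my-mass")

# A name is valid exactly when make.names() leaves it unchanged.
is_valid <- make.names(candidates) == candidates
print(data.frame(candidates, is_valid))

# A leading period hides the variable from a plain ls() call.
.mass <- 1
visible <- ls()
all_names <- ls(all.names = TRUE)
print(".mass" %in% visible)     # FALSE
print(".mass" %in% all_names)   # TRUE
```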


Vectorization

  • R is a vectorized language, meaning that variables and functions can have vectors as values.

  • A vector in R is an ordered collection of values that all share the same data type.

    • The type of data will become increasingly important as we start using vectors.
  • A simple way to construct a vector is with the constructor function “c()”

c(1, 3, 6)
[1] 1 3 6

Vectorization – continued

  • The function “c()” takes an arbitrary number of elements, as above, and creates a vector.
my_variable <- c(TRUE, pi)
my_variable
[1] 1.000000 3.141593
  • Notice that the output of the above expression looks different from the input, because R forces vectors to have data of a single type:
typeof(my_variable)
[1] "double"
  • Here, the value “TRUE” has been forced into its numeric counterpart “1”.
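  • When types are mixed, “c()” converts every element to the most general type present, in the order logical, integer, double, character. A short sketch with illustrative values:

```r
typeof(c(TRUE, FALSE))   # "logical"
typeof(c(TRUE, 2L))      # "integer": TRUE becomes 1L
typeof(c(2L, pi))        # "double"
typeof(c(pi, "red"))     # "character": pi becomes a string

# With all four types present, every element becomes a character string.
mixed <- c(TRUE, 2L, pi, "red")
print(typeof(mixed))
```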

Vectorization – continued

  • In the last example, we saw that a logical value “TRUE” was forced into a numeric value by the constructor function.

  • This variable “coercion” occurs in various situations, and we need to be careful with the results.

  • Q: what do you expect the result of the following to be?

1 == TRUE
  • A:
1 == TRUE
[1] TRUE
typeof(1)
[1] "double"
typeof(TRUE)
[1] "logical"
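  • Coercion of logical values is often used deliberately, for example to count how many entries are “TRUE”. The sketch below uses a hypothetical vector “flags”, and also contrasts “==” with “identical()”, which checks the type as well as the value:

```r
flags <- c(TRUE, FALSE, TRUE, TRUE)   # hypothetical logical vector

n_true <- sum(flags)       # TRUE counts as 1 and FALSE as 0, giving 3
frac_true <- mean(flags)   # proportion of TRUE values, 0.75

# == coerces its arguments before comparing; identical() does not.
loose <- 1 == TRUE              # TRUE
strict <- identical(1, TRUE)    # FALSE: double versus logical
text_match <- "1" == 1          # TRUE: the number is converted to "1"

print(c(n_true, frac_true))
print(c(loose, strict, text_match))
```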

Vectorization – continued

  • Vectors are built by definition with an order on the data that is stored; each entry can be accessed by its index in square brackets, counting from 1:
my_variable[1]
[1] 1
my_variable[2]
[1] 3.141593
  • Mathematical functions that accept vector arguments are applied element-wise to the vector entries:
sin(my_variable)
[1] 8.414710e-01 1.224647e-16
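  • A few more indexing patterns are sketched below on a hypothetical vector “depths”; the values are made up for illustration:

```r
depths <- c(5, 10, 20, 40)   # hypothetical measurements

first <- depths[1]                 # R counts from 1, so this is 5
last <- depths[length(depths)]     # 40
without_first <- depths[-1]        # a negative index drops an entry: 10 20 40

# Arithmetic and functions such as sqrt() act on each entry in turn.
doubled <- depths * 2              # 10 20 40 80
roots <- sqrt(depths)

print(without_first)
print(doubled)
print(roots)
```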

Vectorization – continued

  • The colon operator “:” allows us to construct vectors automatically based on a range of values, known as a “slice”
my_variable <- 1:5
my_variable
[1] 1 2 3 4 5
  • In general, a slice is written as a:b and returns a vector of all values from a to b in steps of one, counting down if a is larger than b:
10:5
[1] 10  9  8  7  6  5
4:10
[1]  4  5  6  7  8  9 10
  • This is often quite useful for extracting a subset of data from a large vector or matrix.
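  • For ranges that do not advance in steps of one, the built-in “seq()” function is a common alternative to the colon operator. A brief sketch with illustrative values:

```r
typeof(1:5)                          # "integer"
halves <- 1.5:4                      # steps of one starting at a: 1.5 2.5 3.5
quarters <- seq(0, 1, by = 0.25)     # fixed step size: 0, 0.25, 0.5, 0.75, 1
grid <- seq(0, 1, length.out = 3)    # fixed number of points: 0, 0.5, 1
indices <- seq_len(4)                # 1 2 3 4

print(halves)
print(quarters)
print(grid)
print(indices)
```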

Vectorization – continued

  • We can also apply a mathematical operation between a scalar and a vector, which is carried out element-wise over the entries of the vector
2^my_variable
[1]  2  4  8 16 32
  • Or use a vector as the index of a vector
my_variable[2:3]
[1] 2 3
  • The same goes for logical and comparison operators.

  • Q: what do you expect to be the output of the following line?

1:10 > 5
  • A:
1:10 > 5
 [1] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE
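  • A logical vector produced by a comparison can be summarized with a few built-in functions. The sketch below is illustrative, and the variable names are not from the lecture:

```r
is_large <- 1:10 > 5

positions <- which(is_large)   # indices where the condition holds: 6 7 8 9 10
any_large <- any(is_large)     # TRUE: at least one entry exceeds 5
all_large <- all(is_large)     # FALSE: entries 1 to 5 do not

print(positions)
print(c(any_large, all_large))
```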

Vectorization – continued

  • Note that logical vectors are also useful for extracting subsets of data.

    • In particular, we may wish to write a condition, evaluate it on the data, and find all data points that satisfy it.
my_variable <- 1:10
my_index <- my_variable > 5
my_variable[my_index]
[1]  6  7  8  9 10
  • We might also have non-numeric vectors, such as
my_variable <- c('red', 'blue', 'green')
my_variable
[1] "red"   "blue"  "green"
  • For such a vector, a logical statement can also be quite useful,
my_index <- my_variable == 'red'
my_variable[my_index]
[1] "red"
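  • Conditions can be combined, and a condition on one vector can be used to subset another vector of the same length. The following sketch assumes two hypothetical parallel vectors, “colors” and “sizes”:

```r
colors <- c("red", "blue", "green", "red")   # hypothetical labels
sizes <- c(3, 8, 5, 9)                       # hypothetical measurements

# & requires both conditions to hold; | requires at least one.
big_red <- sizes[colors == "red" & sizes > 5]         # 9
red_or_small <- sizes[colors == "red" | sizes < 4]    # 3 9

# %in% tests membership in a set of values.
red_or_blue <- colors[colors %in% c("red", "blue")]   # "red" "blue" "red"

print(big_red)
print(red_or_small)
print(red_or_blue)
```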