Calculation and Variables

R as a calculator

Using R at its most basic, it is a calculator. You can enter a calculation into the console and immediately evaluate the result.

# R is a calculator
1 + 2
3 * 4

Variables

A core concept in programming, a variable is essentially a named piece of data. That is, when you refer to the variable by its name within the program, you are actually referring to the data stored under that name.

To assign data to a variable in R, use the following syntax:

# Assignment
x <- 2
y <- 3.5

After you have assigned data to variables, you can then use the variables to perform calculations:

# R is a clever calculator
x + 2
x * y
z <- x + x + x

If you need to see the value of a variable in the command line, you can just type its name:

# What is x?
x

Note that variable names are case sensitive, and cannot start with a number.

Exercises

  • Experiment for yourself with the R command line to do some simple calculations

  • Assign some different numbers to the variables x and y and check if calculations with them work as you expect

  • Try to do a calculation with a variable you haven’t assigned any data to, a for instance

  • Set x to 1, then check what happens when you run the calculation x <- x + 1, what value is x afterwards?

  • Be aware that R has special values for certain calculations - try dividing by 0 for instance.

Types

In R, and many other languages, variables also have a type, which defines the sort of data they store. R is actually a bit more complicated because it has modes and classes:

  • A mode is most like a type in other languages, and determines the type of data stored, such as ‘numeric’ or ‘character’.

  • A class is a container that describes how the data is arranged and tells functions how to work with the data.

Some modes you might encounter:

  • numeric - numbers, including integers and floating points numbers

  • character - strings

  • logical - TRUE or FALSE

  • list - a special mode for containing multiple items of any, possibly different, mode(s), whose mode becomes ‘list’

Some classes you might encounter:

  • vector - a one-dimensional set of items of the same mode

  • matrix - a multidimensional set of items of the same mode

  • data.frame - a two-dimensional table with columns of different modes

  • formula - a declaration of how variables are related to each other, for fitting models

  • factor - a categorical variable

The reason that it is sometimes important to know what mode and class your variable has, is that functions behave differently according to the data they are given. It’s easy to accidentally transform your variable into an unexpected format and then get an unexpected result from the functions you use in your program.

Mode detection

To a certain extent, R will auto-detect what mode a variable should have based on the data. There are convenient functions to check a variable’s mode when you need to.

# Auto-detection of variable mode
x <- 1
y <- "word"

mode(x)
mode(y)

# What about if we make a mistake
x <- "1"

is.numeric(x)

Vectors

If we want to create a variable that contains multiple pieces of data, we must make a declaration when we assign data to the variable.

# Creating a vector
x <- c(1, 2, 3)
x

# Lazy sequences
x <- 1:3
x

# Creating a vector with variables
x <- 1
y <- 2
z <- c(x, y, 3)
z

Exercises

  • Create a vector containing the numbers 1 to 10

  • What happens if you add 1 to this variable?

  • What happens if you multiple the variable by 2?

  • What happens if you add the variable to itself?

  • Now create two vectors of the same length containing different numbers, say 1 to 3 and 4 to 6.

  • What happens when you add or multiply these together?

  • What happens if you add or multiply two vectors of different lengths?

+ show/hide code