Calculation and Variables#
R as a calculator#
At its most basic, R is a calculator. You can enter a calculation into the console and see the result immediately.
# R is a calculator
1 + 2
3 * 4
Variables#
A core concept in programming, a variable is essentially a named piece of data. That is, when you refer to the variable by its name within the program, you are actually referring to the data stored under that name.
To assign data to a variable in R, use the following syntax:
# Assignment
x <- 2
y <- 3.5
After you have assigned data to variables, you can then use the variables to perform calculations:
# R is a clever calculator
x + 2
x * y
z <- x + x + x
If you need to see the value of a variable in the command line, you can just type its name:
# What is x?
x
Note that variable names are case sensitive and cannot start with a number or an underscore.
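The naming rules above can be sketched as follows. The names `total` and `Total` are hypothetical, chosen only for illustration.

```r
# Names are case sensitive: these are two separate variables
total <- 10
Total <- 20
total + Total

# exists() reports whether a name has been assigned
exists("total")
exists("TOTAL")

# A name that starts with a number is a syntax error,
# so it is shown here only as a comment:
# 2x <- 5
```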
Exercises#
Experiment with the R command line by doing some simple calculations.
Assign different numbers to the variables x and y, and check whether calculations with them work as you expect.
Try a calculation with a variable you haven’t assigned any data to, such as a.
Set x to 1, then run the calculation x <- x + 1. What value does x have afterwards?
Be aware that R has special values for the results of certain calculations. Try dividing by 0, for instance.
Types#
In R, and many other languages, variables also have a type, which defines the sort of data they store. R is actually a bit more complicated because it has modes and classes:
A mode is most like a type in other languages, and determines the type of data stored, such as ‘numeric’ or ‘character’.
A class is a container that describes how the data is arranged and tells functions how to work with the data.
Some modes you might encounter:
numeric - numbers, including integers and floating-point numbers
character - strings
logical - TRUE or FALSE
list - a special mode for containers that hold multiple items of any, possibly different, modes; the container’s own mode is ‘list’
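As a quick illustration of the modes listed above, the sketch below passes one example value of each kind to `mode()`. The variable name `items` is hypothetical.

```r
# mode() returns the mode of a value as a character string
mode(3.5)      # "numeric"
mode(2L)       # "numeric" (integers are numeric too)
mode("word")   # "character"
mode(TRUE)     # "logical"

# A list can hold items of different modes; its own mode is "list"
items <- list(1, "word", TRUE)
mode(items)    # "list"
```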
Some classes you might encounter:
vector - a one-dimensional set of items of the same mode
matrix - a two-dimensional grid of items of the same mode
data.frame - a two-dimensional table whose columns can have different modes
formula - a declaration of how variables are related to each other, for fitting models
factor - a categorical variable
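The classes above can be inspected with `class()`. The sketch below builds one small object of each kind; all variable names and values are hypothetical. Note that a plain vector reports the class of its contents (here "numeric") rather than "vector", so `is.vector()` is used to check it.

```r
v <- c(1, 2, 3)                                # vector
m <- matrix(1:6, nrow = 2)                     # matrix with 2 rows, 3 columns
d <- data.frame(id = 1:2, name = c("a", "b"))  # data.frame
f <- y ~ x                                     # formula (y and x need not exist yet)
g <- factor(c("low", "high", "low"))           # factor

class(v)       # "numeric"
is.vector(v)   # TRUE
class(m)       # includes "matrix"
class(d)       # "data.frame"
class(f)       # "formula"
class(g)       # "factor"
```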
It is sometimes important to know the mode and class of your variable because functions behave differently according to the data they are given. It’s easy to accidentally transform a variable into an unexpected format and then get an unexpected result from the functions you use in your program.
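A minimal sketch of this point: the same multiplication succeeds on a number but fails on a string that merely looks like a number. The names `n`, `s` and `outcome` are hypothetical, and `tryCatch()` is used here only so that the error is captured as a message instead of stopping the script.

```r
n <- 10      # numeric
s <- "10"    # character, even though it looks like a number

n * 2        # 20

# Multiplying a character value raises an error;
# tryCatch() stores the error message instead of halting
outcome <- tryCatch(s * 2, error = function(e) conditionMessage(e))
outcome
```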
Mode detection#
To a certain extent, R will auto-detect what mode a variable should have based on the data. There are convenient functions to check a variable’s mode when you need to.
# Auto-detection of variable mode
x <- 1
y <- "word"
mode(x)
mode(y)
# What if we make a mistake and store the number as a string?
x <- "1"
is.numeric(x)
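If a number has been stored as a string by mistake, one possible fix is to convert it explicitly. The sketch below uses `is.character()` and `as.numeric()`; the name `x_num` is hypothetical.

```r
x <- "1"
is.numeric(x)       # FALSE
is.character(x)     # TRUE

# Convert the string to a number
x_num <- as.numeric(x)
is.numeric(x_num)   # TRUE
x_num + 1           # 2
```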
Vectors#
To create a variable that contains multiple pieces of data, we combine the values with the c() function when we assign them to the variable.
# Creating a vector
x <- c(1, 2, 3)
x
# Shorthand for a sequence of consecutive integers
x <- 1:3
x
# Creating a vector with variables
x <- 1
y <- 2
z <- c(x, y, 3)
z
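Because every item in a vector shares one mode, `c()` converts its arguments to a common mode when they differ. A small sketch, with the hypothetical name `mixed`:

```r
z <- c(1, 2, 3)
length(z)     # 3
mode(z)       # "numeric"

# Mixing modes: everything is converted to character
mixed <- c(1, "two", TRUE)
mode(mixed)   # "character"
mixed
```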
Exercises#
Create a vector containing the numbers 1 to 10
What happens if you add 1 to this variable?
What happens if you multiply the variable by 2?
What happens if you add the variable to itself?
Now create two vectors of the same length containing different numbers, say 1 to 3 and 4 to 6.
What happens when you add or multiply these together?
What happens if you add or multiply two vectors of different lengths?