Introduction to R Programming

07/06/2017

R basics

Materials

You can find the materials for today's course on my website. Either go to the Moodle page for MY591 and click on the links, or:

Assignment: jackblumenau.com/assignment.html
Slides: jackblumenau.com/introtor.html

It is worth noting that both the assignment and the slides were all created using R and RStudio. So was my website.

Beat that, STATA…

I have used STATA for years, why learn R?

Because:

R is free
R is open source and has excellent package support and development
R is (easily) customisable
R is very good for data visualisation
R can accomodate more than one data matrix in memory at the same time
You are about to learn some methods that are basically only available in R

On the other hand:

R has a steep(er) learning curve

R and RStudio

As in STATA, you can either write commands directly into the console, or use a script (do file) to save commands and then run them
It is perfectly possible to use the base R environment for everything, but most people do not do this
Today we will mostly be using RStudio, which is a graphical user interface (GUI) for R
RStudio has a host of useful features which make it an excellent environment for learning R
- Syntax highlighting
- Easy data overview
- Looks a bit like STATA…

RStudio demonstration

Basic operations in R

1 + 1

## [1] 2

5 - 3

## [1] 2

6/2

## [1] 3

4*4

## [1] 16

Basic operations in R

Other typical mathematical functions are also hard-coded:

log(<number>)
exp(<number>)
sqrt(<number>)
mean(<numbers>)
sum(<numbers>)

Basic operations in R

R also understands logical operators:

< : less than
> : greater than
== : equal to (note, not =)
>= : greater than or equal to
<= : less than or equal to
!= : not equal to
& : and
| : or

R `objects`

R is an object-oriented programming language
The entities that R creates and manipulates are known as objects
These may be variables, arrays of numbers, character strings, functions
Objects are created using the assignment operator: <-
Once created, an object is stored in your current workspace

my_object <- 10
print(my_object)

## [1] 10

my_other_object <- 4
print(my_object - my_other_object)

## [1] 6

Data Structures

`vector` - the basic building block of R.

A vector is a collection of entites which all share the same type.

We use the c() function to concatenate observations into a single vector.

Numeric

num_vec <- c(5, 4, 2, 100, 7.65)
print(num_vec)

## [1]   5.00   4.00   2.00 100.00   7.65

Character

char_vec <- c("apple", "pear", "plumb", "pineapple", "strawberry")
print(char_vec)

## [1] "apple"      "pear"       "plumb"      "pineapple"  "strawberry"

`vector` - the basic building block of R.

Logical

log_vec <- c(FALSE, FALSE, FALSE, TRUE, TRUE)
print(log_vec)

## [1] FALSE FALSE FALSE  TRUE  TRUE

as.numeric(log_vec)

## [1] 0 0 0 1 1

`vector` - the basic building block of R.

Factor

A factor is similar to a character vector, but here each unique element is associated with a numerical value which represents a category:

fac_vec <- as.factor(c("a", "b","c","a","b","c"))
fac_vec

## [1] a b c a b c
## Levels: a b c

as.numeric(fac_vec)

## [1] 1 2 3 1 2 3

`vector` subsetting

To subset a vector, use square parenthesis to index the elements you would like via object[index]:

Numerical subsetting

char_vec

## [1] "apple"      "pear"       "plumb"      "pineapple"  "strawberry"

char_vec[1:3]

## [1] "apple" "pear"  "plumb"

`vector` subsetting

Logical subsetting

log_vec

## [1] FALSE FALSE FALSE  TRUE  TRUE

char_vec[log_vec]

## [1] "pineapple"  "strawberry"

`vector` operations

In R, mathematical operations on vectors occur elementwise:

fib <- c(1, 1, 2, 3, 5, 8, 13, 21)
fib[1:7]

## [1]  1  1  2  3  5  8 13

fib[2:8]

## [1]  1  2  3  5  8 13 21

fib[1:7] + fib[2:8]

## [1]  2  3  5  8 13 21 34

`vector` operations

It is also possible to perform logical operations on vectors:

fib <- c(1, 1, 2, 3, 5, 8, 13, 21)
fib_greater_five <- fib > 5
print(fib_greater_five)

## [1] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE

And we can combine logical operators

fib_greater_five_less_two <- fib > 5 | fib < 2
print(fib_greater_five_less_two)

## [1]  TRUE  TRUE FALSE FALSE FALSE  TRUE  TRUE  TRUE

`matrix`

A matrix is just a collection of vectors! I.e. it is a multidimensional vector where all elements are of the same type

my_matrix <- matrix(data = 1:100, nrow = 10, ncol = 10)
my_matrix

##       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
##  [1,]    1   11   21   31   41   51   61   71   81    91
##  [2,]    2   12   22   32   42   52   62   72   82    92
##  [3,]    3   13   23   33   43   53   63   73   83    93
##  [4,]    4   14   24   34   44   54   64   74   84    94
##  [5,]    5   15   25   35   45   55   65   75   85    95
##  [6,]    6   16   26   36   46   56   66   76   86    96
##  [7,]    7   17   27   37   47   57   67   77   87    97
##  [8,]    8   18   28   38   48   58   68   78   88    98
##  [9,]    9   19   29   39   49   59   69   79   89    99
## [10,]   10   20   30   40   50   60   70   80   90   100

`data.frame`

A data.frame is a matrix-like R object in which the columns can be of different types

my_data_frame <- data.frame(numbers = num_vec, fruits = char_vec, logical = log_vec)
my_data_frame

##   numbers     fruits logical
## 1    5.00      apple   FALSE
## 2    4.00       pear   FALSE
## 3    2.00      plumb   FALSE
## 4  100.00  pineapple    TRUE
## 5    7.65 strawberry    TRUE

`matrix` and `data.frame` subsetting

To subset a matrix or data.frame, you need to specify both rows and columns:

my_matrix[1:3, 1:3]

##      [,1] [,2] [,3]
## [1,]    1   11   21
## [2,]    2   12   22
## [3,]    3   13   23

my_data_frame[1, ]

##   numbers fruits logical
## 1       5  apple   FALSE

`matrix` and `data.frame` subsetting

We can also subset to remove rows or columns that we do not want to see by using the - operator applied to the c function:

my_matrix[-c(1:3), -c(1:3)]

##      [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## [1,]   34   44   54   64   74   84   94
## [2,]   35   45   55   65   75   85   95
## [3,]   36   46   56   66   76   86   96
## [4,]   37   47   57   67   77   87   97
## [5,]   38   48   58   68   78   88   98
## [6,]   39   49   59   69   79   89   99
## [7,]   40   50   60   70   80   90  100

In the code, 1:3 creates a vector of the integers 1, 2 and 3, and the - operator negates these. We wrap the vector in the c function so that - applies to each element, and not just the first element.

`list`

A list is a collection of any set of object types

my_list <- list(something = num_vec, 
                another_thing = my_matrix[1:3,1:3], 
                something_else = "jack")
my_list

## $something
## [1]   5.00   4.00   2.00 100.00   7.65
## 
## $another_thing
##      [,1] [,2] [,3]
## [1,]    1   11   21
## [2,]    2   12   22
## [3,]    3   13   23
## 
## $something_else
## [1] "jack"

`functions`

R makes extensive use of functions, which all have the same basic structure.

function_name(argument_one, argument_two, ...)

Where

function_name is the name of the function
argument_one is the first argument passed to the function
argument_two is the seccond argument passed to the function

`function` example

Let's consider the mean() function. This function takes two main arguments:

mean(x, na.rm = FALSE)

Where x is a numeric vector, and na.rm is a logical value that indicates whether we'd like to remove missing values (NA).

vec <- c(1,2,3,4,5)
mean(x = vec, na.rm = FALSE)

## [1] 3

vec <- c(1,2,3,NA,5)
mean(x = vec, na.rm = TRUE)

## [1] 2.75

`function` example

We can also perform calculations on the output of a function:

mean(vec) * 3

## [1] 9

Which means that we can also have nested functions:

sqrt(mean(vec))

## [1] 1.732051

We can also assign the output of any function to a new object for use later:

sqrt_mean_of_vec <- sqrt(mean(vec))

User defined `functions`

Functions are also objects, and we can create our own. We define a function as follows:

my_addition_function <- function(a, b){
  return(a + b)
}

my_addition_result <- my_addition_function(a = 5, b = 50)
print(my_addition_result)

## [1] 55

Note that the function itself, and the result returned by the function are both objects!

Help!

R has an inbuilt help facility which provides more information about any function:

help(summary)
?summary

The quality of the help varies hugely as some packages are well-maintained and others less so.

Alternative sources of help:

www.stackoverflow.com

www.google.com

Exercise One

Data manipulation and exploration

Reading data into R (.csv)

my_data <- read.csv(file = "my_file.csv")

my_data is an R data.frame object (you could call this anything)
my_file.csv is a .csv file with your data
<- is the assignment operator
In order for R to access my_file.csv, it will have to be saved in your current working directory
Use get_wd() to check your current working directory
Use set_wd() to change your current working directory

Reading data into R (.dta)

install.packages("foreign") ## install the "foreign" package
library(foreign) ## Loads an additional package that deals with STATA files
my_data <- read.dta(file = "my_file.dta")

The foreign package is not automatically installed in R so we need to do that ourselves
my_data is still an R data.frame object (you could still call this anything)
my_file.dta is a .dta file with your data
<- is still an assignment operator

Descriptive statistics functions

R includes a number of helpful functions for exploring your data

View(my_data) # View data in spreadsheet-style
names(my_data) # Names of the variables in your data
head(my_data) # First six rows of the data
tail(my_data) # Last six rows of the data
str(my_data) # "Structure" of any R object
summary(my_data) # Summary statistics for each of the columns in the data
dim(my_data) # Dimensions of a matrix
range(my_data$my_variable) # Range of a numeric vector
quantile(my_data$my_variable) # Quantiles of a numeric vector

`head`

head(my_data)

##            x         y         z g
## 1 -0.6254612 1.6362552 0.3989290 d
## 2 -1.3203506 0.5363096 0.1101269 e
## 3  1.7908249 0.2674716 0.5642690 f
## 4 -0.3554073 1.5243062 0.2920953 e
## 5  1.7603602 0.9080044 0.2631187 d
## 6 -0.2914242 0.8280291 0.1351559 b

`str`

str(my_data)

## 'data.frame':    1000 obs. of  4 variables:
##  $ x: num  -0.625 -1.32 1.791 -0.355 1.76 ...
##  $ y: num  1.636 0.536 0.267 1.524 0.908 ...
##  $ z: num  0.399 0.11 0.564 0.292 0.263 ...
##  $ g: Factor w/ 6 levels "a","b","c","d",..: 4 5 6 5 4 2 1 6 5 1 ...

`summary`

summary(my_data)

##        x                  y                 z             g      
##  Min.   :-3.41936   Min.   :-3.0830   Min.   :0.0008453   a:170  
##  1st Qu.:-0.64878   1st Qu.:-0.2858   1st Qu.:0.2352761   b:160  
##  Median : 0.05486   Median : 0.4243   Median :0.4825389   c:160  
##  Mean   : 0.03154   Mean   : 0.4274   Mean   :0.4853983   d:171  
##  3rd Qu.: 0.73181   3rd Qu.: 1.1314   3rd Qu.:0.7299549   e:156  
##  Max.   : 3.74557   Max.   : 3.7938   Max.   :0.9987806   f:183

`pairs`

pairs(my_data)

`range`

Suppose you would like to find the range of a variable in your data.frame. The following will not work:

range(my_data)

## Error in FUN(X[[i]], ...): only defined on a data frame with all numeric variables

Right, we cannot take the range of the entire data! Some of the variables are not numeric, and therefore do not have a range…

`range`

Instead, we need to access one variable. There are two ways to do this.

We can subset using square parenthesis, or
We can use the $ operator and access the variable of interest by name

range(my_data[,1])

## [1] -3.419364  3.745572

range(my_data$x)

## [1] -3.419364  3.745572

Plots and graphs

Introduction

Plots are one of the great strenghts f R.
There are two main frameworks for plotting
Base R graphics
ggplot
Which should you use? It is a matter of preference!

Base R plots

The basic plotting syntax is very simple. plot(x_var, y_var) will give you a scatter plot:

plot(my_data$x, my_data$y)

Hmm, let's work on that.

Base R plots

The plot function takes a number of arguments (?plot for a full list). The fewer you specify, the uglier your plot:

plot(x = my_data$x, y = my_data$y, 
     xlab = "X variable", 
     ylab = "Y variable", 
     main = "Awesome plot title",
     pch = 19, # Solid points
     cex = 0.5, # Smaller points
     bty = "n", # Remove surrounding box
     col = as.factor(my_data$g) # Colour by grouping variable
     )

Base R plots

The default behaviour of plot() depends on the type of input variables for the x and y arguments. If x is a factor variable, and y is numeric, then R will produce a boxplot:

plot(x = my_data$g, y = my_data$x)

Base R plots

`ggplot`

Also popular is the ggplot2 library. This is a separate package (i.e. it is not a part of the base R environment) but is very widely used. But not by me.

Based on the Grammar of Graphics data visualisation scheme
Graphs are broken into scales and layers
Has slightly idiosyncratic language style!

`ggplot`

Let's recreate the previous scatter plot using ggplot:

library(ggplot2)
ggplot(data = my_data, aes(x= x, y= y, col = g)) + 
    geom_point() +
    xlab("X variable")+
    ylab("Y variable")+
    ggtitle("ggplot title")+
    theme(panel.background = element_rect("white"))

`ggplot`

One nice feature of ggplot is that it is very easy to create facet plots:

library(ggplot2)
ggplot(data = my_data, aes(x= x, y= y, col = g)) + 
    geom_point() +
    xlab("X variable")+
    ylab("Y variable")+
    ggtitle("ggplot facet title")+
    theme(panel.background = element_rect("white"))+
    facet_wrap(~ g)

`ggplot`

Other examples of pretty R graphics

Exercise Two

Linear Regression Models

`lm`

Linear regression models in R are implemented using the lm function.

my_lm <- lm(formula = y ~ x, data = my_data)

The formula argument is the specification of the model, and the data argument is the data on which you would like the model to be estimated.

print(my_lm)

## 
## Call:
## lm(formula = y ~ x, data = my_data)
## 
## Coefficients:
## (Intercept)            x  
##      0.4178       0.3046

`lm`

We can specify multivariate models:

my_lm_multi <- lm(formula = y ~ x + z, data = my_data)

Interaction models:

my_lm_interact <- lm(formula = y ~ x * z, data = my_data)

Fixed-effect models:

my_lm_fe <- lm(formula = y ~ x + g, data = my_data)

And many more!

`lm`

The output of the lm function is a long list of interesting output.

When we call print(saved_model), we are presented with the estimated coefficients, and nothing else.

For some more information of the estimated model, use summary(saved_model):

my_lm_summary <- summary(my_lm)
print(my_lm_summary)

`lm`

## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.6246 -0.6960 -0.0202  0.7016  3.2367 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.41777    0.03209  13.020   <2e-16 ***
## x            0.30463    0.03238   9.407   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.014 on 998 degrees of freedom
## Multiple R-squared:  0.08145,    Adjusted R-squared:  0.08053 
## F-statistic:  88.5 on 1 and 998 DF,  p-value: < 2.2e-16

`lm`

As with any other function, summary(saved_model) returns an object. Here, it is a list. What is saved as the output of this function?

names(my_lm_summary)

##  [1] "call"          "terms"         "residuals"     "coefficients" 
##  [5] "aliased"       "sigma"         "df"            "r.squared"    
##  [9] "adj.r.squared" "fstatistic"    "cov.unscaled"

If we want to extract other information of interest from the fitted model object, we can use the $ operator to do so:

print(my_lm_summary$r.squared)

## [1] 0.08145418

`lm`

Accessing elements from saved models can be very helpful in making comparisons across models:

my_lm_r2 <- summary(my_lm)$r.squared
my_lm_multi_r2 <- summary(my_lm_multi)$r.squared
my_lm_interact_r2 <- summary(my_lm_interact)$r.squared

r2.compare <- data.frame(
                  model = c("bivariate", "multivariate", "interaction"), 
                  r.squared = c(my_lm_r2, 
                         my_lm_multi_r2, 
                         my_lm_interact_r2))

`lm`

We can print the values:

print(r2.compare)

##          model  r.squared
## 1    bivariate 0.08145418
## 2 multivariate 0.08552710
## 3  interaction 0.09811655

Or we can plot them:

ggplot(r2.compare, aes(x = model, y = r.squared))+
  geom_point(size = 4)+
  ggtitle("R^2 Comparison")

`lm`

`lm` diagnostics

There are a number of functions that are helpful in producing model diagnostics:

residuals(saved_model) extracts the residuals from a fitted model
coefficients(saved_model) extracts coefficients
fitted(saved_model) extracts fitted values
plot(saved_model) is a convenience function for producing a number of useful diagnostics plots

`lm` residual plot

We can easily plot the residuals from a fitted model against an explanatory variable of interest:

par(mfrow = c(1,2)) # Divide the plotting region into 1 row and 2 cols
plot(x = my_data$x, y = residuals(my_lm_multi), 
     xlab = "x", ylab = "residuals",
     pch = 19, # Filled circles
     col="grey", # Change colour
     cex = 0.5) # Make point size smaller
abline(h = 0) # Add a horizontal line with y-intercept of 0
plot(x = my_data$z, y = residuals(my_lm_multi), 
     xlab = "z", ylab = "residuals",
     pch = 19, # Filled circles
     col="grey", # Change colour
     cex = 0.5) # Make point size smaller
abline(h = 0) # Add a horizontal line with y-intercept of 0

`lm` residual plot

Non-Linear Regression Models

`glm`

To estimate a range of non-linear models, the glm function is particularly helpful.

First, let us transform our outcome variable from a continuous measure to a binary measure:

my_data$y_bin <- as.numeric(my_data$y > median(my_data$y))
table(my_data$y_bin)

## 
##   0   1 
## 500 500

`glm`

Now we can estimate our model:

my_logit <- glm(formula = y_bin ~ x + z, data = my_data, family = "binomial")

Where:

formula is the model specification
data is the data
family is a description of the error distribution and link function to be used
- binomial, poisson, gaussian etc…

`glm`

summary(my_logit)

## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -1.77744  -1.13471  -0.03837   1.14638   1.85331  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -0.17079    0.12745  -1.340     0.18    
## x            0.40330    0.06793   5.937  2.9e-09 ***
## z            0.32624    0.22678   1.439     0.15    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 1386.3  on 999  degrees of freedom
## Residual deviance: 1346.9  on 997  degrees of freedom
## AIC: 1352.9
## 
## Number of Fisher Scoring iterations: 4

`glm`

OK, but no-one actually thinks in terms of log-odds, so let's translate that into something more meaningful.

my_logit_OR <- exp(cbind(OR = coef(my_logit), confint(my_logit)))

coef extracts the coefficients
confint extracts the confidence intervals
cbind binds the vectors together as separate columns
exp exponentiates the log-odds ratios

round(my_logit_OR, digits = 4)

##                 OR  2.5 % 97.5 %
## (Intercept) 0.8430 0.6563 1.0819
## x           1.4968 1.3122 1.7129
## z           1.3858 0.8889 2.1636

`glm`

Almost all of the convenience functions that we used for lm are also applicable to glm models:

summary(my_logit)
plot(my_logit)
residuals(my_logit)
coefficients(my_logit)
fitted(my_logit)

Other models

There are a number of external packages that can make fitting other model types (relatively) straightforward:

lmer4 - Linear, generalised linear, and nonlinear mixed models
mcgv - generalised additive models
survival - survival analysis
glmnet - lasso and elastic net regression models
randomForest - random forest models from machine learning
rjags and rstan - Bayesian models

To use a package that is not a part of the base R installation, use:

install.packages("survival")
library(survival)

`predict`

We can retreive the fitted values from the model using fitted(), but we may be interested in calculating predicted values for arbitrary levels of our covariates.

sim_data <- data.frame(x = c(0, 1))
y_hat <- predict(object = my_lm, newdata = sim_data)
y_hat

##         1         2 
## 0.4177653 0.7223981

Here, I am creating a data.frame with two observations of one variable (x).

I am then using the predict function, where

object = my_lm tells R the model object for which prediction is desired
newdata = sim_data tells R the values for which I would like predictions

`predict`

We can use the same syntax to retreive predictions for (marginally) more interesting models:

sim_data <- data.frame(x = c(0, 0, 1, 1), z = c(0, 1, 0, 1))
sim_data

##   x z
## 1 0 0
## 2 0 1
## 3 1 0
## 4 1 1

y_hat <- predict(my_lm_multi, newdata = sim_data)
y_hat

##         1         2         3         4 
## 0.3027153 0.5397412 0.6072847 0.8443107

`predict`

This can be especially useful when trying to visualise interactive models:

sim_data_z0 <- data.frame(x = seq(from = -2, to = 2, by = 0.01), z = 0)
sim_data_z1 <- data.frame(x = seq(from = -2, to = 2, by = 0.01), z = 1)
y_hat_z0 <- predict(my_lm_interact, newdata = sim_data_z0)
y_hat_z1 <- predict(my_lm_interact, newdata = sim_data_z1)

seq generates a regular sequences from one value to another value by given increments
I am creating two data.frames for prediction, in both cases varying the value of x, but first setting z to 0, and then setting z to 1

`predict`

# Create a plot of the data
plot(my_data$x, my_data$y, cex = 0.5, pch = 19, 
     col = "gray", bty = "n",
     xlab = "X", ylab = "Y", 
     main = "Fitted values for sim_data")

# Add a prediction line for z = 0
lines(x = sim_data_z0$x, y = y_hat_z0, lwd = 2)

# Add a prediction line for z = 1
lines(x = sim_data_z1$x, y = y_hat_z1, lwd = 2, col = "red")

# Add a legend
legend("bottomright", legend = c("z = 0", "z = 1"), col = c("black", "red"), lty = 1)

`predict`

Can we just use `margins`?

Yes! Thomas Leeper has created a nice port of STATA's margins command:

library(margins)

cplot(my_lm_interact, x = "x", dx = "z", what = "effect")

cplot is equivalent to marginsplot in STATA.

The specification above tell R that we want to plot the marginal effect of z on y across the range of x.

Can we just use `margins`?

Thank you!

Any questions?

Other helpful functions

objects() & ls() # Which objects are currently loaded in my working environment?
rm() # Remove objects from my current environment
save() # Save R object(s) to disk
is.na() # Is this object a missing value? Or, which elements in this vector are missing?
rnorm() # Generate random numbers from a normal distribution
runif() # Generate random numbers from a uniform distribution

Other helpful packages

library(data.table)
library(dplyr)
library(plyr)
library(zoo)
library(reshape2)
library(shiny)

R basics

Materials

I have used STATA for years, why learn R?

R and RStudio

RStudio demonstration

Basic operations in R

Basic operations in R

Basic operations in R

R objects

Data Structures

vector - the basic building block of R.

vector - the basic building block of R.

vector - the basic building block of R.

vector subsetting

vector subsetting

vector operations

vector operations

matrix

data.frame

matrix and data.frame subsetting

matrix and data.frame subsetting

list

functions

function example

function example

User defined functions

Help!

Exercise One

Data manipulation and exploration

Reading data into R (.csv)

Reading data into R (.dta)

Descriptive statistics functions

head

str

summary

pairs

range

range

Plots and graphs

Introduction

Base R plots

Base R plots

Base R plots

Base R plots

Base R plots

ggplot

ggplot

ggplot

ggplot

ggplot

Other examples of pretty R graphics

Other examples of pretty R graphics

Exercise Two

Linear Regression Models

lm

lm

lm

lm

lm

lm

lm

lm

lm diagnostics

lm residual plot

lm residual plot

Non-Linear Regression Models

glm

glm

glm

glm

glm

Other models

predict

predict

predict

predict

predict

Can we just use margins?

Can we just use margins?

Thank you!

R `objects`

`vector` - the basic building block of R.

`vector` - the basic building block of R.

`vector` - the basic building block of R.

`vector` subsetting

`vector` subsetting

`vector` operations

`vector` operations

`matrix`

`data.frame`

`matrix` and `data.frame` subsetting

`matrix` and `data.frame` subsetting

`list`

`functions`

`function` example

`function` example

User defined `functions`

`head`

`str`

`summary`

`pairs`

`range`

`range`

`ggplot`

`ggplot`

`ggplot`

`ggplot`

`ggplot`

`lm`

`lm`

`lm`

`lm`

`lm`

`lm`

`lm`

`lm`

`lm` diagnostics

`lm` residual plot

`lm` residual plot

`glm`

`glm`

`glm`

`glm`

`glm`

`predict`

`predict`

`predict`

`predict`

`predict`

Can we just use `margins`?

Can we just use `margins`?