Version: 10 November 2020

- arrow keys: move through slides
- f: toggle full-screen

Extending R

libraries, packages, and repositories

  • A “fresh” R installation already contains hundreds of functions.
  • Functions are organized in libraries.
  • Libraries address a certain topic or area (e.g., graphics, a specific statstical method)
  • A package is a ‘container’ for distributing and sharing libraries.
  • You can add additional packages to extend you R installation.
  • Additional packages are provided in repositories or in seperate files.
  • Repositories are online data storages.
  • The most important repository for R is CRAN (The Comprehensive R Archive Network)

Installing and activating new packages

  • You find a list of all CRAN packages on
    https://cran.r-project.org/ Packages
  • You get an overview of all functions within a package with the help() or short ? function.
  • You can directly install a package from CRAN with the install.packages() function.

After successful installation, add on packages have to be activated and loaded into memory in each R sesssion with the library() function.
Note: You only install once, but you use library() each time you restart R or R Studio.

Task

Install the packages psych and tidyverse.
Activate both packages with
library(tidyverse)
library(psych)

install.packages(c("psych", "tidyverse"))

R Studio projects

R-Studio projects

As soon as you have more tham one source file and/or external data, it makes sense to start a project instead of just using single source files.

  • A project is a feature of R Studio, not of R.
  • A project always hosted in a folder on your harddrive.
  • All scripts, data, and other files are stored in that folder.
  • When later opening a project, the working directory is directly set to the folder location.
Working directory:

The place on your harddrive R will save and load data from by default (i.e. when no other place is explicitly set). Use the getwd() and setwd() functions to get and set the working directory.

Starting an R-Studio project

You can start a project from R studio through:

  1. File New Project …
  2. Now choose whether you already have a folder you like to start a project in or you create a new empty folder for an R project.
  3. Choose New R Project as the project type.
  4. Choose a directory name and start the project.

Task

  • Create a new R project with a name of your choice (e.g. ‘R_course’).
  • Copy all your R scripts related to this R course into that new project folder.
  • Close and reopen R Studio
  • Open the project through:
    • File Recent projects
    • Or the project menue in the uper right corner of R Studio

External data

Importing a data set from Excel

  • The read_excel() function from the readxl package (included in tidyverse) is used to import files created by Microsoft Excel.
  • Alternatively: R Studio provided an easy way to import data:
    File Import Dataset From Excel
  • But if you want to have a full script that runs by itself, I recommend to use the R functions.
  • Store your data within your R project folder.
  • If you do not install it there, you need to know the folder location to load it into R.

Example

library(readxl)
dat <- read_xlsx("cars.xlsx")
names(dat) # this function shows the variable names of a data frame 
##  [1] "mpg"  "cyl"  "disp" "hp"   "drat" "wt"   "qsec" "vs"   "am"   "gear"
## [11] "carb"
mpg cyl disp hp drat wt qsec vs am gear carb
21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
22.8 4 108 93 3.85 2.320 18.61 1 1 4 1

Task

  • Download the Excel file “cars.xlsx” from the moodle course.
  • Save it in your project directory.
  • Import the data set and assign it to an object dat.
  • Apply the View() function to see the dataset.

Note: View() opens a new tab in RStudio with the content of a data frame (e.g. View(dat)).

Task

  • Calculate the mean of mpg (miles per gallon) for cars with 4, 6, and 8 cylinders (variable cly).
Please stop the video here! Continue after completing the task!

Task - solution

  • Calculate the mean of mpg (miles per gallon) for cars with 4, 6, and 8 cylinders (variable cly).
dat_4 <- dat[dat[["cyl"]] == 4, "mpg"]
dat_6 <- dat[dat[["cyl"]] == 6, "mpg"]
dat_8 <- dat[dat[["cyl"]] == 8, "mpg"]
mean(dat_4)
mean(dat_6)
mean(dat_8)
## [1] 26.66364
## [1] 19.74286
## [1] 15.1