Version: 10 November 2020

Sophisticated selections

Relational operators

Relational operators compare two values and return a logical value (TRUE or FALSE).

Operator  Relation                     Example
==        is equal to                  x == y
!=        is not equal to              x != y
>         is greater than              x > y
>=        is greater than or equal to  x >= y
<         is less than                 x < y
<=        is less than or equal to     x <= y

Examples

7 > 2
## [1] TRUE
7 <=  10
## [1] TRUE
5 == 4
## [1] FALSE
5 != 6
## [1] TRUE

Relational operators and characters

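For non-numerical objects such as character strings, use == and != (operators such as < also work on strings, but they compare by locale-dependent alphabetical order):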

"Hamster" == "Mouse"
## [1] FALSE
"Hamster" != "Mouse"
## [1] TRUE
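
Applied to a character vector, == and != work element by element, which helps to check which entries match a given label. A minimal sketch; the vector pets is a made-up example and not part of the original slides:

```r
# Hypothetical example vector (not from the slides)
pets <- c("Hamster", "Mouse", "Hamster")

# Each element is compared with "Hamster"
pets == "Hamster"
## [1]  TRUE FALSE  TRUE

pets != "Hamster"
## [1] FALSE  TRUE FALSE
```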

Relational operators and vectors

age <- c(12, 4, 3, 8, 4, 2, 1)
age < 5
## [1] FALSE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE
age  age < 5
 12  FALSE
  4  TRUE
  3  TRUE
  8  FALSE
  4  TRUE
  2  TRUE
  1  TRUE
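
The same element-by-element logic holds when both sides are vectors of equal length. A small sketch; before and after are hypothetical score vectors introduced only for illustration:

```r
# Hypothetical scores before and after a training (illustration only)
before <- c(3, 5, 2, 6)
after  <- c(4, 5, 1, 8)

# Element 1 is compared with element 1, element 2 with element 2, ...
after > before
## [1]  TRUE FALSE FALSE  TRUE

after == before
## [1] FALSE  TRUE FALSE FALSE
```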

Using logical vectors to select values

When you put a logical vector within square brackets [ ] after an object, all elements of that object with a TRUE in the logical vector are selected:

age <- c(12, 4, 3, 8)
x <- age > 5
x
## [1]  TRUE FALSE FALSE  TRUE
age[x]
## [1] 12  8

Using logical vectors to select values

age <- c(12, 4, 3, 8)
x <- age > 5
age[x]
age  x      Select?  Result
 12  TRUE   select   12
  4  FALSE  drop
  3  FALSE  drop
  8  TRUE   select   8
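
The intermediate vector x is optional: the comparison can also be written directly inside the square brackets. A brief sketch using the same age vector:

```r
age <- c(12, 4, 3, 8)

# Comparison written directly inside the brackets
age[age > 5]
## [1] 12  8

# A different condition selects different elements
age[age <= 4]
## [1] 4 3
```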

Task

Create a new vector friends <- c(4, 5, 6, 3, 7, 2, 3). Show all values of that vector that are >= 4.

Please stop the video here! Continue after completing the task!

Task - solution

Create a new vector friends <- c(4, 5, 6, 3, 7, 2, 3). Show all values of that vector that are >= 4.

friends <- c(4, 5, 6, 3, 7, 2, 3)
friends[friends >= 4]
## [1] 4 5 6 7

which()

The which() function gives the indices of the elements that are TRUE.
It takes a logical vector as an argument.

x <- c(TRUE, FALSE, FALSE, TRUE)
which(x)
## [1] 1 4
age <- c(12, 4, 3, 8)
x <- age < 5
x
## [1] FALSE  TRUE  TRUE FALSE
which(x)
## [1] 2 3

age <- c(12, 4, 3, 8)
x <- age < 5
x
which(x)
Index  age  x <- age < 5  which(x)
    1   12  FALSE
    2    4  TRUE          2
    3    3  TRUE          3
    4    8  FALSE
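
For vectors without missing values, selecting with which(x) and selecting with x itself give the same result. They differ once the vector contains NA, because which() only returns positions that are TRUE. A sketch under the assumption that one value is missing; age_na is a hypothetical vector, not part of the slides:

```r
# Hypothetical vector with one missing value (illustration only)
age_na <- c(12, NA, 3, 8)
x <- age_na < 5
x
## [1] FALSE    NA  TRUE FALSE

# Logical selection keeps the NA position as NA
age_na[x]
## [1] NA  3

# which() skips NA, so only the TRUE position is selected
which(x)
## [1] 3
age_na[which(x)]
## [1] 3
```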

Task

Create a vector x <- c(1, 4, 5, 3, 4, 5), then:
1. Identify which elements are greater than or equal to three.
2. Create a new vector from x containing all elements that are not four. Note: Use the which() function for this task.

Please stop the video here! Continue after completing the task!

Task - solution

Create a vector x <- c(1, 4, 5, 3, 4, 5), then:
1. Identify which elements are greater than or equal to three.
2. Create a new vector from x containing all elements that are not four. Note: Use the which() function for this task.

x <- c(1, 4, 5, 3, 4, 5)
which(x >= 3)
## [1] 2 3 4 5 6
y <- x[which(x != 4)]
y
## [1] 1 5 3 5

Selecting cases with logical vectors

Logical vectors can also be applied to data frames for selecting cases:

# Either directly:
study_no_sen <- study[study[["sen"]] == 0, ]
study_no_sen
  sen gender age IQ
1   0      M  12 90
3   0      F  11 90
5   0      F  11 99
# Or using the which() function
filter <- which(study[["sen"]] == 0)
study_no_sen <- study[filter, ]
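
The code above assumes that a data frame named study already exists. A minimal sketch that rebuilds it from the values shown on the later slide "Subsetting data frames with logical and relational operators"; the slides do not state how study was originally created, so the data.frame() call is an assumption:

```r
# Rebuild the example data (values copied from the study table in these slides;
# the construction via data.frame() is an assumption)
study <- data.frame(
  sen    = c(0, 1, 0, 1, 0, 1),
  gender = c("M", "M", "F", "M", "F", "F"),
  age    = c(12, 13, 11, 10, 11, 14),
  IQ     = c(90, 85, 90, 87, 99, 89)
)

# Rows where sen is 0
study_no_sen <- study[study[["sen"]] == 0, ]
study_no_sen[["IQ"]]
## [1] 90 90 99
```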

Task

Calculate the mean of IQ for students with and without sen.

Please stop the video here! Continue after completing the task!

Task - solution

Calculate the mean of IQ for students with and without sen.

filter <- which(study[["sen"]] == 0)
mean(study[["IQ"]][filter])
## [1] 93
filter <- which(study[["sen"]] == 1)
mean(study[["IQ"]][filter])
## [1] 87

Logical Operations

Logical operations are applied to logical values.

Operator  Operation  Example  Result
!         NOT        !x       TRUE when x is FALSE, and FALSE when x is TRUE
&         AND        x & y    TRUE when both x and y are TRUE, otherwise FALSE
|         OR         x | y    TRUE when x or y (or both) is TRUE, otherwise FALSE

Note: To get the | sign:
On a German Mac keyboard, press Option + 7
On a German Windows keyboard, press AltGr + <

Example

x <- TRUE
y <- FALSE
!x
## [1] FALSE
!y
## [1] TRUE
x & y
## [1] FALSE
x | y
## [1] TRUE

Logical operators with vectors

When applied to two vectors, logical operations result in a new vector.
Operations are applied to each element one by one.

x <- c(TRUE, FALSE, TRUE,  FALSE)
y <- c(TRUE, FALSE, FALSE, TRUE)
!x
## [1] FALSE  TRUE FALSE  TRUE
x & y
## [1]  TRUE FALSE FALSE FALSE
x | y
## [1]  TRUE FALSE  TRUE  TRUE

Task

Create two vectors:

glasses <- c(TRUE, TRUE, FALSE, TRUE, FALSE)
hyperintelligent <- c(TRUE, FALSE, FALSE, TRUE, FALSE)

Determine for each element whether ‘glasses’ and ‘hyperintelligent’ are TRUE at the same time.

Please stop the video here! Continue after completing the task!

Task - solution

Create two vectors:

glasses <- c(TRUE, TRUE, FALSE, TRUE, FALSE)
hyperintelligent <- c(TRUE, FALSE, FALSE, TRUE, FALSE)

Determine for each element whether ‘glasses’ and ‘hyperintelligent’ are TRUE at the same time.

glasses <- c(TRUE, TRUE, FALSE, TRUE, FALSE)
hyperintelligent <- c(TRUE, FALSE, FALSE, TRUE, FALSE)
glasses & hyperintelligent
## [1]  TRUE FALSE FALSE  TRUE FALSE

glasses  hyperintelligent  glasses & hyperintelligent
TRUE     TRUE              TRUE
TRUE     FALSE             FALSE
FALSE    FALSE             FALSE
TRUE     TRUE              TRUE
FALSE    FALSE             FALSE

Combining logical and relational operators

age <- c(12, 4, 3, 8, 4, 2, 1, 7, 4)
gender <- c(0, 1, 0, 1, 0, 0, 0, 0, 1)
age > 4
## [1]  TRUE FALSE FALSE  TRUE FALSE FALSE FALSE  TRUE FALSE
gender == 0
## [1]  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE FALSE
age > 4 & gender == 0
## [1]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE
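
The combined logical vector can be used for selection like any other. Relational operators are evaluated before & and |, so no parentheses are needed around the comparisons. A short sketch with the same age and gender vectors:

```r
age <- c(12, 4, 3, 8, 4, 2, 1, 7, 4)
gender <- c(0, 1, 0, 1, 0, 0, 0, 0, 1)

# Ages of cases that are older than 4 AND have gender 0
age[age > 4 & gender == 0]
## [1] 12  7

# Positions of these cases
which(age > 4 & gender == 0)
## [1] 1 8

# OR is TRUE when at least one of the two conditions holds
age > 4 | gender == 0
## [1]  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE
```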

Task

Create a vector
income <- c(5000, 4000, 3000, 2000, 1000) and a vector
happiness <- c(20, 35, 30, 10, 50).
Use relational and logical operations to determine for each element whether the income is larger than 2500 and at the same time happiness is above 25.

Please stop the video here! Continue after completing the task!

Task - solution

Create a vector
income <- c(5000, 4000, 3000, 2000, 1000) and a vector
happiness <- c(20, 35, 30, 10, 50).
Use relational and logical operations to determine for each element whether the income is larger than 2500 and at the same time happiness is above 25.

income <- c(5000, 4000, 3000, 2000, 1000)
happiness <- c(20, 35, 30, 10, 50)
income > 2500 & happiness > 25
income  happiness  income > 2500  happiness > 25  income > 2500 & happiness > 25
  5000         20  TRUE           FALSE           FALSE
  4000         35  TRUE           TRUE            TRUE
  3000         30  TRUE           TRUE            TRUE
  2000         10  FALSE          FALSE           FALSE
  1000         50  FALSE          TRUE            FALSE

Subsetting data frames with logical and relational operators

study
  sen gender age IQ
1   0      M  12 90
2   1      M  13 85
3   0      F  11 90
4   1      M  10 87
5   0      F  11 99
6   1      F  14 89

filter <- study[["sen"]] == 1 & study[["gender"]] == "M"
study[filter, ]
  sen gender age IQ
2   1      M  13 85
4   1      M  10 87
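
The same filter can be combined with a column selection or passed on to a summary function. A sketch that rebuilds study from the table above; the data.frame() call is an assumption, since the slides do not show how study was created:

```r
# Example data copied from the study table (construction is an assumption)
study <- data.frame(
  sen    = c(0, 1, 0, 1, 0, 1),
  gender = c("M", "M", "F", "M", "F", "F"),
  age    = c(12, 13, 11, 10, 11, 14),
  IQ     = c(90, 85, 90, 87, 99, 89)
)

filter <- study[["sen"]] == 1 & study[["gender"]] == "M"

# Keep only the age and IQ columns of the selected rows
study[filter, c("age", "IQ")]
##   age IQ
## 2  13 85
## 4  10 87

# Mean IQ of the selected cases
mean(study[["IQ"]][filter])
## [1] 86
```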

Task

Use the ChickWeight data frame for the following task.
The data set is already included in R.

  1. Look into the data set with ?ChickWeight.
  2. Get all variable names of the data frame with the names() function (names(ChickWeight)).
  3. Select cases from ChickWeight with Diet == 1 and Time < 16.
  4. For these cases, calculate the correlation between weight and Time. Note: Use the cor() function (e.g., cor(x, y)).
  5. Repeat steps 3 and 4 for Diet == 4.
  6. What can you see?

Please stop the video here! Continue after completing the task!

Task - solution

# Diet 1, measurements with Time below 16
filter <- ChickWeight[["Diet"]] == 1 & ChickWeight[["Time"]] < 16
diet1 <- ChickWeight[filter, ]
cor(diet1[["weight"]], diet1[["Time"]])
## [1] 0.8109772
# Diet 4, same time window
filter <- ChickWeight[["Diet"]] == 4 & ChickWeight[["Time"]] < 16
diet4 <- ChickWeight[filter, ]
cor(diet4[["weight"]], diet4[["Time"]])
## [1] 0.9720822
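
Because steps 3 and 4 are repeated with only the diet number changing, they could be wrapped in a small helper. This is a sketch only; the function name cor_weight_time and its arguments diet and max_time are hypothetical and not part of the original solution:

```r
# Hypothetical helper (not from the slides): correlation of weight and Time
# for one diet, restricted to measurements with Time below max_time
cor_weight_time <- function(diet, max_time = 16) {
  filter <- ChickWeight[["Diet"]] == diet & ChickWeight[["Time"]] < max_time
  cases <- ChickWeight[filter, ]
  cor(cases[["weight"]], cases[["Time"]])
}

cor_weight_time(1)
cor_weight_time(4)
```

With the default max_time of 16, the two calls apply the same selection and the same cor() call as the step-by-step solution.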