DATA 220G — Week 1 - 02

R Refresher

Data Types

Vectors

A data structure that holds elements of the same “mode” or “type”.

"A single element character vector"
## [1] "A single element character vector"
class("A single element character vector")
## [1] "character"
mode("A single element character vector")
## [1] "character"
typeof("A single element character vector")
## [1] "character"
35L
## [1] 35
class(35L)
## [1] "integer"
mode(35L)
## [1] "numeric"
typeof(35L)
## [1] "integer"
41
## [1] 41
class(41)
## [1] "numeric"
mode(41)
## [1] "numeric"
typeof(41)
## [1] "double"
41.0
## [1] 41
class(41.0)
## [1] "numeric"
TRUE
## [1] TRUE
class(TRUE)
## [1] "logical"
mode(TRUE)
## [1] "logical"
typeof(TRUE)
## [1] "logical"
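A minimal sketch of how the integer and double cases above relate, assuming a plain R session. The variable names are illustrative and not from the slides. The point is that both report "numeric" as their mode even though their storage types differ.

```r
# Illustrative names, not from the slides
int_value <- 35L
dbl_value <- 35

# Both count as numeric
both_numeric <- is.numeric(int_value) && is.numeric(dbl_value)

# Only the L-suffixed literal is stored as an integer
int_is_integer <- is.integer(int_value)
dbl_is_integer <- is.integer(dbl_value)

# == compares values; identical() also compares the storage type
same_value <- int_value == dbl_value
same_object <- identical(int_value, dbl_value)

# as.integer() converts explicitly, dropping the fractional part
converted <- as.integer(41.9)
```

Printing `same_value` and `same_object` side by side is a quick way to see why `typeof()` matters even when two values look the same at the console.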

Going beyond length-one vectors

c("This", "is", "a", "test")
## [1] "This" "is"   "a"    "test"
class(c("This", "is", "a", "test"))
## [1] "character"
length(c("This", "is", "a", "test"))
## [1] 4
c(35L, 10L, 20L, 11L)
## [1] 35 10 20 11
class(c(35L, 10L, 20L, 11L))
## [1] "integer"
length(c(35L, 10L, 20L, 11L))
## [1] 4
class(c(35L, 10L, 20, 11L))
## [1] "numeric"
c(TRUE, FALSE, 1, "twelve")
## [1] "TRUE"   "FALSE"  "1"      "twelve"
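The last two results above come from coercion: when c() is given mixed types, it converts everything to the most flexible one. A short sketch of that ordering (logical, then integer, then double, then character), using made-up variable names:

```r
# Illustrative names, not from the slides
lgl_int <- c(TRUE, 2L)    # logical is promoted to integer
int_dbl <- c(2L, 3.5)     # integer is promoted to double
dbl_chr <- c(3.5, "x")    # double is converted to character

types <- c(typeof(lgl_int), typeof(int_dbl), typeof(dbl_chr))
types
```

Once a value has been converted to character, the original type is gone: `dbl_chr[1]` is the string "3.5", not the number 3.5.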

Keeping things around for a while

a_good_variable_name <- c("This", "is", "a", "test")

class(a_good_variable_name)
## [1] "character"
length(a_good_variable_name)
## [1] 4

A better variable name:

ages <- c(35L, 10L, 20L, 11L)

class(ages)
## [1] "integer"
length(ages)
## [1] 4

Indexing vectors

ages[1]
## [1] 35
ages[3]
## [1] 20
ages[c(1, 2)]
## [1] 35 10
ages[c(3, 1)]
## [1] 20 35
ages[1:3]
## [1] 35 10 20
ages[4:1]
## [1] 11 20 10 35
ages[c(4, 3, 2, 1)]
## [1] 11 20 10 35
ages[-1]
## [1] 10 20 11
ages[100]
## [1] NA
ages[0.1]
## integer(0)
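A few more edge cases in the same spirit, offered as a sketch rather than a complete list. The variable names other than ages are illustrative. The exact wording of the error message depends on the R version, so the code only captures it as text.

```r
ages <- c(35L, 10L, 20L, 11L)

# Negative indices drop elements; several can be dropped at once
without_first_two <- ages[-c(1, 2)]

# Fractional indices are truncated toward zero, so 2.9 acts like 2
truncated <- ages[2.9]

# An index of 0 selects nothing
nothing <- ages[0]

# Indexing past the end yields NA for the missing positions
past_end <- ages[c(4, 5)]

# Positive and negative indices cannot be mixed; capture the error text
mixed_error <- tryCatch(ages[c(-1, 2)], error = function(e) conditionMessage(e))
```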

Other operations on vectors

1 + 1
## [1] 2
3 * 2
## [1] 6
c(1, 2) + c(5, 6)
## [1] 6 8
c(1, 2) * 2
## [1] 2 4
c(1, 2) ^ 10
## [1]    1 1024
c(1, 2) - 1
## [1] 0 1
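The examples with a single number on one side work because R recycles the shorter vector to match the longer one. A small sketch of that rule, with illustrative variable names. The warning is captured as text so the script keeps running.

```r
# Length 4 and length 2: the shorter vector is reused twice
even_recycle <- c(1, 2, 3, 4) + c(10, 20)

# Length 3 and length 2: R still computes a result but warns,
# because 3 is not a multiple of 2
uneven_warning <- tryCatch(
  c(1, 2, 3) + c(10, 20),
  warning = function(w) conditionMessage(w)
)

even_recycle
uneven_warning
```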

Not just numbers

nchar(a_good_variable_name)
## [1] 4 2 1 4
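nchar() is one of many base R functions that work element by element on character vectors. A brief sketch with a few others, using an illustrative vector named words:

```r
# Illustrative name, same contents as a_good_variable_name
words <- c("This", "is", "a", "test")

upper <- toupper(words)              # one result per element
first_letters <- substr(words, 1, 1) # first character of each element
sentence <- paste(words, collapse = " ") # collapse into a single string

upper
first_letters
sentence
```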

Not just “math”

ages < 100
## [1] TRUE TRUE TRUE TRUE
ages > 10
## [1]  TRUE FALSE  TRUE  TRUE
ages == 10 | ages == 35
## [1]  TRUE  TRUE FALSE FALSE
ages == 10 || ages == 35 # R < 4.3.0 compares only the first elements; R >= 4.3.0 raises an error
## [1] TRUE
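The single-bar and double-bar operators answer different questions: | works element by element, while || is meant for length-one conditions. When the goal is one TRUE or FALSE for a whole vector, any() and all() are the usual tools. A sketch with illustrative variable names:

```r
ages <- c(35L, 10L, 20L, 11L)

# Element-wise: one result per element
elementwise <- ages == 10 | ages == 35

# Reduce a logical vector to a single value
any_match <- any(elementwise)
all_match <- all(elementwise)

# || is safe when both sides have length one
first_matches <- ages[1] == 10 || ages[1] == 35
```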

Indexing with logical vector results

ages[ages == 10 | ages == 35]
## [1] 35 10
ages[ages > 100]
## integer(0)
ages[TRUE]
## [1] 35 10 20 11
ages[c(TRUE, FALSE)]
## [1] 35 20
ages[c(TRUE, FALSE, TRUE)]
## [1] 35 20 11
ages[c(TRUE, FALSE, TRUE, TRUE)]
## [1] 35 20 11
ages[c(TRUE, TRUE)]
## [1] 35 10 20 11
ages[FALSE]
## integer(0)
ages[which(ages < 30)]
## [1] 10 20 11
which(ages < 30)
## [1] 2 3 4
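Two details are worth seeing side by side: short logical indices are recycled, and which() behaves differently from a raw logical index when NA is present. The vector ages_na below is a made-up example, not data from the slides.

```r
ages <- c(35L, 10L, 20L, 11L)

# A short logical index is recycled to the length of the vector
short_index <- ages[c(TRUE, FALSE)]
full_index <- ages[c(TRUE, FALSE, TRUE, FALSE)]

# Hypothetical vector with a missing value
ages_na <- c(35L, NA, 20L, 11L)

# An NA in the logical index produces an NA in the result
by_logical <- ages_na[ages_na < 30]

# which() returns only the positions that are TRUE, so the NA is dropped
by_which <- ages_na[which(ages_na < 30)]
```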

Growing vectors

small_vector <- c(11, 12, 13, 14)

small_vector
## [1] 11 12 13 14
small_vector <- c(small_vector, c(4, 3, 2, 1))

small_vector
## [1] 11 12 13 14  4  3  2  1
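c() always adds to one end. A sketch of two other ways a vector can grow, using append() (listed in the help topics) and assignment past the current length. The names inserted, at_front, and padded are illustrative.

```r
small_vector <- c(11, 12, 13, 14)

# append() can insert at a position; after = 2 means "after the 2nd element"
inserted <- append(small_vector, 99, after = 2)

# after = 0 puts the new values at the front
at_front <- append(small_vector, 0, after = 0)

# Assigning past the end grows the vector and pads the gap with NA
padded <- small_vector
padded[6] <- 1
```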

More info

Read these. A lot. You’ll occasionally get spot quizzes on some esoteric edge cases.

help("vector")
help("character")
help("integer")
help("numeric")
help("double")
help("logical")

help("c")
help("append")

help("mode")
help("typeof")

help("NA")

help("Arithmetic")
help("Syntax")

help("::")

?base::Logic

help("[") # Really read this one

On your own

You won’t excel in “Data Science” without a curious nature. Experiment with operations like the ones above on these and other vector types. Try to generate error messages, and then figure out why each error occurred.