class: center, middle, inverse, title-slide

.title[
# Vectors and iteration
]
.author[
### INFO 5940
Cornell University
]

---

<img src="https://r4ds.had.co.nz/diagrams/data-structures-overview.png" width="60%" style="display: block; margin: auto;" />

---
class: inverse, middle

# Atomic vectors

---

## Logical vectors

```r
parse_logical(c("TRUE", "TRUE", "FALSE", "TRUE", "NA"))
## [1]  TRUE  TRUE FALSE  TRUE    NA
```

--

## Numeric vectors

```r
parse_integer(c("1", "5", "3", "4", "12423"))
## [1]     1     5     3     4 12423
parse_double(c("4.2", "4", "6", "53.2"))
## [1]  4.2  4.0  6.0 53.2
```

--

## Character vectors

```r
parse_character(c("Goodnight Moon", "Runaway Bunny", "Big Red Barn"))
## [1] "Goodnight Moon" "Runaway Bunny"  "Big Red Barn"
```

---

## Scalars

```r
(x <- sample(10))
```

```
##  [1] 10  6  5  4  1  8  2  7  9  3
```

```r
x + c(100, 100, 100, 100, 100, 100, 100, 100, 100, 100)
```

```
##  [1] 110 106 105 104 101 108 102 107 109 103
```

```r
x + 100
```

```
##  [1] 110 106 105 104 101 108 102 107 109 103
```

---

## Vector recycling

```r
# create two sequences of numbers: 1 to 2 and 1 to 10
(x1 <- seq(from = 1, to = 2))
```

```
## [1] 1 2
```

```r
(x2 <- seq(from = 1, to = 10))
```

```
##  [1]  1  2  3  4  5  6  7  8  9 10
```

```r
# add together two sequences of numbers
x1 + x2
```

```
##  [1]  2  4  4  6  6  8  8 10 10 12
```

---

## Subsetting vectors

```r
x <- c("one", "two", "three", "four", "five")
```

* With positive integers

```r
x[c(3, 2, 5)]
## [1] "three" "two"   "five"
```

* With negative integers

```r
x[c(-1, -3, -5)]
## [1] "two"  "four"
```

* Don't mix positive and negative

```r
x[c(-1, 1)]
## Error in x[c(-1, 1)]: only 0's may be mixed with negative subscripts
```

---

## Subset with a logical vector

```r
(x <- c(10, 3, NA, 5, 8, 1, NA))
```

```
## [1] 10  3 NA  5  8  1 NA
```

```r
# All non-missing values of x
!is.na(x)
```

```
## [1]  TRUE  TRUE FALSE  TRUE  TRUE  TRUE FALSE
```

```r
x[!is.na(x)]
```

```
## [1] 10  3  5  8  1
```

```r
# All even (or missing!) values of x
x[x %% 2 == 0]
```

```
## [1] 10 NA  8 NA
```

---
class: inverse, middle

# Lists

---

## Lists

```r
x <- list(1, 2, 3)
x
```

```
## [[1]]
## [1] 1
## 
## [[2]]
## [1] 2
## 
## [[3]]
## [1] 3
```

---

## Lists: `str()`

```r
str(x)
```

```
## List of 3
##  $ : num 1
##  $ : num 2
##  $ : num 3
```

```r
x_named <- list(a = 1, b = 2, c = 3)
str(x_named)
```

```
## List of 3
##  $ a: num 1
##  $ b: num 2
##  $ c: num 3
```

---

## Store a mix of objects

```r
y <- list("a", 1L, 1.5, TRUE)
str(y)
```

```
## List of 4
##  $ : chr "a"
##  $ : int 1
##  $ : num 1.5
##  $ : logi TRUE
```

---

<img src="../../../../../../../../img/xzibit-lists.jpg" width="80%" style="display: block; margin: auto;" />

---

## Nested lists

```r
z <- list(list(1, 2), list(3, 4))
str(z)
```

```
## List of 2
##  $ :List of 2
##   ..$ : num 1
##   ..$ : num 2
##  $ :List of 2
##   ..$ : num 3
##   ..$ : num 4
```

---

## Secret lists

```r
str(gun_deaths)
```

```
## spec_tbl_df [100,798 × 10] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ id       : num [1:100798] 1 2 3 4 5 6 7 8 9 10 ...
##  $ year     : num [1:100798] 2012 2012 2012 2012 2012 ...
##  $ month    : chr [1:100798] "Jan" "Jan" "Jan" "Feb" ...
##  $ intent   : chr [1:100798] "Suicide" "Suicide" "Suicide" "Suicide" ...
##  $ police   : num [1:100798] 0 0 0 0 0 0 0 0 0 0 ...
##  $ sex      : chr [1:100798] "M" "F" "M" "M" ...
##  $ age      : num [1:100798] 34 21 60 64 31 17 48 41 50 NA ...
##  $ race     : chr [1:100798] "Asian/Pacific Islander" "White" "White" "White" ...
##  $ place    : chr [1:100798] "Home" "Street" "Other specified" "Home" ...
##  $ education: Factor w/ 4 levels "Less than HS",..: 4 3 4 4 2 1 2 2 3 NA ...
```

---

<img src="https://r4ds.had.co.nz/diagrams/lists-subsetting.png" width="60%" style="display: block; margin: auto;" />

---

## Exercise on subsetting vectors

<img src="https://media.giphy.com/media/uLUgjrzvQPXV5sTZeY/giphy.gif" width="50%" style="display: block; margin: auto;" />
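---

## Sketch: subsetting without the stray `NA`s

The logical-subsetting slide shows that `x[x %% 2 == 0]` keeps missing values, because the comparison itself evaluates to `NA` wherever `x` is `NA`. The sketch below is one possible way to handle that, plus a reminder of how `[`, `[[`, and `$` differ on lists. It uses base R only; the name `even_only` is made up for illustration and is not part of the original slides.

```r
# Same values as the logical-subsetting slide
x <- c(10, 3, NA, 5, 8, 1, NA)

# The comparison is NA wherever x is NA, and subsetting with NA returns NA
x[x %% 2 == 0]

# Option 1: combine the test with !is.na() so missing values become FALSE
even_only <- x[!is.na(x) & x %% 2 == 0]
even_only

# Option 2: which() returns only the positions that are TRUE, dropping NA
x[which(x %% 2 == 0)]

# Lists: `[` keeps the list wrapper, `[[` and `$` extract the element itself
x_named <- list(a = 1, b = 2, c = 3)
str(x_named["a"])    # a list of length 1
str(x_named[["a"]])  # a single number
x_named$a            # same element as x_named[["a"]]
```

Both options return the even, non-missing values; which one reads better is a matter of taste.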
---
class: inverse, middle

# Iteration

---

## Iteration

```r
df <- tibble(
  a = rnorm(10),
  b = rnorm(10),
  c = rnorm(10),
  d = rnorm(10)
)
```

```r
median(df$a)
## [1] 0.1642894
median(df$b)
## [1] 0.01641118
median(df$c)
## [1] 0.2734794
median(df$d)
## [1] -0.639297
```

---

## Iteration with `for` loop

```r
output <- vector(mode = "double", length = ncol(df))
for (i in seq_along(df)) {
  output[[i]] <- median(df[[i]])
}
output
```

```
## [1]  0.16428940  0.01641118  0.27347942 -0.63929695
```

---

## Output

```r
output <- vector(mode = "double", length = ncol(df))
```

```r
vector(mode = "double", length = ncol(df))
## [1] 0 0 0 0
vector(mode = "logical", length = ncol(df))
## [1] FALSE FALSE FALSE FALSE
vector(mode = "character", length = ncol(df))
## [1] "" "" "" ""
vector(mode = "list", length = ncol(df))
## [[1]]
## NULL
## 
## [[2]]
## NULL
## 
## [[3]]
## NULL
## 
## [[4]]
## NULL
```

---

## Sequence

```r
i in seq_along(df)
```

```r
seq_along(df)
```

```
## [1] 1 2 3 4
```

---

## Body

```r
output[[i]] <- median(df[[i]])
```

---

## Preallocation

.panelset[
.panel[.panel-name[Code]

```r
# no preallocation
mpg_no_preall <- tibble()

for(i in 1:100){
  mpg_no_preall <- bind_rows(mpg_no_preall, mpg)
}

# with preallocation using a list
mpg_preall <- vector(mode = "list", length = 100)

for(i in 1:100){
  mpg_preall[[i]] <- mpg
}

mpg_preall <- bind_rows(mpg_preall)
```
]

.panel[.panel-name[Plot]
<img src="index_files/figure-html/unnamed-chunk-6-1.png" width="70%" style="display: block; margin: auto;" />
]
]

---

## Map functions

* Why `for` loops are good
* Why `map()` functions may be better
* Types of `map()` functions
    * `map()` makes a list
    * `map_lgl()` makes a logical vector
    * `map_int()` makes an integer vector
    * `map_dbl()` makes a double vector
    * `map_chr()` makes a character vector

---

## Map functions

```r
map_dbl(df, mean)
```

```
##          a          b          c          d 
##  0.1694536 -0.1974360  0.3113976 -0.5095255
```

```r
map_dbl(df, median)
```

```
##           a           b           c           d 
##  0.16428940  0.01641118  0.27347942 -0.63929695
```

```r
map_dbl(df, sd)
```

```
##         a         b         c         d 
## 0.5311992 1.0300788 0.8834578 1.0414939
```

---

## Map functions

```r
map_dbl(df, mean, na.rm = TRUE)
```

```
##          a          b          c          d 
##  0.1694536 -0.1974360  0.3113976 -0.5095255
```

--

```r
df %>%
  map_dbl(mean, na.rm = TRUE)
```

```
##          a          b          c          d 
##  0.1694536 -0.1974360  0.3113976 -0.5095255
```

---

## Exercise on writing iterative operations

<img src="https://media.giphy.com/media/DC2YXS4efT0R4wwXoY/giphy.gif" width="80%" style="display: block; margin: auto;" />
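---

## Sketch: `for` loop and `map_dbl()` side by side

The three parts of a `for` loop (output, sequence, body) and the `map_dbl()` shortcut can be compared directly. This is a minimal sketch, assuming the `tibble` and `purrr` packages are installed. The seed value and the names `mapped` and `col_types` are arbitrary choices for illustration, so the numbers will not match the earlier slides.

```r
library(tibble)
library(purrr)

set.seed(123)  # arbitrary seed, only so the sketch is reproducible

df <- tibble(
  a = rnorm(10),
  b = rnorm(10),
  c = rnorm(10),
  d = rnorm(10)
)

# 1. Output: preallocate a double vector with one slot per column
output <- vector(mode = "double", length = ncol(df))

# 2. Sequence: seq_along(df) yields one index per column
for (i in seq_along(df)) {
  # 3. Body: compute the median of column i and store it
  output[[i]] <- median(df[[i]])
}

# map_dbl() does the same work in one call and keeps the column names
mapped <- map_dbl(df, median)

# The values agree once the names are dropped
all.equal(output, unname(mapped))

# map_chr() returns a character vector, here the storage type of each column
col_types <- map_chr(df, typeof)
col_types
```

The loop makes each step explicit, while `map_dbl()` hides the bookkeeping and raises an error if any result is not a single double.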