Packages

library(lobstr)
library(tidyverse)

Exercise 1

Problem

Can you diagnose what is going on below?

x <- 1:10
y <- x

tracemem(x)
#> [1] "<0x7fc64d46b368>"
c(obj_addr(x), obj_addr(y))
#> [1] "0x7fc64d46b368" "0x7fc64d46b368"
y[1] <- 3
#> tracemem[0x7fc64d46b368 -> 0x7fc64d4cb8e8]: eval eval withVisible withCallingHandlers handle timing_fn evaluate_call <Anonymous> evaluate in_dir block_exec call_block process_group.block process_group withCallingHandlers process_file <Anonymous> <Anonymous> 
#> tracemem[0x7fc64d4cb8e8 -> 0x7fc648a83458]: eval eval withVisible withCallingHandlers handle timing_fn evaluate_call <Anonymous> evaluate in_dir block_exec call_block process_group.block process_group withCallingHandlers process_file <Anonymous> <Anonymous>

Solution

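The question is why two copies are made. Initially x and y point to the same integer vector, so modifying y triggers copy-on-modify. In addition, 3 is of type double, so the integer vector has to be coerced to double before the new value can be stored. This gives two copies: one for modifying the shared vector, the other for the change of atomic vector type. Assigning the integer literal 3L avoids the coercion, and only one copy is made: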

x <- 1:10
y <- x

tracemem(x)
#> [1] "<0x7fc64d88e0a8>"
c(obj_addr(x), obj_addr(y))
#> [1] "0x7fc64d88e0a8" "0x7fc64d88e0a8"
y[1] <- 3L # type integer
#> tracemem[0x7fc64d88e0a8 -> 0x7fc64d931078]: eval eval withVisible withCallingHandlers handle timing_fn evaluate_call <Anonymous> evaluate in_dir block_exec call_block process_group.block process_group withCallingHandlers process_file <Anonymous> <Anonymous>
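As a quick check of the coercion argument, the types can be inspected directly. The sketch below uses only base R; the names y_dbl and y_int are illustrative and not part of the exercise.

```r
x <- 1:10
y_dbl <- x
y_int <- x

y_dbl[1] <- 3   # double literal: the vector is coerced to double
y_int[1] <- 3L  # integer literal: the type is preserved

typeof(x)      # "integer" (x itself is never modified)
typeof(y_dbl)  # "double"
typeof(y_int)  # "integer"
```

Only the assignment of the double value changes the vector type, which is consistent with the extra copy reported by tracemem().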

Exercise 2

Problem

Starting from vectors of length 0, we can see that

lobstr::obj_size(integer(0))
#> 48 B
lobstr::obj_size(numeric(0))
#> 48 B

are both 48 bytes. Run the code below and see if you can deduce how R handles this numeric data in memory.

diff(sapply(0:100, function(x) obj_size(integer(x))))
c(obj_size(integer(20)), obj_size(integer(22)))
diff(sapply(0:100, function(x) obj_size(numeric(x))))
c(obj_size(numeric(10)), obj_size(numeric(14)))

Solution

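R allocates memory for vectors in chunks rather than one element at a time. An integer vector of length one occupies 56 bytes, 8 more than an empty integer vector. Since an integer element requires only 4 bytes, an integer vector of length two also fits in 56 bytes, so R does not need any more memory. Hence obj_size(integer(1)) and obj_size(integer(2)) are the same. A double element requires 8 bytes, so a numeric vector fills each chunk after half as many elements. The two c() calls make the same point for longer vectors: integer(20) and integer(22) report the same size, as do numeric(10) and numeric(14). The diff() calls show at which lengths the size jumps, which gives you an idea of how memory is allocated in chunks.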
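To see the chunks side by side, the sizes for the first few lengths can be tabulated. This is a minimal sketch; sizes_by_length() is a hypothetical helper introduced here for illustration, and the exact byte counts may depend on the R version and platform.

```r
library(lobstr)

# Hypothetical helper: object size in bytes for vectors of length 0..n_max
# created by `constructor` (for example integer or numeric).
sizes_by_length <- function(constructor, n_max = 8) {
  vapply(
    0:n_max,
    function(n) as.numeric(obj_size(constructor(n))),
    numeric(1)
  )
}

int_sizes <- sizes_by_length(integer)
dbl_sizes <- sizes_by_length(numeric)

# One row per length; a value repeated down a column marks lengths
# that fit into the same chunk.
data.frame(length = 0:8, integer = int_sizes, double = dbl_sizes)
```

Reading down each column, the size stays flat while additional elements still fit into the current chunk and jumps when a larger chunk is needed.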