Module 8: (Unsafe) Pointers
This module looks at the following concepts:
- What are pointers and what are they useful for?
- Why are pointers dangerous?
Pointers
In the previous module, we learned about the structure of the computer memory, specifically:
- It is made out of 1-byte cells, each labeled by a unique address.
- It consists of a stack (fixed-size automatic allocation) and a heap (dynamic allocation with manual memory management).
So far, we looked at many programs that store and manipulate data, and while we know now that this data is stored in memory in some cells with an address, we have only referred to them using variable names. For example:
    fn main() {
        let x1: i32 = 10;
        let x2: i32 = 20;
        let y: i32 = x1 + x2;
        println!("{y}");

        // This program stores 3 pieces of data, 10, 20, and 30
        // in different locations in memory.
        // We did not need to know where these locations are or what their
        // addresses are. Instead, we could refer to the values using
        // their variable names, e.g., x1 and x2.
        println!("address of x1 {:p}", &x1);
        println!("address of x2 {:p}", &x2);
        println!("address of y {:p}", &y);
    }
This was true because we either used fixed-size data that is stored on the stack (e.g., x1, x2, and y in the example above are all i32s stored on the stack) or dynamically sized data using Rust-provided dynamic types, such as Vec and String. While Vec and String perform a variety of dynamic heap allocations and memory manipulations, they do so behind the scenes so that programmers like us do not have to worry about them.
However, to get better at programming, we have to look at how these manipulations work behind the scenes.
Pointers and Addresses
Look at the code below.
    use std::mem::size_of;

    fn main() {
        let x1: i32 = 10;
        let ptr_x1: *const i32 = &x1 as *const i32;

        println!("address of x1 {:p}", &x1);
        println!("size of x1 {}", size_of::<i32>());

        println!("pointer ptr_x1 points to address {:p}", ptr_x1);
        println!("size of ptr_x1 {}", size_of::<*const i32>());
        println!("address of ptr_x1 {:p}", &ptr_x1);
    }
We have two variables here. The first is x1, which is located at some address.
The second, ptr_x1, has a strange type we have not seen before: *const i32.
This type is read as a constant pointer to an i32. ptr_x1 does not actually
contain an i32 inside of it. Instead, it contains a memory address, and if we follow
that memory address, we would find an i32 there.
We can confirm this by looking at the output of the above code:
- The address inside ptr_x1 matches the address of x1.
- The size of ptr_x1 is 8 bytes, which is the size required to store an address on a 64-bit machine, while an i32 is 4 bytes.
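The size observation can be checked in isolation. The sketch below compares the size of an i32 with the size of pointers to two very different types. The helper name pointer_sizes is made up for this illustration, and the actual numbers depend on the machine (8-byte pointers assume a 64-bit target).

```rust
use std::mem::size_of;

/// Hypothetical helper: returns the sizes (in bytes) of an i32,
/// a pointer to an i32, and a pointer to a much larger type.
fn pointer_sizes() -> (usize, usize, usize) {
    (
        size_of::<i32>(),
        size_of::<*const i32>(),
        size_of::<*const [u8; 1000]>(),
    )
}

fn main() {
    let (int_size, ptr_to_int, ptr_to_array) = pointer_sizes();
    // A pointer stores an address, so its size does not depend on
    // the size of the data it points to.
    println!("i32: {int_size} bytes");
    println!("*const i32: {ptr_to_int} bytes");
    println!("*const [u8; 1000]: {ptr_to_array} bytes");
    // usize is the integer type wide enough to hold an address.
    println!("usize: {} bytes", size_of::<usize>());
}
```

Both pointers have the same size even though one points to 4 bytes of data and the other to 1000 bytes.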
Notice that both x1 and ptr_x1 are variables of a fixed size. They are both stored on the stack.
x1 is stored as 4 bytes that contain the value 10, and ptr_x1 is stored as 8 bytes that contain
the address of x1. We can confirm this by observing that, in the output above, the address of ptr_x1 is 4 bytes after the address of x1 (the compiler chooses the exact placement, so it may differ on your machine).
In other words, the memory layout for this program looks as below.

Dereferencing a Pointer
The most important operation we can do to a pointer is dereference it. This tells the computer to follow the address stored inside the pointer and look at whatever value is stored there.
    fn main() {
        let x1: i32 = 10;
        let ptr_x1: *const i32 = &x1 as *const i32;
        unsafe {
            println!("{}", *ptr_x1);
        }
    }
Dereferencing is done using the * operator. So *ptr_x1 tells the computer to dereference ptr_x1, which in this case
leads to x1 and the value 10.
unsafe: Dereferencing a pointer is one of the most dangerous things you can do on a computer: at the time of dereferencing, there is no guarantee
that the address you dereference is a valid address, contains a value of the type you think it does, has its data initialized, or points
to data that has not been modified or used by other parts of the program.
As a result, dereferencing a pointer requires an explicit unsafe block. This tells the compiler to allow us to use operations that are
potentially dangerous, and that we as programmers take responsibility for them.
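To make "taking responsibility" concrete, here is a minimal sketch of one check a programmer can perform before dereferencing: testing for a null pointer. The helper read_or is hypothetical, not part of Rust. Note that the check only catches null; it cannot detect the other problems listed above, such as freed or uninitialized data.

```rust
/// Hypothetical helper: reads the i32 behind `ptr`, or returns
/// `fallback` if `ptr` is null.
///
/// Safety: if `ptr` is not null, the caller must still guarantee that it
/// points to a live, initialized i32. The null check cannot verify that.
unsafe fn read_or(ptr: *const i32, fallback: i32) -> i32 {
    if ptr.is_null() {
        fallback
    } else {
        unsafe { *ptr }
    }
}

fn main() {
    let x1: i32 = 10;
    let valid: *const i32 = &x1 as *const i32;
    let null: *const i32 = std::ptr::null();

    unsafe {
        println!("{}", read_or(valid, -1)); // prints 10
        println!("{}", read_or(null, -1)); // prints -1
    }
}
```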
A pointer simply holds the address of some piece of data. It does not contain a copy of that data. If the data the pointer points to changes after the pointer is created, dereferencing the pointer shows the new data.
    fn main() {
        let mut x1: i32 = 10;
        let ptr_x1: *const i32 = &x1 as *const i32;
        x1 = 30;
        unsafe {
            println!("{}", *ptr_x1);
        }
    }
We can also use dereferencing to modify the data the pointer points to.
    fn main() {
        let mut x1: i32 = 10;
        let ptr_x1: *mut i32 = &mut x1 as *mut i32;
        unsafe {
            *ptr_x1 = 50;
        }
        println!("{x1}");
    }
Note that this requires a mutable pointer, instead of a const pointer. Try to change the muts to const in the code above and see what errors the Rust compiler gives you!
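One thing this makes possible is letting a function change data that lives in its caller. The sketch below passes only an address to a helper, which then updates the value at that address. The helper add_in_place is a made-up name for illustration.

```rust
/// Hypothetical helper: adds `amount` to the i32 that `ptr` points to.
///
/// Safety: `ptr` must point to a live, initialized i32 that nothing else
/// is using while this function runs.
unsafe fn add_in_place(ptr: *mut i32, amount: i32) {
    unsafe {
        *ptr += amount;
    }
}

fn main() {
    let mut x1: i32 = 10;
    let ptr_x1: *mut i32 = &mut x1 as *mut i32;

    unsafe {
        add_in_place(ptr_x1, 5);
        add_in_place(ptr_x1, 5);
    }

    // The function received only an address, yet x1 itself changed.
    println!("{x1}"); // prints 20
}
```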
Allocating Data on the Heap
As discussed in the previous module, a String or a Vec simply contains a len, capacity, and a pointer to the characters or elements,
which are stored on the heap. This enables dynamic resizing, e.g., by adding many elements to the vector.
    fn main() {
        let mut v: Vec<i32> = vec![10, 20];
        v.push(30);
        println!("{:?}", v);
    }
After line 2, when the initial vector is created, the memory layout looks like this:

Then, after pushing 30 to the vector, the memory layout becomes:

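The three parts of a Vec can also be observed from code. The sketch below prints the heap pointer, length, and capacity before and after the push. The helper len_and_capacity_around_push is a made-up name, and the exact capacities and addresses you see are chosen by the Vec implementation and the allocator, so treat any specific numbers as machine-dependent.

```rust
/// Hypothetical helper: builds the vector from the example above and
/// returns its (len, capacity) before and after the push.
fn len_and_capacity_around_push() -> ((usize, usize), (usize, usize)) {
    let mut v: Vec<i32> = vec![10, 20];
    let before = (v.len(), v.capacity());
    v.push(30);
    let after = (v.len(), v.capacity());
    (before, after)
}

fn main() {
    let mut v: Vec<i32> = vec![10, 20];
    // as_ptr() exposes the heap pointer stored inside the Vec.
    println!("elements at {:p}, len {}, capacity {}", v.as_ptr(), v.len(), v.capacity());

    v.push(30);
    // If the old buffer was full, push moved the elements to a larger
    // allocation, so the address may differ from the one printed above.
    println!("elements at {:p}, len {}, capacity {}", v.as_ptr(), v.len(), v.capacity());

    let (before, after) = len_and_capacity_around_push();
    println!("before push: {before:?}, after push: {after:?}");
}
```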
How does this work? Programmers can tell the computer to allocate memory on the heap using a memory allocator (malloc).
    extern crate libc;

    fn main() {
        unsafe {
            let ptr: *mut i32 = libc::malloc(4) as *mut i32;
            println!("ptr points to address {:p}", ptr);
        }
    }
Allocation, like most pointer operations, is unsafe. When we allocate memory on the heap, we need to tell the computer how many bytes to allocate. In this case, we allocate 4 bytes, which is the size of an i32.
But what values does the allocated memory contain? The answer is: it depends. malloc does not initialize the memory it returns, so the memory holds whatever bytes happen to be there. For a simple type such as an integer, this means an arbitrary number (often 0 in practice, but this is never guaranteed). For a more complicated type, such as a String or a Vec, the situation is worse: these types contain pointers inside of them, and an uninitialized one would hold a garbage address!
Thus, you are not supposed to read data on the heap until you have initialized it first, using std::ptr::write(...).
    extern crate libc;
    use std::mem::size_of;

    fn main() {
        unsafe {
            let my_ptr: *mut String = libc::malloc(size_of::<String>()) as *mut String;
            println!("my_ptr points to address {:p}", my_ptr);

            // Initialize the allocated memory before reading from it.
            std::ptr::write(my_ptr, String::from("hello!"));
            println!("{}", *my_ptr);
        }
    }
Freeing Data on the Heap
Remember that unlike the stack, the heap is not automatically managed by the computer or Rust.
Thus, if we manually allocate data on the heap, we are responsible for manually freeing it when we are done with it.
    extern crate libc;
    use std::mem::size_of;

    fn main() {
        unsafe {
            let my_ptr: *mut String = libc::malloc(size_of::<String>()) as *mut String;
            println!("my_ptr points to address {:p}", my_ptr);

            std::ptr::write(my_ptr, String::from("hello!"));
            println!("{}", *my_ptr);

            libc::free(my_ptr as *mut libc::c_void);
        }
    }
However, free only releases the memory allocated by malloc. If that memory in and of itself contains more pointers to heap allocations, those
will not get freed. For example, the String in the above example is never properly destroyed, so the heap buffer holding its characters leaks. To avoid this, we must remember to
call std::ptr::read(...) before freeing: it moves the String out of the allocation, and the moved-out value is then destroyed like any other Rust value.
    extern crate libc;
    use std::mem::size_of;

    fn main() {
        unsafe {
            let my_ptr: *mut String = libc::malloc(size_of::<String>()) as *mut String;
            println!("my_ptr points to address {:p}", my_ptr);

            std::ptr::write(my_ptr, String::from("hello!"));
            println!("{}", *my_ptr);

            // Move the String out and drop it, which frees its character buffer.
            drop(std::ptr::read(my_ptr));
            // Then free the memory that held the String itself.
            libc::free(my_ptr as *mut libc::c_void);
        }
    }
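To see that the destructor really runs, the following sketch stores a value whose destructor counts how often it is called. Two things here are assumptions for the sake of a self-contained example: it uses the standard library's std::alloc::alloc and dealloc in place of libc::malloc and libc::free, and it uses std::ptr::drop_in_place, which destroys the value where it sits, as an alternative to std::ptr::read. The names Noisy, DROPS, and allocate_and_clean_up are made up.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many times a Noisy value has been destroyed.
static DROPS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical type whose destructor records that it ran.
struct Noisy(String);

impl Drop for Noisy {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Allocates a Noisy on the heap, destroys it, frees the memory, and
/// returns how many destructors ran during this call.
fn allocate_and_clean_up() -> usize {
    let before = DROPS.load(Ordering::SeqCst);
    unsafe {
        let layout = Layout::new::<Noisy>();
        let ptr = alloc(layout) as *mut Noisy;
        assert!(!ptr.is_null(), "allocation failed");

        // Initialize the memory before using it.
        std::ptr::write(ptr, Noisy(String::from("hello!")));
        println!("{}", (*ptr).0);

        // Run the destructor in place (this frees the inner String buffer)...
        std::ptr::drop_in_place(ptr);
        // ...then free the memory that held the Noisy itself.
        dealloc(ptr as *mut u8, layout);
    }
    DROPS.load(Ordering::SeqCst) - before
}

fn main() {
    println!("destructors run: {}", allocate_and_clean_up()); // prints 1
}
```

If the drop_in_place line is removed, the count stays at 0: the outer memory is still freed, but the String inside is never destroyed.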
Pointer Arithmetic
We can also allocate a sequence of elements on the heap. For example,
    extern crate libc;
    use std::mem::size_of;

    fn main() {
        unsafe {
            // Allocate 2 i32s (2 * 4 bytes = 8 bytes).
            let my_ptr: *mut i32 = libc::malloc(size_of::<i32>() * 2) as *mut i32;
            println!("address of my_ptr is {:p}", my_ptr);
        }
    }
In this case, the returned pointer points to the first one of the two i32s. We can get a pointer to the second one using add(...).
    extern crate libc;
    use std::mem::size_of;

    fn main() {
        unsafe {
            let ptr0: *mut i32 = libc::malloc(size_of::<i32>() * 2) as *mut i32;
            // add(1) moves forward by one i32, not by one byte.
            let ptr1: *mut i32 = ptr0.add(1);
            println!("address of ptr0 is {:p}", ptr0);
            println!("address of ptr1 is {:p}", ptr1);
        }
    }
Notice how ptr1 is 4 bytes (i.e., one i32 value) away from ptr0.
Remember that this kind of pointer manipulation can be very dangerous. What if we moved the pointer forward too much and got outside the allocated region? It is your job as a programmer to ensure this does not happen!
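One common way to stay inside the allocated region is to drive every add(...) from a loop whose bound is the number of elements that were allocated. The sketch below does this for n i32s. As an assumption to keep it self-contained, it uses std::alloc::alloc and dealloc from the standard library rather than libc::malloc and libc::free, and fill_and_sum is a made-up helper name.

```rust
use std::alloc::{alloc, dealloc, Layout};

/// Hypothetical helper: heap-allocates `n` i32s, fills slot i with
/// i * 10, and returns the sum of all slots.
fn fill_and_sum(n: usize) -> i32 {
    assert!(n > 0, "this sketch does not handle empty allocations");
    let layout = Layout::array::<i32>(n).expect("size overflow");
    let mut sum = 0;
    unsafe {
        let base = alloc(layout) as *mut i32;
        assert!(!base.is_null(), "allocation failed");

        // Every index stays below n, so add(i) never leaves the allocation.
        for i in 0..n {
            std::ptr::write(base.add(i), (i as i32) * 10);
        }
        for i in 0..n {
            sum += *base.add(i);
        }

        dealloc(base as *mut u8, layout);
    }
    sum
}

fn main() {
    println!("{}", fill_and_sum(4)); // 0 + 10 + 20 + 30 = 60
}
```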
You will have a chance to practice dealing with pointers extensively when implementing the FastVec project.
Why are Pointers Dangerous?
The reason pointers are really dangerous is that the Rust compiler cannot guarantee whether they point to valid data or not.
For example, the pointer may point to data that was freed, point to data that went out of scope and was destroyed, or be the result of some operation that takes it out of range.
For example, one of the most common mistakes programmers in pointer-based languages make is to double-free, i.e., to free the same memory twice. If you run the program below, you will likely see it crash with a double free error!
    extern crate libc;
    use std::mem::size_of;

    fn main() {
        unsafe {
            let ptr0: *mut i32 = libc::malloc(size_of::<i32>() * 2) as *mut i32;
            libc::free(ptr0 as *mut libc::c_void);
            // Freeing the same pointer again is a double free.
            libc::free(ptr0 as *mut libc::c_void);
        }
    }
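A defensive habit borrowed from pointer-based languages is to overwrite a pointer with null right after freeing it, and to skip the free when the pointer is already null. The sketch below illustrates the idea with a hypothetical helper, free_and_null, and assumes std::alloc::alloc and dealloc in place of libc::malloc and libc::free so that it runs without extra crates.

```rust
use std::alloc::{alloc, dealloc, Layout};

/// Hypothetical helper: frees the i32 behind `ptr` unless `ptr` is null,
/// and always returns a null pointer for the caller to store.
///
/// Safety: a non-null `ptr` must come from `alloc(Layout::new::<i32>())`
/// and must not have been freed already.
unsafe fn free_and_null(ptr: *mut i32) -> *mut i32 {
    if !ptr.is_null() {
        unsafe { dealloc(ptr as *mut u8, Layout::new::<i32>()) };
    }
    std::ptr::null_mut()
}

fn main() {
    unsafe {
        let mut ptr = alloc(Layout::new::<i32>()) as *mut i32;
        assert!(!ptr.is_null(), "allocation failed");
        std::ptr::write(ptr, 7);

        ptr = free_and_null(ptr); // frees the memory
        ptr = free_and_null(ptr); // ptr is null now, so this does nothing
        println!("ptr is null: {}", ptr.is_null()); // prints true
    }
}
```

This only protects the one variable that gets overwritten. Any other copy of the pointer still holds the old address, so the habit reduces the risk but does not remove it.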
More serious errors are also possible (and sometimes easy to make). Consider the program below.
    fn main() {
        let mut ptr: *mut Vec<i32> = std::ptr::null_mut();
        {
            let mut v: Vec<i32> = vec![1, 2, 3, 4, 5];
            ptr = &mut v as *mut Vec<i32>;
            // v goes out of scope here and gets destroyed.
        }

        println!("ptr is now a dangling ptr");
        unsafe {
            println!("{}", (&*ptr)[0]);
        }
    }
To avoid these issues, Rust actively dissuades the use of raw pointers. Instead, Rust encourages us to use its safe references, a pointer-like concept that provides many of the advantages of pointers without the risk!
We will see references in more detail in the next module.