My Rust Journey - 4 oct 2024
This is the third post about my journey learning the Rust programming language using the Rust Book. Previous posts include:
Chapter 3: Shadowing, mutability, variables and constants, data types, and control flow
I am documenting this from my non-developer perspective because I think it is useful for other people interested in learning Rust.
At this stage, you have already installed Rust on your machine, and you are ready to write and run your first Rust programs.
I am using VS Code with the rust-analyzer extension. I am working on an M1 Mac.
The following tutorial covers Chapter 4 of the Rust Book. It is meant to be a summary, used alongside the book as a complementary source of information.
Ownership in Rust allows for memory safety without needing a garbage collector. In Rust, memory is managed through a set of rules that the compiler checks.
Stack and Heap Memory
Stack | Heap |
---|---|
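Stores values in the order it gets them and removes them in the opposite order (last in, first out). | Less organized than the stack: the allocator finds a big enough free spot and returns a pointer to it |
Data must have a known, fixed size at compile time. | The pointer has a fixed size; the data it points to can have an unknown or changing size |
Fast read/write | Slower read/write, because the allocator must find space and every access follows a pointer |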
A scope is the range within a program for which a value is valid.
Scopes are delimited with {}. The code below fails to compile because s is used outside its scope.
fn main() {
{
let s = "hello"; //string literal
}
println!("{s}")
}
String literals are immutable string values that are hardcoded into the program. This is good if we know the string when we write the program, but what if we do not? What if we want to use user input? For these situations, Rust has the String type, which stores text on the heap and can therefore hold text whose content or size is unknown at compile time. The code below creates a mutable String variable s, appends a literal to it, and prints s:
let mut s = String::from("hello"); //String type
s.push_str(", world!");
println!("{s}");
The call to String::from requests memory from the memory allocator at runtime. When the variable goes out of scope (at the closing }), Rust automatically calls the drop function to return that memory.
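To make the automatic drop call visible, here is a minimal sketch. The `Tracked` type and the `drop_demo` function are hypothetical names made up for this illustration (they are not from the Rust Book), and the sketch uses a lifetime annotation (`'a`), which the book only covers in a later chapter. The type implements the standard `Drop` trait and records a message when Rust drops it.

```rust
// Hypothetical type for illustration: it writes to a shared log when dropped.
struct Tracked<'a> {
    log: &'a mut Vec<&'static str>,
}

impl Drop for Tracked<'_> {
    // Rust calls this automatically when a `Tracked` value goes out of scope.
    fn drop(&mut self) {
        self.log.push("drop called");
    }
}

// Returns the recorded events in the order they happened.
fn drop_demo() -> Vec<&'static str> {
    let mut log = Vec::new();
    {
        let mut t = Tracked { log: &mut log };
        t.log.push("inside the scope");
    } // `t` goes out of scope here, so `drop` runs at this closing brace
    log.push("after the scope");
    log
}

fn main() {
    for event in drop_demo() {
        println!("{event}");
    }
}
```

The log shows the drop message between the other two, which is the point: nothing in the code calls `drop` explicitly.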
Moving Variables
A String is made of three parts, stored on the stack:
A pointer to the heap memory holding the string content
Length, i.e., how much memory in bytes the content currently uses
Capacity, i.e., the total amount of memory in bytes received from the allocator
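These three parts can be inspected with methods from the standard library. The sketch below is only an illustration: `len_and_capacity` is a hypothetical helper name, and the observation that a String value occupies three machine words on the stack reflects how current Rust versions lay it out, not a guarantee of the language.

```rust
use std::mem::size_of;

// Hypothetical helper: returns (length, capacity) in bytes for a String.
fn len_and_capacity(s: &String) -> (usize, usize) {
    (s.len(), s.capacity())
}

fn main() {
    // Ask the allocator for room for at least 16 bytes, then use only 5 of them.
    let mut s = String::with_capacity(16);
    s.push_str("hello");

    let (len, cap) = len_and_capacity(&s);
    println!("pointer to heap data: {:p}", s.as_ptr());
    println!("length: {len} bytes, capacity: {cap} bytes");

    // Size of the String value itself (pointer + length + capacity),
    // compared with the size of three machine words.
    println!("size of String on the stack: {} bytes", size_of::<String>());
    println!("size of three machine words: {} bytes", 3 * size_of::<usize>());
}
```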
The code below:
let s1 = String::from("hello");
let s2 = s1;
copies the pointer, length, and capacity, but not the data on the heap. That is, Rust does something like a shallow copy (and not a deep copy). The data pointers of s1 and s2 point to the same location. If both variables were still valid when they went out of scope, they would both try to free the same memory (a double-free error). Freeing memory twice can lead to memory corruption and security vulnerabilities. To ensure safety, Rust no longer considers s1 valid after s2 is declared. Because s1 is invalidated, we do not call this a shallow copy but a move. We say s1 was moved into s2.
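One way to convince yourself that a move does not copy the heap data is to compare the address of the data before and after the move. This is a small sketch; the function name `move_keeps_heap_address` is made up for this post.

```rust
// Returns true if the heap data sits at the same address before and after the move.
fn move_keeps_heap_address() -> bool {
    let s1 = String::from("hello");
    let before = s1.as_ptr(); // raw address of the heap data, not a borrow

    let s2 = s1; // s1 is moved into s2
    // println!("{s1}"); // uncommenting this line gives a compile error: s1 was moved

    let after = s2.as_ptr();
    before == after
}

fn main() {
    println!("same heap address after move: {}", move_keeps_heap_address());
}
```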
To make a deep copy, we can use the clone method.
let s1 = String::from("hello");
let s2 = s1.clone(); // deep copy
In this way, both variables are valid.
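As a quick check of what "deep copy" means here, the sketch below compares the text and the heap addresses of the two variables. The function name `clone_demo` is hypothetical.

```rust
// Returns (same_text, same_buffer) for a String and its clone.
fn clone_demo() -> (bool, bool) {
    let s1 = String::from("hello");
    let s2 = s1.clone(); // deep copy: the heap data is duplicated

    let same_text = s1 == s2; // the contents are equal
    let same_buffer = s1.as_ptr() == s2.as_ptr(); // but each has its own heap allocation
    (same_text, same_buffer)
}

fn main() {
    let (same_text, same_buffer) = clone_demo();
    println!("same text: {same_text}, same heap buffer: {same_buffer}");
}
```

Because the heap data is duplicated, cloning a large String costs more than moving it.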
Ownership and Functions
Let’s take the example below.
let s = String::from("hello");
takes_ownership(s); // s is moved into the fn and is no longer valid here afterwards
let x = 5;
makes_copy(x); // i32 implements the Copy trait, so x is copied into the fn and is still valid afterwards
Where
fn takes_ownership(some_string: String){
println!("{some_string}");
}
fn makes_copy(some_integer: i32){
println!("{some_integer}");
}
The s variable is no longer valid after the function takes_ownership is called. This is because s is of String type, so passing it to the function moves it into some_string, and the drop function is called when some_string goes out of scope at the end of takes_ownership. The drop function is called for types that own resources such as allocated heap memory. It is not called for primitive types like integers, which implement the Copy trait: they are not moved but trivially copied (stack memory). This is why x is valid after calling the function makes_copy.
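Ownership can also travel in the other direction: returning a value from a function moves it to the caller. Below is a minimal sketch of that idea, modeled on the return-value examples in the same chapter of the Rust Book. The function body (appending text) is my own addition for illustration.

```rust
// Takes ownership of a String, modifies it, and hands ownership back to the caller.
fn takes_and_gives_back(mut some_string: String) -> String {
    some_string.push_str(", world!");
    some_string // returned value: ownership moves out to the caller
}

fn main() {
    let s1 = String::from("hello");
    let s2 = takes_and_gives_back(s1); // s1 is moved in, the result is moved into s2
    println!("{s2}"); // s2 is valid here, s1 is not
}
```

Passing ownership back and forth like this works but gets tedious, which is the motivation for references.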
References and Borrowing
A reference is like a pointer in that it is an address we can follow to access data stored at that address. Unlike a pointer, a reference is guaranteed to point to a valid value for the life of that reference.
See the example below:
let s1 = String::from("hello");
let len = calculate_length(&s1); //borrows s1 to calculate length
println!("The length of '{s1}' is {len}."); //using s1 and len
Where
fn calculate_length(s: &String) -> usize { // s is a reference to a String
s.len()
} // s goes out of scope, but because it is a reference, the String won’t be dropped
The &s1 syntax creates a reference that refers to s1 without taking ownership of it. Consequently, the String will not be dropped when the reference stops being used. The action of creating a reference is called “borrowing”. References are immutable by default.
Mutable References
References can be mutable, as shown below.
let mut s = String::from("hello");
change(&mut s);
Where
fn change(some_string: &mut String) {
some_string.push_str(", world!");
}
You cannot borrow s as a mutable reference more than once at a time. With this restriction, Rust prevents “data races” at compile time. A data race happens when these three conditions occur together:
Two or more pointers access data at the same time
At least one of the pointers is used to write data
There is no mechanism being used to synchronize access to the data
You cannot have simultaneous mutable references, but you can have multiple mutable references one after the other by using scopes, as shown below:
let mut s = String::from("hello");
{
let r1 = &mut s;
} // r1 goes out of scope here, so a new mutable reference can be created
let r2 = &mut s;
You cannot have a mutable reference while an immutable reference to the same value is still in use. The immutable references must no longer be used (a reference's scope ends at its last use) before introducing a mutable reference to the same value. See below:
let mut s = String::from("hello");
let r1 = &s;
let r2 = &s;
println!("{r1} and {r2}"); // r1 and r2 are not used after this point
let r3 = &mut s; // new mutable reference to the same value
println!("{r3}");
Dangling References
A dangling reference is a pointer to memory that has already been freed while the pointer is still kept around. The Rust compiler guarantees that references never dangle, so it rejects the code below:
fn dangle() -> &String {
let s = String::from("hello");
&s
}
The function dangle returns a reference to the String s. Because s is created inside dangle, it goes out of scope and is dropped when the function ends, so the returned reference would point to freed memory. The compiler rejects this code with an error.
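The usual fix, as suggested in the Rust Book, is to return the String itself instead of a reference to it. Ownership then moves out of the function and nothing is freed. A minimal sketch:

```rust
// Returns the String directly: ownership moves to the caller, so nothing dangles.
fn no_dangle() -> String {
    let s = String::from("hello");
    s
}

fn main() {
    let s = no_dangle();
    println!("{s}");
}
```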
Slice Type
String Slices
A string slice is a reference to a part of a String.
let s = String::from("Hello, world!");
let hello = &s[0..6]; // "Hello," (the comma is at index 5)
let world = &s[7..13]; // "world!"
Where the syntax is [start index..end index], and end index is one more than the last position in the slice. In Rust, you can shorten slices as follows:
&s[0..6] can be &s[..6] (slice from the start)
&s[7..13] can be &s[7..] (slice up to the end)
&s[0..13] can be &s[..] (slice entire string)
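The short and long forms can be checked against each other. In the sketch below, `shorthand_slices` is a hypothetical helper with the indices from the example hardcoded, so it panics on text shorter than 7 bytes. Note also that slice indices are byte offsets and must fall on character boundaries, which matters for non-ASCII text.

```rust
// Hypothetical helper: returns the three shorthand slices from the example above.
// The indices are hardcoded, so the input must be at least 7 bytes long.
fn shorthand_slices(s: &str) -> (&str, &str, &str) {
    (&s[..6], &s[7..], &s[..])
}

fn main() {
    let s = String::from("Hello, world!");

    // Each shorthand gives the same slice as its long form.
    assert_eq!(&s[0..6], &s[..6]);
    assert_eq!(&s[7..13], &s[7..]);
    assert_eq!(&s[0..13], &s[..]);

    let (start, end, whole) = shorthand_slices(&s);
    println!("{start} | {end} | {whole}");
}
```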
Let’s take a look at the function below:
fn first_word(s: &String) -> &str {
let bytes = s.as_bytes(); //conversion to byte array
for (i, &item) in bytes.iter().enumerate() { // iteration over bytes
if item == b' ' {
return &s[0..i];
}
}
&s[..]
}
The function takes a reference to a String as input and returns a string slice. The String is first converted into a byte array, then a for loop iterates through the bytes, and if it encounters a space, it returns a string slice from string start to that space. If there is no space, it just returns the whole string slice.
We can take slices of literals and String values. We can thus improve the previous function as follows:
fn first_word(s: &str) -> &str {
This allows us to pass both Strings and string literals because string literals are already string slices. If we have a string slice, we can pass it directly, while if we have a String, we can just pass a slice of that String.
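Putting the improved signature together with the earlier body gives the complete sketch below. The calls in main are my own usage examples. Passing `&owned` works as well as `&owned[..]` because Rust converts a reference to a String into a string slice automatically.

```rust
// Returns the first word of the text, or the whole text if it contains no space.
fn first_word(s: &str) -> &str {
    let bytes = s.as_bytes();

    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[..i]; // slice from the start up to the first space
        }
    }

    s // no space found: the whole input is one word
}

fn main() {
    let owned = String::from("hello world");
    let literal = "goodbye world";

    println!("{}", first_word(&owned[..])); // a slice of the String
    println!("{}", first_word(&owned)); // a reference to the String
    println!("{}", first_word(literal)); // a string literal is already a slice
}
```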
So far, we have learned the basics of ownership, references and borrowing, and slices. See you in the next post!