Traits in Rust 1: Traits of our own
Disclaimer: This section assumes you have at least read the chapter about ownership and string slices.
If not, check it out, and also make sure you get some hands-on experience with Rust.
Alternatively, for hands-on experience you can also try writing a grep clone:
- program accepts two parameters, file path and string
- the file in question is read line by line
- print to standard output every line that contains string
Your program should gracefully handle errors and invalid input, i.e. don't do things that can panic, and don't use .unwrap() or .expect(). The grep clone should not crash if parameters are missing, the file does not exist, or it is unreadable as text.
Design your program in such a way that you can write some tests.
Prerequisites
- All of the previous chapter
- Rust modules
- Tuples
- .expect() and .unwrap()
- Generic / trait bound syntax
One of the biggest culture shocks when first getting into Rust is its approach to Object-Oriented Programming and its memory management model.
Rust follows a (mostly) lexical, RAII-like memory management model, in which a value is allocated where it is declared and de-allocated when it goes out of scope. No garbage collector is involved, which allows greater control over memory and makes Rust more suitable for embedded and performance-critical applications.
To read up on Rust's memory management, check out the relevant section of Ownership and string slices.
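As a minimal sketch of scope-based de-allocation, the example below uses a hypothetical Noisy type (not part of this chapter) whose Drop implementation records the moment the value goes out of scope. The names Noisy and scope_demo are assumptions made purely for illustration.

```rust
use std::cell::RefCell;

/// Hypothetical type used only for illustration: it logs when it is dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Noisy<'_> {
    // `drop` runs automatically when the value goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("dropped {}", self.name));
    }
}

/// Creates two values in nested scopes and returns the order of events.
fn scope_demo() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _outer = Noisy { name: "outer", log: &log };
        {
            let _inner = Noisy { name: "inner", log: &log };
            log.borrow_mut().push("end of inner scope".to_string());
        } // `_inner` is dropped here
        log.borrow_mut().push("end of outer scope".to_string());
    } // `_outer` is dropped here
    log.into_inner()
}

fn main() {
    for event in scope_demo() {
        println!("{event}");
    }
}
```

Each value is cleaned up at the closing brace of the scope that owns it, with no explicit free call and no garbage collector.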
According to Statista, some of the most widely used programming languages are:
- JavaScript
- Python
- Java
- TypeScript
- C#
- C++
- PHP (yuck :))
All of these support object-oriented programming based on types, and most of them also use a class-based inheritance model. They are built from the ground up around this paradigm and foster idioms specific to it.
Rust teeters on the edge of OOP and Functional Programming, and so it uses a different model.
Object-oriented features in Rust
OOP in Rust is one of the biggest culture shocks newcomers experience:
- Rust does not have classes
- Rust does not have type inheritance
Visibility and privacy
Just like you might be used to from your other languages, Rust has methods and visibility modifiers to facilitate encapsulation and information hiding.
For instance:
#![allow(dead_code)]

pub fn public_function() {
    println!("Available from everywhere");
}

fn private_function() {
    println!("Only accessible by this module and its descendants");
}

pub(crate) fn my_public_in_crate_function() {
    println!("Accessible from anywhere within this crate");
}

/// This is roughly equivalent to the following file structure
///
/// my_module.rs
/// my_module/
///   - child_module.rs
///   - child_module/
///     - grand_child_module.rs
///   - other_child.rs
mod my_module {
    pub mod child_module {
        pub mod grand_child_module {
            pub(super) fn public_in_grand_child_() {
                println!("This function is only accessible from this module (and its descendants) and its parent (super)");
            }

            pub(self) fn public_in_self() {
                println!("Only accessible by this module and its descendants, effectively same as private");
            }

            pub(in crate::my_module) fn public_in_my_module() {
                println!("Public from my_module onwards");
            }
        }
    }

    pub mod other_child {
        pub(super) fn public_in_my_module() {
            println!("Accessible from my_module onwards");
        }
    }
}
As you can see, Rust allows a fair amount of control over visibility and privacy. You can read up on it more here: https://doc.rust-lang.org/reference/visibility-and-privacy.html
Methods
Methods are split from data via an implementation block:
pub struct AveragedCollection {
    list: Vec<i32>,
    average: f64,
}

impl AveragedCollection {
    pub fn add(&mut self, value: i32) {
        self.list.push(value);
        self.update_average();
    }

    pub fn remove(&mut self) -> Option<i32> {
        let result = self.list.pop();
        match result {
            Some(value) => {
                self.update_average();
                Some(value)
            }
            None => None,
        }
    }

    pub fn average(&self) -> f64 {
        self.average
    }

    fn update_average(&mut self) {
        let total: i32 = self.list.iter().sum();
        self.average = total as f64 / self.list.len() as f64;
    }
}
(taken from rust book [17.2])
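To illustrate the different kinds of functions an implementation block can hold, here is a minimal sketch built around a hypothetical Counter type (an assumption for illustration, not something from the Rust book). It shows an associated function without self, and methods taking &self, &mut self and self.

```rust
/// Hypothetical type, used only to illustrate method receivers.
pub struct Counter {
    count: u32, // private field: only reachable through the methods below
}

impl Counter {
    /// Associated function (no `self`): called as `Counter::new()`.
    pub fn new() -> Self {
        Self { count: 0 }
    }

    /// Borrows `self` mutably because it changes the counter.
    pub fn increment(&mut self) {
        self.count += 1;
    }

    /// Borrows `self` immutably because it only reads.
    pub fn count(&self) -> u32 {
        self.count
    }

    /// Takes `self` by value: the counter is consumed and cannot be used afterwards.
    pub fn into_inner(self) -> u32 {
        self.count
    }
}

fn main() {
    let mut counter = Counter::new();
    counter.increment();
    counter.increment();
    println!("count = {}", counter.count());

    let final_value = counter.into_inner();
    println!("final = {final_value}");
    // counter.count(); // would not compile: `counter` was moved by into_inner()
}
```

The choice of receiver is part of the method's contract: callers can tell from the signature alone whether a call may modify or consume the value.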
User-defined types (structs and enums)
In place of classes, Rust's user-defined types fall into these two categories:
- structures - can be C-like structs or tuple structs. Rust also allows empty, zero-sized structs (also called unit structs) as a useful abstraction for working with traits
- enums - essentially algebraic data types you might be used to from Haskell / ML / OCaml / Scala and so on. In Rust, they are implemented as tagged unions
BTW: Rust also supports plain C-like unions, however, these are very rarely used, and their handling requires unsafe code, since the compiler can't always guarantee you select the correct union member. (Compare with enums where the valid union member is stored in the tag, so it is always known to the compiler)
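The categories above can be sketched as follows. All type names here (Point, Meters, Marker, Shape) and the area function are hypothetical and chosen only for illustration.

```rust
/// C-like struct with named fields.
struct Point {
    x: f64,
    y: f64,
}

/// Tuple struct: fields are accessed by position (`.0`, `.1`, ...).
struct Meters(f64);

/// Unit struct: zero-sized, useful as a carrier for trait implementations.
struct Marker;

/// Enum whose variants carry different data (a tagged union).
enum Shape {
    Circle { center: Point, radius: f64 },
    Rectangle(Point, Point),
    Empty,
}

fn area(shape: &Shape) -> f64 {
    // `match` has to cover every variant; the tag tells it which one is active.
    match shape {
        Shape::Circle { radius, .. } => std::f64::consts::PI * radius * radius,
        Shape::Rectangle(a, b) => ((b.x - a.x) * (b.y - a.y)).abs(),
        Shape::Empty => 0.0,
    }
}

fn main() {
    let rect = Shape::Rectangle(Point { x: 0.0, y: 0.0 }, Point { x: 2.0, y: 3.0 });
    let circle = Shape::Circle { center: Point { x: 1.0, y: 1.0 }, radius: 1.0 };
    if let Shape::Circle { center, .. } = &circle {
        println!("circle centered at ({}, {})", center.x, center.y);
    }
    println!("rectangle area: {}", area(&rect));
    println!("circle area: {}", area(&circle));
    println!("empty area: {}", area(&Shape::Empty));

    let distance = Meters(4.5);
    println!("distance: {} m", distance.0);
    println!("size of unit struct: {}", std::mem::size_of::<Marker>());
}
```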
There are certain conventions observed when working with structs and enums, which you can read about here:
Traits
The heavy lifters of Rust's OOP story are not structs or enums, but rather traits. A trait describes common behavior. In less abstract terms, it is essentially a set of methods a type is expected to provide if it implements (satisfies) the trait. There is no such thing as duck typing in Rust, so you have to pledge allegiance to a trait manually:
trait Quack {
    fn quack(&self);
}

struct Duck;

// Duck implements Quack
// it has the trait method quack()
impl Quack for Duck {
    fn quack(&self) {
        println!("quack");
    }
}

struct Human;

// Human does not implement Quack
// it has a **type** method quack()
// but that is no substitute for the real
// art
impl Human {
    fn quack(&self) {
        println!("I quack, therefore I am");
    }
}
TIP: A trait may also have zero methods. We refer to these as marker traits. Several of these are found in the compiler and they are usually ascribed special meaning, for example, the std::marker::Copy trait enables copy semantics for a type, as mentioned in the chapter about ownership
The standard library has many traits in it, some of which are special and describe specific behavior, such as Send and Sync, which denote whether it is safe to move a type between threads and to access it from several threads, respectively, or Copy, which switches a type from move semantics to copy semantics (e.g. the primitive numeric types, bool and char are all Copy).
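As a small sketch of marker traits in practice, the example below derives Copy for one hypothetical type and leaves it off another, and also defines a user-written marker trait. The names Celsius, Label, Trusted and accept_trusted are assumptions made for illustration only.

```rust
/// `Copy` is a marker trait; deriving it (together with `Clone`) opts in to copy semantics.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Celsius(f64);

/// No `Copy` here, so values of this type are moved.
#[derive(Debug, PartialEq)]
struct Label(String);

/// Hypothetical user-defined marker trait: no methods, only a promise.
trait Trusted {}

impl Trusted for Celsius {}

/// Only accepts types that carry the marker.
fn accept_trusted<T: Trusted>(value: T) -> T {
    value
}

fn main() {
    let a = Celsius(21.5);
    let b = a; // copy: `a` stays usable
    println!("{a:?} {b:?}");

    let first = Label("moved".to_string());
    let second = first; // move: using `first` after this line would not compile
    println!("{second:?}");

    println!("{:?}", accept_trusted(a));
    // accept_trusted(second); // would not compile: Label does not implement Trusted
}
```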
You can see some of the commonly used traits in the following links:
- https://stevedonovan.github.io/rustifications/2018/09/08/common-rust-traits.html
- https://blog.rust-lang.org/2015/05/11/traits.html
As especially the second link elaborates, traits are the cornerstone of Rust generics, for which Rust provides two models, static and dynamic dispatch.
Static dispatch
Here is how we can use our quackers with static
dispatch by expanding on our previous example with a new duck and a generic
function called ducks_say():
trait Quack {
    fn quack(&self);
}

struct Duck;

// Duck implements Quack
// it has the trait method quack()
impl Quack for Duck {
    fn quack(&self) {
        println!("quack");
    }
}

struct Human;

// Human does not implement Quack
// it has a **type** method quack()
// but that is no substitute for the real
// art
impl Human {
    fn quack(&self) {
        println!("I quack, therefore I am");
    }
}

struct FormalDuck {
    name: String,
}

impl FormalDuck {
    // create a new duck
    fn new(name: String) -> Self {
        Self { name }
    }
}

impl Quack for FormalDuck {
    fn quack(&self) {
        println!(
            "Good evening, ladies and gentlemen, my name is {}. Without further ado: quack",
            self.name
        );
    }
}

// You could also write
// fn ducks_say<T>(quacker: T)
// where
//     T: Quack
//
// Longer trait bounds are generally more suitable in the where block for readability reasons
fn ducks_say<T: Quack>(quacker: T) {
    quacker.quack()
}
// the T: Trait (+ Othertrait...)* syntax is called a trait bound

fn main() {
    let duck = Duck;
    let human = Human;
    let formal = FormalDuck::new("Ernesto".to_string());

    ducks_say(duck);
    //ducks_say(human); <-- this won't compile because Human does not implement Quack
    ducks_say(formal);
}
Generic functions that don't specify any trait bounds are seldom useful, and you'll rarely see them in Rust.
However, you might be surprised to learn that this will not compile:
fn no_param<T>(_: T) {}

fn main() {
    let my_str = "Hello, Braiins!";
    no_param(*my_str); // calling no_param::<str>
}
If you try to compile this example and look at the error, you will see ?Sized mentioned. The trick here is that even generic parameters without any written trait bounds have a hidden trait bound, T: Sized, where Sized means "This type's size is known at compile time". Rust has support for dynamically-sized types, but if you want to work with them directly, you need to opt out of this implicit trait bound with the T: ?Sized syntax. At the time of this writing, this syntax and behavior is unique to the Sized trait.
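A minimal sketch of opting out of the implicit bound is shown below. The function names sized_only and maybe_unsized are hypothetical. Note that the value is taken by reference, because a dynamically-sized value still has to live behind a pointer.

```rust
use std::fmt::Debug;

/// Implicitly `T: Sized`, so `T = str` is rejected here.
fn sized_only<T: Debug>(value: &T) -> String {
    format!("{value:?}")
}

/// `?Sized` lifts the implicit bound, so `T` may be `str` or `[i32]`.
fn maybe_unsized<T: Debug + ?Sized>(value: &T) -> String {
    format!("{value:?}")
}

fn main() {
    let text: &str = "Hello";
    let numbers: &[i32] = &[1, 2, 3];

    println!("{}", maybe_unsized(text)); // T = str
    println!("{}", maybe_unsized(numbers)); // T = [i32]
    println!("{}", sized_only(&text)); // T = &str, which is Sized
    // sized_only(text); // would not compile: T = str is not Sized
}
```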
The benefit of static dispatch is that it is a form of generics which utilizes monomorphization. This means that a separate copy of the function is generated for each combination of concrete types it is used with, and the generics themselves do not exist at runtime. This opens the door to other optimizations, since after monomorphization you only have ordinary static code. Static dispatch tends to be fast, but increases binary size.
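Conceptually, monomorphization can be pictured as the compiler writing out one non-generic function per concrete type in use. The sketch below spells this out by hand; the function names are hypothetical, and the real compiler uses its own mangled names rather than anything like size_in_bytes_u8.

```rust
use std::mem::size_of;

/// Generic over `T`; machine code is only generated for the types it is called with.
fn size_in_bytes<T>(_value: T) -> usize {
    size_of::<T>()
}

// Hand-written equivalents of what the two calls in `main` are compiled to.
fn size_in_bytes_u8(_value: u8) -> usize {
    size_of::<u8>()
}

fn size_in_bytes_u64(_value: u64) -> usize {
    size_of::<u64>()
}

fn main() {
    // Each generic call behaves exactly like its hand-written counterpart.
    assert_eq!(size_in_bytes(1u8), size_in_bytes_u8(1));
    assert_eq!(size_in_bytes(1u64), size_in_bytes_u64(1));
    println!(
        "u8: {} byte, u64: {} bytes",
        size_in_bytes(1u8),
        size_in_bytes(1u64)
    );
}
```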
Generic param bounds
Keep in mind that trait bounds can be added to generic params on types, generic params of traits and traits themselves. For traits, we call the traits specified in the bound supertraits. For example:
use std::path::Path;
use std::fs::File;
use std::io::Write; // <- to be able to use methods from a trait
                    // implementation, you have to import the trait;
                    // many traits in the standard lib are imported
                    // automatically
use std::fmt::Display;

// Display is the supertrait of Saveable
// Saveable can only be implemented on types which implement Display
// Trait ToString is implemented for every type T such that T: Display
trait Saveable: Display {
    // try to save the type implementing this to a file specified by path
    fn save<P>(&self, path: P) -> std::io::Result<()>
    where
        P: AsRef<Path>, // accept any type that we can infallibly convert to &Path
    {
        let mut file = File::create(path.as_ref())?;
        writeln!(file, "{}", self.to_string())?;
        Ok(())
    }
}
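To show the three places a bound can appear without touching the file system, here is a minimal sketch using a hypothetical Shout trait, Greeting type and Wrapper struct (all assumptions made for illustration). Shout has Display as its supertrait, Wrapper carries a bound on its type parameter, and render_all carries one on a function parameter.

```rust
use std::fmt::{self, Display};

/// Hypothetical trait with `Display` as its supertrait.
trait Shout: Display {
    /// Default method: may call `to_string()` because `Self: Display` is guaranteed.
    fn shout(&self) -> String {
        format!("{}!", self.to_string().to_uppercase())
    }
}

struct Greeting(&'static str);

impl Display for Greeting {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

// Allowed only because `Greeting: Display` holds.
impl Shout for Greeting {}

/// Bound on a generic parameter of a type.
struct Wrapper<T: Display> {
    inner: T,
}

impl<T: Display> Wrapper<T> {
    fn render(&self) -> String {
        format!("[{}]", self.inner)
    }
}

/// Bound on a generic parameter of a function.
fn render_all<T: Display>(items: &[T]) -> Vec<String> {
    items.iter().map(|item| item.to_string()).collect()
}

fn main() {
    println!("{}", Greeting("hello").shout());
    println!("{}", Wrapper { inner: 5 }.render());
    println!("{:?}", render_all(&[1.5, 2.5]));
}
```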
Dynamic dispatch
The other option is dynamic dispatch. Dynamic dispatch represents a model of generics you might be more familiar with from languages like C#, Java and so on. No monomorphization takes place; instead, data is passed as a pair of pointers: one to a virtual method table (also known as a dispatch table) and one to the data.
While in other languages this often happens completely behind the scenes, Rust requires you to represent it explicitly by passing your data behind a pointer of your choosing. In most cases, a simple borrowed reference is enough. Here is an alternative implementation of ducks_say():
trait Quack {
    fn quack(&self);
}

struct Duck;

// Duck implements Quack
// it has the trait method quack()
impl Quack for Duck {
    fn quack(&self) {
        println!("quack");
    }
}

struct Human;

// Human does not implement Quack
// it has a **type** method quack()
// but that is no substitute for the real
// art
impl Human {
    fn quack(&self) {
        println!("I quack, therefore I am");
    }
}

struct FormalDuck {
    name: String,
}

impl FormalDuck {
    // create a new duck
    fn new(name: String) -> Self {
        Self { name }
    }
}

impl Quack for FormalDuck {
    fn quack(&self) {
        println!(
            "Good evening, ladies and gentlemen, my name is {}. Without further ado: quack",
            self.name
        );
    }
}

// dynamically dispatching ducks_say()
fn ducks_say(quacker: &dyn Quack) {
    quacker.quack()
}

fn main() {
    let duck = Duck;
    let formal = FormalDuck::new("Ernesto".to_string());

    ducks_say(&duck);
    ducks_say(&formal);
}
When data is passed through dynamic dispatch, we call objects
of the type dyn Trait trait objects. Trait objects have
no known size, so they have to be behind a pointer.
The benefit of dynamic dispatch is that it makes for smaller binaries, and is, well, more dynamic. Since the actual information of the type is lost, you can re-assign a trait object variable to a trait object made from a different type, or you can use trait objects to model heterogeneous collections.
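Both uses can be sketched with plain references. In this variation, quack() returns a String instead of printing so the result is easy to inspect, and RubberDuck and chorus are hypothetical names introduced only for the example.

```rust
trait Quack {
    // Returns the sound instead of printing it, to keep the sketch easy to check.
    fn quack(&self) -> String;
}

struct Duck;
struct RubberDuck; // hypothetical second implementor

impl Quack for Duck {
    fn quack(&self) -> String {
        "quack".to_string()
    }
}

impl Quack for RubberDuck {
    fn quack(&self) -> String {
        "squeak".to_string()
    }
}

/// Accepts a slice whose elements may have different concrete types.
fn chorus(quackers: &[&dyn Quack]) -> Vec<String> {
    quackers.iter().map(|q| q.quack()).collect()
}

fn main() {
    let duck = Duck;
    let rubber = RubberDuck;

    // One variable holding trait objects made from two different types over time.
    let mut current: &dyn Quack = &duck;
    println!("{}", current.quack());
    current = &rubber;
    println!("{}", current.quack());

    // Heterogeneous collection: Duck and RubberDuck side by side.
    let flock: Vec<&dyn Quack> = vec![&duck, &rubber, &duck];
    println!("{:?}", chorus(&flock));
}
```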
TIP: If you ever need to store a trait object somewhere, consider using a smart pointer such as Box (plain heap-stored pointer) or Rc (reference-counted heap-stored single-threaded pointer).
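A minimal sketch of storing trait objects behind smart pointers follows. The Pond struct and the String-returning quack() are assumptions made for illustration, not part of the chapter's own examples.

```rust
use std::rc::Rc;

trait Quack {
    fn quack(&self) -> String;
}

struct Duck;

impl Quack for Duck {
    fn quack(&self) -> String {
        "quack".to_string()
    }
}

/// Hypothetical owner of exactly one trait object, stored on the heap.
struct Pond {
    resident: Box<dyn Quack>,
}

fn main() {
    // Box: single owner, the trait object is freed together with the Pond.
    let pond = Pond {
        resident: Box::new(Duck),
    };
    println!("{}", pond.resident.quack());

    // Rc: several handles share one trait object within a single thread.
    let shared: Rc<dyn Quack> = Rc::new(Duck);
    let other_handle = Rc::clone(&shared);
    println!("{} {}", shared.quack(), other_handle.quack());
    println!("handles: {}", Rc::strong_count(&shared));
}
```

Box is the simpler default. Rc is worth reaching for only when more than one owner genuinely needs the same object.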
We have already seen an example of polymorphism in Rust. On a more theoretical level, instead of subclasses and inheritance, Rust uses generics over types with certain trait bounds. This model is called bounded parametric polymorphism.
To learn more about OOP and traits in Rust, check out the following links:
- https://doc.rust-lang.org/book/ch17-01-what-is-oo.html
- https://web.mit.edu/rust-lang_v1.25/arch/amd64_ubuntu1404/share/doc/rust/html/book/second-edition/ch19-03-advanced-traits.html
- https://blog.logrocket.com/rust-traits-a-deep-dive/
The Task: Sequence generators
You are likely to meet many numeric sequences. Common ones you might encounter include the powers of two, the Fibonacci series, factorials, or, for example, the triangular numbers.
For this project, it will be your task to implement a trait called Sequence with the following methods:
- n(), returning the number of the element in the sequence, zero-indexed (i.e. for the first element return 0, for the second return 1, for the third return 2)
- generate() -> Option<u128>, returning the next element in the sequence, advancing it. It should return None if generating the next element would overflow u128
- reset(), resetting the sequence
Consider which ones of these should borrow self mutably and which immutably.
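For the overflow requirement, the standard library's checked arithmetic methods on integers return an Option, which fits the shape of generate(). The sketch below only demonstrates those methods on two hypothetical helper functions and is deliberately not a solution to the task.

```rust
/// Hypothetical helper: doubles a value, or returns `None` on `u128` overflow.
fn checked_double(value: u128) -> Option<u128> {
    value.checked_mul(2)
}

/// Hypothetical helper: adds two values, or returns `None` on `u128` overflow.
fn checked_sum(a: u128, b: u128) -> Option<u128> {
    a.checked_add(b)
}

fn main() {
    println!("{:?}", checked_double(21));
    println!("{:?}", checked_double(u128::MAX));
    println!("{:?}", checked_sum(u128::MAX - 1, 1));
    println!("{:?}", checked_sum(u128::MAX, 1));
}
```

Unlike the plain operators, which panic on overflow in debug builds and wrap in release builds by default, the checked variants behave the same way in both.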
Next, create types and implement Sequence for the following sequences:
- Fibonacci
- factorial
- powers of two
- triangular numbers
Also add a new() method to each of these types which properly instantiates the structure.
When you are done, it should be possible to put an instance of any of these types as parameter into the following function:
fn first_five<S>(s: &mut S) -> (u128, u128, u128, u128, u128)
where
    S: Sequence,
{
    if let (Some(first), Some(second), Some(third), Some(fourth), Some(fifth)) = (
        s.generate(),
        s.generate(),
        s.generate(),
        s.generate(),
        s.generate(),
    ) {
        (first, second, third, fourth, fifth)
    } else {
        panic!("All of the sequences should be able to produce five elements")
    }
}
Use this function to construct a test verifying your sequences against the correct outputs:
- Fibonacci: (0, 1, 1, 2, 3)
- factorial: (1, 2, 6, 24, 120)
- powers of two: (1, 2, 4, 8, 16)
- triangular numbers: (1, 3, 6, 10, 15)
End product
In the end you should be left with a well-prepared project that has the following:
- documented code explaining your reasoning where it isn't self-evident
- optionally tests
- and an example or two where applicable
- clean git history that does not contain fix-ups, merge commits or malformed/misformatted commits
Your Rust code should be formatted by rustfmt / cargo fmt and should produce no
warnings when built. It should also work on stable Rust and follow the Braiins Standard