2: Rust Basics: Syntax and Variables, Compiling programs with Cargo
Now that we have a working Rust installation, it is finally time to dig in and get familiar with Rust as a language. In this lesson, we will gain an overview of Rust's syntax, the usage of variables, and how to work with Cargo to compile our programs.
You may be wondering why we dedicate an entire lesson to this. The reason is simple: variables are not as trivial in Rust as in many other languages, and they touch upon topics such as pattern matching, shadowing, and ownership, so there are plenty of topics we need to introduce, at least in the briefest of terms.
Without further ado, let's get into Rust syntax. There is no way to really sugarcoat it, so let's just go concept by concept.
Comments
Comments in Rust start with //. Multi-line comments can be written between /* and */.
Multi-line comments can be nested freely, so you don't have to worry about existing comments
if you want to disable a part of code that may already have comments.
fn main() {
    // This is a single-line comment

    /*
    This is a multi-line comment
    */
}
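The nesting of multi-line comments mentioned above can be sketched as follows. The `add` function is a hypothetical example, not part of the lesson; the point is that the outer `/* ... */` disables a region that already contains an inner block comment.

```rust
// Hypothetical helper used only to show nested block comments.
fn add(a: i32, b: i32) -> i32 {
    /* Temporarily disabled code:
       let scaled = a * 2; /* an older, inner comment */
       return scaled + b;
    */
    a + b
}

fn main() {
    println!("{}", add(2, 3)); // prints "5"
}
```

In languages where block comments do not nest, the inner `*/` would end the outer comment early; in Rust the whole region stays commented out.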
Doc Comments
Rust supports documentation comments that are used to generate external documentation.
Doc comments start with either /// (three slashes) or //! (two slashes and an exclamation mark). Doc comments are formatted in Markdown
and may contain images, links and code examples.
//! This is a module-level doc comment. It has to come before any items in the module.

/// This is a doc comment for the following struct.
struct MyStruct {
    // ... we will look at structs in Lesson 4
}
The general rule is that the /// doc-comment documents the item right underneath it,
whereas the //! documents the item it is contained in. When documenting modules
declared by file structure, the //! doc comment is the only practical option.
Literals
Literals represent fixed, immutable values, just like in any language. In Rust, you have numeric literals, string literals, character literals, and so on.
fn main() {
    let integer = 10;
    let floating_point = 3.14;
    let character = 'a';
    let string = "Hello, Rust!";
    let boolean = true;
}
For numeric types, you can specify the type right at the literal:
fn main() { let integer = 10u32; }
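Numbers always have a specific type, which is either the default (i32 for integers, f64 for floating-point numbers), the one given by a suffix, or the one inferred from context.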
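As a rough illustration of literal suffixes and defaults, the sketch below measures the size of each literal's type with `std::mem::size_of_val`. The helper name `literal_sizes` is made up for this example.

```rust
use std::mem::size_of_val;

// Hypothetical helper: returns the size in bytes of each literal's type.
fn literal_sizes() -> [usize; 6] {
    let default_int = 10;    // no suffix and no other hint: i32
    let small = 10u8;        // the suffix selects u8
    let big = 1_000_000i64;  // underscores are only visual separators
    let hex = 0xffu16;       // hexadecimal literal with a suffix
    let default_float = 2.5; // no suffix: f64
    let single = 2.5f32;     // the suffix selects f32
    [
        size_of_val(&default_int),
        size_of_val(&small),
        size_of_val(&big),
        size_of_val(&hex),
        size_of_val(&default_float),
        size_of_val(&single),
    ]
}

fn main() {
    println!("{:?}", literal_sizes()); // prints "[4, 1, 8, 2, 8, 4]"
}
```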
Variable Bindings
Variables in Rust are immutable by default, and you declare them with the let keyword.
fn main() {
    let x = 5; // immutable variable
    let mut y = 10; // mutable variable
    y = 15; // re-assigning mutable variable
}
The type can be specified by including a : Type right after the binding identifier:
fn main() { let x: i32 = 5; println!("x is: {}", x); }
We explicitly set the type of the x variable to i32, which is also the default integer type
if no other type can be inferred.
NOTE: Rust is statically typed, meaning that types are always concrete and never change while the program runs. If you do not specify a type explicitly, it is deduced from the context; a variable binding is never untyped.
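A minimal sketch of type deduction from context, assuming two hypothetical helpers, `bump` and `inferred_sizes`: neither binding carries an annotation, yet each ends up with a concrete type.

```rust
use std::mem::size_of_val;

// Hypothetical function that only accepts u64.
fn bump(n: u64) -> u64 {
    n + 1
}

// Hypothetical helper returning the byte sizes of two inferred bindings.
fn inferred_sizes() -> (usize, usize) {
    let a = 41; // no annotation, but passed to `bump` below, so `a` is u64
    let b = 41; // nothing constrains `b`, so it falls back to the default i32
    let _next = bump(a);
    (size_of_val(&a), size_of_val(&b))
}

fn main() {
    println!("{:?}", inferred_sizes()); // prints "(8, 4)"
}
```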
Etymology
Note that in Rust, what are commonly known as variables in many other languages are referred to as "variable bindings". They are called bindings because they bind a name to a value, essentially tying the name to the value, so we can use the name to refer to the value later in the program.
Shadowing
Variable shadowing is another interesting feature of Rust’s variable bindings. Shadowing occurs when we declare a new variable with the same name as a previous variable. The new variable "shadows" the name of the previous variable, meaning the original is no longer accessible and any future use of the variable will refer to the new one.
Here’s an example of shadowing:
fn main() {
    let x = 5;
    println!("x is: {}", x); // prints "x is: 5"

    let x = "Rust";
    println!("x is: {}", x); // prints "x is: Rust"
}
In this example, the first let x binds x to the value 5. The second let x shadows
the first, binding x to the value "Rust".
Shadowing is particularly useful when you want to change the type of a variable or bind a new value to the same name while keeping the binding immutable. Here's an example where shadowing is used to 'change' the type of a variable:
fn main() {
    let x = "5";
    println!("x is: {}", x); // prints "x is: 5"

    let x: i32 = x.parse().expect("Not a number");
    println!("x is: {}", x); // prints "x is: 5"
}
In this case, the binding x is initially a string. Then, it is shadowed by a new x
that is a result of parsing the original string x into an i32. It is a common idiom
to repeat the shadowing of the same variable name as you build from inputs to final
values (parsing is one such case).
Shadowing allows you to reuse variable names, which can lead to more concise and readable code, but it can also introduce bugs if used carelessly, as the original variable becomes inaccessible. On the other hand, you can use shadowing to impose further restrictions that help keep your code bug-free, for example by shadowing a mutable variable as immutable once you know you will no longer need to mutate it.
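Shadowing a mutable binding as immutable might look like the sketch below. The function `total_with_tax` and its 10 % rate are invented purely for illustration.

```rust
// Hypothetical helper: adds 10 % to a price, then freezes the result.
fn total_with_tax(price: u32) -> u32 {
    let mut total = price;
    total += price / 10; // mutation is allowed while the binding is `mut`

    let total = total; // shadow as immutable from here on
    // total += 1;     // uncommenting this line is a compile-time error
    total
}

fn main() {
    println!("{}", total_with_tax(200)); // prints "220"
}
```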
For starters, we can take the primitive types. Here is a handy table of the numeric ones available in Rust and their C equivalents:
| Rust Type | Numeric Type | Size (bytes) | Corresponding C Type |
|---|---|---|---|
| i8 | Signed integer | 1 | int8_t |
| u8 | Unsigned integer | 1 | uint8_t |
| i16 | Signed integer | 2 | int16_t |
| u16 | Unsigned integer | 2 | uint16_t |
| i32 | Signed integer | 4 | int32_t |
| u32 | Unsigned integer | 4 | uint32_t |
| i64 | Signed integer | 8 | int64_t |
| u64 | Unsigned integer | 8 | uint64_t |
| i128 | Signed integer | 16 | __int128_t (GCC) |
| u128 | Unsigned integer | 16 | __uint128_t (GCC) |
| isize | Signed integer | Dependent on the architecture | intptr_t |
| usize | Unsigned integer | Dependent on the architecture | uintptr_t |
| f32 | Floating point | 4 | float |
| f64 | Floating point | 8 | double |
The size column indicates the size of each type in bytes. isize and usize are architecture-dependent,
and the i128 and u128 types may have equivalent C types depending on the compiler used,
but you can't find them in the C standard. The f32 and f64 represent floating-point numbers in Rust,
corresponding to float and double in C, respectively.
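The sizes in the table can be checked with `std::mem::size_of`. This is only a quick sketch; `fixed_sizes` is a hypothetical helper name.

```rust
use std::mem::size_of;

// Hypothetical helper: sizes of i8, i16, i32, i64, i128, f32 and f64 in bytes.
fn fixed_sizes() -> [usize; 7] {
    [
        size_of::<i8>(),
        size_of::<i16>(),
        size_of::<i32>(),
        size_of::<i64>(),
        size_of::<i128>(),
        size_of::<f32>(),
        size_of::<f64>(),
    ]
}

fn main() {
    println!("{:?}", fixed_sizes()); // prints "[1, 2, 4, 8, 16, 4, 8]"

    // isize and usize are as wide as a pointer on the target platform.
    println!("usize: {} bytes", size_of::<usize>());
    assert_eq!(size_of::<usize>(), size_of::<*const u8>());
}
```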
Rust has other primitive types as well. These are the simple ones:
| Rust Type | Description |
|---|---|
| bool | A boolean type representing the values true or false. |
| char | A character type representing a single Unicode character, like 'a'. |
| str | A string slice type, typically used as &str, representing a reference to a UTF-8 encoded string slice. |
| unit | The unit type () representing an empty tuple, often used to signify that a function doesn’t return any meaningful value. |
And then we have three types related to collections:
| Rust Type | Description |
|---|---|
| tuple | A collection of values with different types. The size is fixed at compile-time. For example, (i32, f64, &str). |
| array | A collection of values with the same type. The size is fixed at compile-time. For example, [i32; 5]. |
| slice | A dynamically-sized view into a contiguous sequence, [T]. It is more commonly used as a reference, &[T], representing a view into an array or another slice. |
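To make the three collection-related types a little more concrete, here is a small sketch. The `sum` function is a hypothetical helper that accepts any slice of i32, whether it views a whole array or only part of one.

```rust
// Hypothetical helper: works on any &[i32], regardless of the array length.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    let triple: (i32, f64, &str) = (1, 2.5, "three"); // tuple: mixed types
    let array: [i32; 5] = [1, 2, 3, 4, 5];            // array: one type, fixed length
    let middle: &[i32] = &array[1..4];                // slice: view of indices 1, 2 and 3

    println!("{}", triple.2);     // prints "three"
    println!("{}", sum(&array));  // prints "15"
    println!("{}", sum(middle));  // prints "9"
    println!("{}", middle.len()); // prints "3"
}
```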
Blocks
A block in Rust is a group of statements enclosed within curly braces {}. It can be used to group statements together.
fn main() {
    {
        let x = 10;
        println!("x inside block: {}", x);
    }
}
Much like many things in Rust, a block is an expression and you can use it to produce a value to assign to a variable binding:
fn main() {
    let x = 5;
    let y = {
        let temp = x * 2;
        temp + 1
    };
    println!("y is: {}", y); // prints "y is: 11"
}
In this example, y is assigned the value of the block, which is temp + 1, resulting in 11.
Additionally, Rust allows you to name a block and use the break keyword to exit the block early
and specify the value it should result in. Here’s an example:
fn main() {
    let x = 5;
    let y = 'block: {
        if x < 10 {
            break 'block x * 2;
        }
        x + 1
    };
    println!("y is: {}", y); // prints "y is: 10"
}
These features make Rust's block expressions a powerful tool for structuring your code. You can also use them to avoid polluting the enclosing scope with temporary variables that would otherwise end up with similar names. This feature is also great for macros, which often generate block expressions.
Statements
A statement performs an action. In Rust, each statement ends with a semicolon ;.
fn main() {
    let x = 5; // statement
    println!("x is: {}", x); // statement
}
If you omit the semicolon after the final expression in a block, a control flow expression,
or a function body, its value becomes the value of that block. You can put
any expression there, but keep in mind that many of them (an assignment or a println! call, for example)
just evaluate to Rust's equivalent of void, the empty tuple (), often referred to as unit.
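A short sketch of this rule, using two hypothetical functions, `double` and `abs_value`: the final expression without a semicolon becomes the value of the function body or of the `if` branch.

```rust
// The body ends with an expression and no semicolon, so that is the return value.
fn double(x: i32) -> i32 {
    x * 2
}

// `if` is an expression too: each branch ends without a semicolon.
fn abs_value(x: i32) -> i32 {
    if x < 0 { -x } else { x }
}

fn main() {
    println!("{}", double(21));    // prints "42"
    println!("{}", abs_value(-7)); // prints "7"

    // With a trailing semicolon the block evaluates to unit instead.
    let nothing = { double(1); };
    println!("{:?}", nothing);     // prints "()"
}
```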
For example, in both Rust and C, assignment is an expression, meaning it evaluates to a value. However, there is a key difference between the two languages in how assignment expressions are handled.
In C, an assignment expression evaluates to the value that was assigned, making it useful in certain scenarios, like conditional statements or within other expressions:
#include <stdio.h>

int main() {
    int x;
    if ((x = 10)) {
        printf("x is: %d\n", x); // prints "x is: 10"
    }
    return 0;
}
In contrast, Rust’s assignment expression always evaluates to the aforementioned unit type ().
This means that you can’t use the value of the assignment in the same way you might in C:
fn main() {
    let x;
    if (x = 10) { // This will result in a compile-time error!
        println!("x is: {}", x);
    }
}
This Rust code will not compile because the expression x = 10 evaluates to (),
and if expects a boolean expression. The unit type () doesn’t carry any meaningful information,
and as such, using assignment as an expression in Rust is not very useful.
In Rust, if you need to assign a value and use it within a condition, you need to separate the assignment and the condition:
fn main() {
    let x;
    x = 10;
    if x == 10 {
        println!("x is: {}", x); // prints "x is: 10"
    }
}
This design choice in Rust encourages more explicit and clear code, reducing the chance of subtle bugs introduced by assignments inside expressions. There is a further reason in that having the same behavior as C could clash with Rust's memory management model, but we will get back to that later.
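To see that an assignment really evaluates to unit, one can bind its value explicitly. This is only an illustrative sketch; `assignment_demo` is a hypothetical helper and the pattern is not something you would normally write.

```rust
// Hypothetical helper: returns the old value, the new value, and the value
// of the assignment expression itself.
fn assignment_demo() -> (i32, i32, ()) {
    let mut x = 5;
    let before = x;

    // The block's final expression is the assignment, so the block has type ().
    let unit_value: () = { x = 10 };

    (before, x, unit_value)
}

fn main() {
    println!("{:?}", assignment_demo()); // prints "(5, 10, ())"
}
```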
Tuple Declarations
A tuple is a fixed-size, ordered list of elements, possibly of different types.
fn main() {
    let tuple = (1, 2.0, "Rust");
    let (integer, floating_point, string) = tuple; // destructuring a tuple
}
Destructuring a tuple (which in this case creates three separate independent bindings -
integer, floating_point and string) is often the most useful way to deal with a
tuple.
If you want to preserve the tuple and access its elements as elements of a tuple, you would use the dot syntax with an index:
fn main() {
    let tuple = (1, "hello", 4.5);
    let (x, y, z) = tuple;
    println!("x: {}, y: {}, z: {}", x, y, z);

    // Accessing elements of a tuple
    let first_element = tuple.0;
    let second_element = tuple.1;
    let third_element = tuple.2;

    println!("First element: {}", first_element); // prints "First element: 1"
    println!("Second element: {}", second_element); // prints "Second element: hello"
    println!("Third element: {}", third_element); // prints "Third element: 4.5"
}
In Rust, you can access the elements of a tuple using a dot followed by the index of the value you
want to access, starting from 0. So, tuple.0 refers to the first element, tuple.1 to the second, and so on.
This way, you can either destructure the tuple to access its elements, as seen with (x, y, z),
or you can use indexing with a dot notation to access individual elements directly.
Array Declarations
An array is a collection of objects of the same type, stored in contiguous memory locations. The length of the array is fixed, and must be known at compile time.
fn main() {
    let array = [1, 2, 3, 4, 5]; // type is [i32; 5]
    let first = array[0]; // accessing array elements
}
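A few more array operations, sketched under the assumption that you want to avoid a panic on an out-of-range index. The helper `element_or` is hypothetical; `get`, `copied` and `unwrap_or` are standard library methods.

```rust
// Hypothetical helper: returns the element at `index`, or `fallback` if the
// index is out of bounds (plain `array[index]` would panic in that case).
fn element_or(array: &[i32; 5], index: usize, fallback: i32) -> i32 {
    array.get(index).copied().unwrap_or(fallback)
}

fn main() {
    let zeros = [0u8; 4]; // repeat syntax: four elements, all 0
    let array = [1, 2, 3, 4, 5];

    println!("{:?}", zeros);                    // prints "[0, 0, 0, 0]"
    println!("{}", array.len());                // prints "5"
    println!("{}", element_or(&array, 0, -1));  // prints "1"
    println!("{}", element_or(&array, 10, -1)); // prints "-1"
}
```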
These syntax elements, presented from simplest to more complex, provide a good foundational understanding for starting with Rust.
Rust's Standard Library
The Rust Standard Library is the foundation of portable Rust software, a set of minimal and battle-tested shared abstractions. It offers core types, like Vec<T> and Option<T>, library-defined operations on language primitives, standard macros, I/O and multithreading, among many other features.
Finding Documentation
- Locally: If you have Rust installed via rustup, you can access the local documentation with the following command: rustup doc --std. This will open up the documentation in your default web browser.
- Remotely: The official Rust documentation, including the Standard Library, can be found online at:
Essential Modules in Rust's Standard Library
1. std::io
Handles input and output functionality. Commonly used for reading from and writing to files, stdin, and stdout.
use std::io;
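As a sketch of reading from stdin, the example below reads a single line. The helper `read_trimmed_line` is hypothetical and takes any `BufRead` so it can be tried with in-memory input too; it uses generics and the `?` operator, which have not been covered yet.

```rust
use std::io::{self, BufRead};

// Hypothetical helper: reads one line and strips trailing whitespace,
// including the newline.
fn read_trimmed_line<R: BufRead>(mut reader: R) -> io::Result<String> {
    let mut line = String::new();
    reader.read_line(&mut line)?;
    Ok(line.trim_end().to_string())
}

fn main() -> io::Result<()> {
    let stdin = io::stdin();
    let line = read_trimmed_line(stdin.lock())?;
    println!("You typed: {}", line);
    Ok(())
}
```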
2. std::fmt
Formatting and printing. Contains traits that dictate display and debug print behaviors.
use std::fmt;
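The display and debug behaviors can be sketched with a hypothetical `Point` struct (structs and traits are covered in later lessons): `Debug` is derived automatically, while `Display` is written by hand.

```rust
use std::fmt;

// Hypothetical type used only for this illustration.
#[derive(Debug)]
struct Point {
    x: i32,
    y: i32,
}

// Display controls what `{}` prints for this type.
impl fmt::Display for Point {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "({}, {})", self.x, self.y)
    }
}

fn main() {
    let p = Point { x: 1, y: -2 };
    println!("{}", p);   // prints "(1, -2)"
    println!("{:?}", p); // prints "Point { x: 1, y: -2 }"
}
```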
3. std::fs
Filesystem operations. Used for reading and writing files, directory manipulation, and more.
use std::fs;
4. std::collections
A module that provides various data structures like HashMap, HashSet, VecDeque, etc.
use std::collections::HashMap;
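A typical first use of HashMap is counting things. The sketch below counts words in a text; `count_words` is a hypothetical helper, and `entry(...).or_insert(0)` is the standard way to insert a starting value when the key is missing.

```rust
use std::collections::HashMap;

// Hypothetical helper: maps each word to the number of times it occurs.
fn count_words(text: &str) -> HashMap<&str, usize> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        // Insert 0 for a new word, then increment the stored count.
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let counts = count_words("the quick fox jumps over the lazy dog");
    println!("{:?}", counts.get("the"));  // prints "Some(2)"
    println!("{:?}", counts.get("cat"));  // prints "None"
}
```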
5. std::error
Error handling utilities. Provides the Error trait, which can be used to define custom error types.
use std::error::Error;
6. std::thread
Multithreading and concurrency. Enables the creation and management of threads.
use std::thread;
7. std::time
Time operations, like measuring durations or obtaining the current time.
use std::time::{Duration, Instant};
8. std::net
Networking operations, including TCP and UDP primitives.
use std::net::TcpListener;
9. std::option and std::result
Enums representing optional values (Option<T>) and potential errors (Result<T, E>). They are fundamental to Rust's error handling and control flow.
use std::option::Option;
use std::result::Result;
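Both enums show up constantly in everyday code. In the sketch below, `parse_age` and `first_char` are hypothetical helpers: parsing returns a Result because it can fail, and taking the first character returns an Option because the string may be empty.

```rust
use std::num::ParseIntError;

// Hypothetical helper: parsing may fail, so the outcome is a Result.
fn parse_age(input: &str) -> Result<u8, ParseIntError> {
    input.trim().parse::<u8>()
}

// Hypothetical helper: an empty string has no first character, so this is an Option.
fn first_char(input: &str) -> Option<char> {
    input.chars().next()
}

fn main() {
    println!("{:?}", parse_age(" 42 "));         // prints "Ok(42)"
    println!("{}", parse_age("abc").is_err());   // prints "true"
    println!("{:?}", first_char("Rust"));        // prints "Some('R')"
    println!("{:?}", first_char(""));            // prints "None"
}
```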
10. std::str and std::string
String and string slice types and associated functions.
use std::str;
use std::string::String;
Rust's String Type
While we are already touching upon the topic of strings, we should actually introduce them.
Rust has many String types, but for us, only two are important - the str primitive type and
the String standard library type.
Unlike the str (aka string slice), a String is growable and allows modification. It's UTF-8 encoded,
ensuring any valid String will be properly encoded Unicode data.
Creating a String
- From a Literal: Use to_string() to create a String from a string literal.

  let my_string = "Hello, world!".to_string();

- From a String Slice: You can also create it directly from a string slice (str) using the from function.

  let my_string = String::from("Hello, world!");
Manipulating a String
- Appending: You can append to a String using push_str or push.

  let mut hello = String::from("Hello, ");
  hello.push_str("world!"); // Append a str
  hello.push('!'); // Append a char

- Concatenation: String can be concatenated using the + operator or the format! macro.

  let hello = String::from("Hello, ");
  let world = "world!";
  let hello_world = hello + world;

  Note: When using +, the left operand gets moved and cannot be used again.

- Indexing: String does not support indexing directly because it’s encoded in UTF-8, which does not have constant-time indexing.
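The list above mentions format! and the lack of direct indexing without showing either. A minimal sketch, with hypothetical helpers `greet` and `nth_char`: format! only borrows its arguments, and `chars().nth(n)` is one way to reach the n-th character, at the cost of walking the string from the start.

```rust
// Hypothetical helper: builds a greeting without moving `hello`.
fn greet(name: &str) -> String {
    let hello = String::from("Hello, ");
    // format! borrows its arguments, so `hello` stays usable afterwards.
    let greeting = format!("{}{}!", hello, name);
    debug_assert_eq!(hello, "Hello, ");
    greeting
}

// Hypothetical helper: the n-th Unicode scalar value, if there is one.
fn nth_char(text: &str, n: usize) -> Option<char> {
    text.chars().nth(n)
}

fn main() {
    println!("{}", greet("world"));          // prints "Hello, world!"
    println!("{:?}", nth_char("héllo", 1));  // prints "Some('é')"
    println!("{:?}", nth_char("abc", 5));    // prints "None"
}
```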
Converting Between String and str
- You can create a string slice by referencing a String.

  let string_slice: &str = &my_string;
Unicode and UTF-8 Encoding
String holds UTF-8 bytes and ensures the data is valid UTF-8, enabling the representation of a wide range of characters from various languages and symbols.
Accessing Bytes and Characters
- To iterate over Unicode scalar values (char), use chars():

  for c in my_string.chars() {
      println!("{}", c);
  }

- To iterate over bytes, use bytes():

  for b in my_string.bytes() {
      println!("{}", b);
  }
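The difference between the two iterators becomes visible as soon as the text contains a non-ASCII character. In this sketch, `byte_and_char_counts` is a hypothetical helper; note that 'é' takes two bytes in UTF-8.

```rust
// Hypothetical helper: (length in bytes, number of Unicode scalar values).
fn byte_and_char_counts(text: &str) -> (usize, usize) {
    (text.len(), text.chars().count())
}

fn main() {
    println!("{:?}", byte_and_char_counts("Rust"));  // prints "(4, 4)"
    println!("{:?}", byte_and_char_counts("héllo")); // prints "(6, 5)"
}
```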
Memory and Allocation
- String is allocated on the heap, and it can dynamically grow or shrink as needed.
- Memory is automatically reclaimed when a String goes out of scope, thanks to Rust’s ownership system and the Drop trait.
Useful Methods
- len(): Get the length in bytes.
- is_empty(): Check if the String is empty.
- split_whitespace(): Iterator over words.
- replace(from, to): Replace a substring.
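The four methods can be tried together in one small sketch. The helper `summarize` and the sample text are made up for illustration.

```rust
// Hypothetical helper: (byte length, is empty, word count, text with "fun" replaced).
fn summarize(text: &str) -> (usize, bool, usize, String) {
    (
        text.len(),
        text.is_empty(),
        text.split_whitespace().count(),
        text.replace("fun", "great"),
    )
}

fn main() {
    let (len, empty, words, replaced) = summarize("Rust is fun");
    println!("{}", len);      // prints "11"
    println!("{}", empty);    // prints "false"
    println!("{}", words);    // prints "3"
    println!("{}", replaced); // prints "Rust is great"
}
```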
Where to Find More Information
You can find more details in the Rust documentation:
Homework
For the lesson "Rust Basics: Syntax and Variables", we're building upon the foundational concepts you've learned and applying them to a practical task.
Your assignment is to write a program that reads from standard input, transmutes text according to the provided specification, and prints the result back to the user. The behavior of the program should be modified based on parsed CLI arguments.
Description:
In this (still very simple) exercise, you'll be using Rust's string manipulation capabilities. Here's what you need to do:
- Setting up the Crate:
  - Add the slug crate to your Cargo project to help with the slugify feature. To do this, open your Cargo.toml file and under the [dependencies] section, add: slug = "latest_version". (Replace "latest_version" with the most recent version number from crates.io, which is 0.1.4)
  - Once added, you can use it in your project by adding use slug::slugify; at the top of your main Rust file. View the crate's documentation to see how to use it: https://docs.rs/slug/0.1.4/slug/
- Read Input:
  - Read a string from the standard input.
- Parse CLI Arguments:
  - Based on the provided CLI argument, the program should transform the text differently. Use the std::env::args() function to collect CLI arguments:
use std::env;

fn main() {
    let args: Vec<String> = env::args().collect();
    println!("{}", args[0]);
}
Note that the .len() and .is_empty() methods are available on Vec<String> to help you figure out
whether you received the necessary parameters.
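One possible way to check the arguments is sketched below; it covers only the argument check, not the transformations themselves. The helper `first_argument` and the messages are hypothetical, and the first element of `args` is assumed to be the program name, as is usual.

```rust
use std::env;

// Hypothetical helper: returns the first user-supplied argument, if any.
// args[0] is conventionally the program name, so we need at least two elements.
fn first_argument(args: &[String]) -> Option<&str> {
    if args.len() < 2 {
        None
    } else {
        Some(args[1].as_str())
    }
}

fn main() {
    let args: Vec<String> = env::args().collect();
    match first_argument(&args) {
        Some(mode) => println!("Requested transformation: {}", mode),
        None => eprintln!("Please pass a transformation name as the first argument"),
    }
}
```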
- Transmute Text:
  - If the argument is lowercase, convert the entire text to lowercase.
  - If the argument is uppercase, convert the entire text to uppercase.
  - If the argument is no-spaces, remove all spaces from the text.
  - If the argument is slugify, convert the text into a slug (a version of the text suitable for URLs) using the slug crate.
For one bonus point, try making two additional transformations of your own.
- Print Result:
  - Print the transmuted text back to the user.
Hint: For string manipulations, the Rust standard library provides handy methods like:
- to_lowercase()
- to_uppercase()
- replace(" ", "")
Submission:
- After implementing and testing your program, commit your changes and push the updated code to the GitHub repository you created for the previous homework.
- Submit the link to your updated GitHub repository on our class submission platform, ensuring your repository remains public for access and review.
Deadline:
- Please complete and submit this assignment by Tuesday, October 16. Attach the link to the GitHub repository to the Google Classroom assignment again.
By the end of this exercise, you will have tried out string manipulation, using external crates, and managing your Rust projects with Cargo. All of these will be super important in the future. Should you face any hurdles or have questions, don't hesitate to ask or consult the Rust documentation.
Forge ahead, and happy coding!