diff --git a/src/doc/guide.md b/src/doc/guide.md index c7b8e42b28cd..2a74eba0c332 100644 --- a/src/doc/guide.md +++ b/src/doc/guide.md @@ -3637,40 +3637,72 @@ pub fn as_maybe_owned(&self) -> MaybeOwned<'a> { ... } ## Boxes -All of our references so far have been to variables we've created on the stack. -In Rust, the simplest way to allocate heap variables is using a *box*. To -create a box, use the `box` keyword: +Most of the types we've seen so far have a fixed size or number of components. +The compiler needs this fact to lay out values in memory. However, some data +structures, such as a linked list, do not have a fixed size. You might think to +implement a linked list with an enum that's either a `Node` or the end of the +list (`Nil`), like this: -```{rust} -let x = box 5i; +```{rust,ignore} +enum List { // error: illegal recursive enum type + Node(u32, List), + Nil +} ``` -This allocates an integer `5` on the heap, and creates a binding `x` that -refers to it. The great thing about boxed pointers is that we don't have to -manually free this allocation! If we write +But the compiler complains that the type is recursive, that is, it could be +arbitrarily large. To remedy this, Rust provides a fixed-size container called +a **box** that can hold any type. You can box up any value with the `box` +keyword. Our boxed List gets the type `Box` (more on the notation when we +get to generics): + +```{rust} +enum List { + Node(u32, Box), + Nil +} + +fn main() { + let list = Node(0, box Node(1, box Nil)); +} +``` + +A box dynamically allocates memory to hold its contents. The great thing about +Rust is that that memory is *automatically*, *efficiently*, and *predictably* +deallocated when you're done with the box. + +A box is a pointer type, and you access what's inside using the `*` operator, +just like regular references. This (rather silly) example dynamically allocates +an integer `5` and makes `x` a pointer to it: ```{rust} { let x = box 5i; - // do stuff + println!("{}", *x); // Prints 5 } ``` -then Rust will automatically free `x` at the end of the block. This isn't -because Rust has a garbage collector -- it doesn't. Instead, when `x` goes out -of scope, Rust `free`s `x`. This Rust code will do the same thing as the -following C code: +The great thing about boxes is that we don't have to manually free this +allocation! Instead, when `x` reaches the end of its lifetime -- in this case, +when it goes out of scope at the end of the block -- Rust `free`s `x`. This +isn't because Rust has a garbage collector (it doesn't). Instead, by tracking +the ownership and lifetime of a variable (with a little help from you, the +programmer), the compiler knows precisely when it is no longer used. + +The Rust code above will do the same thing as the following C code: ```{c,ignore} { int *x = (int *)malloc(sizeof(int)); - // do stuff + if (!x) abort(); + *x = 5; + printf("%d\n", *x); free(x); } ``` -This means we get the benefits of manual memory management, but the compiler -ensures that we don't do something wrong. We can't forget to `free` our memory. +We get the benefits of manual memory management, while ensuring we don't +introduce any bugs. We can't forget to `free` our memory. Boxes are the sole owner of their contents, so you cannot take a mutable reference to them and then use the original box: @@ -3706,48 +3738,50 @@ let mut x = box 5i; *x; ``` +Boxes are simple and efficient pointers to dynamically allocated values with a +single owner. They are useful for tree-like structures where the lifetime of a +child depends solely on the lifetime of its (single) parent. If you need a +value that must persist as long as any of several referrers, read on. + ## Rc and Arc -Sometimes, you need to allocate something on the heap, but give out multiple -references to the memory. Rust's `Rc` (pronounced 'arr cee tee') and -`Arc` types (again, the `T` is for generics, we'll learn more later) provide -you with this ability. **Rc** stands for 'reference counted,' and **Arc** for -'atomically reference counted.' This is how Rust keeps track of the multiple -owners: every time we make a new reference to the `Rc`, we add one to its -internal 'reference count.' Every time a reference goes out of scope, we -subtract one from the count. When the count is zero, the `Rc` can be safely -deallocated. `Arc` is almost identical to `Rc`, except for one thing: The -'atomically' in 'Arc' means that increasing and decreasing the count uses a -thread-safe mechanism to do so. Why two types? `Rc` is faster, so if you're -not in a multi-threaded scenario, you can have that advantage. Since we haven't -talked about threading yet in Rust, we'll show you `Rc` for the rest of this -section. +Sometimes, you need a variable that is referenced from multiple places +(immutably!), lasting as long as any of those places, and disappearing when it +is no longer referenced. For instance, in a graph-like data structure, a node +might be referenced from all of its neighbors. In this case, it is not possible +for the compiler to determine ahead of time when the value can be freed -- it +needs a little run-time support. -To create an `Rc`, use `Rc::new()`: +Rust's **Rc** type provides shared ownership of a dynamically allocated value +that is automatically freed at the end of its last owner's lifetime. (`Rc` +stands for 'reference counted,' referring to the way these library types are +implemented.) This provides more flexibility than single-owner boxes, but has +some runtime overhead. -```{rust} -use std::rc::Rc; - -let x = Rc::new(5i); -``` - -To create a second reference, use the `.clone()` method: +To create an `Rc` value, use `Rc::new()`. To create a second owner, use the +`.clone()` method: ```{rust} use std::rc::Rc; let x = Rc::new(5i); let y = x.clone(); + +println!("{} {}", *x, *y); // Prints 5 5 ``` -The `Rc` will live as long as any of its references are alive. After they -all go out of scope, the memory will be `free`d. +The `Rc` will live as long as any of its owners are alive. After that, the +memory will be `free`d. -If you use `Rc` or `Arc`, you have to be careful about introducing -cycles. If you have two `Rc`s that point to each other, the reference counts -will never drop to zero, and you'll have a memory leak. To learn more, check -out [the section on `Rc` and `Arc` in the pointers -guide](guide-pointers.html#rc-and-arc). +**Arc** is an 'atomically reference counted' value, identical to `Rc` except +that ownership can be safely shared among multiple threads. Why two types? +`Arc` has more overhead, so if you're not in a multi-threaded scenario, you +don't have to pay the price. + +If you use `Rc` or `Arc`, you have to be careful about introducing cycles. If +you have two `Rc`s that point to each other, they will happily keep each other +alive forever, creating a memory leak. To learn more, check out [the section on +`Rc` and `Arc` in the pointers guide](guide-pointers.html#rc-and-arc). # Patterns