Unchecked thread spawning
# Summary
Add an unsafe interface for spawning lifetime-unrestricted threads for
library authors to build less-contrived, less-hacky safe abstractions
on.
# Motivation
So a few years back scoped threads were entirely removed from the Rust
stdlib, the reason being that it was possible to leak the scoped thread's
join guards without resorting to unsafe code, which meant the concept
was not completely safe, either.
Only a maximally-restrictive safe API for thread spawning was kept in the
stdlib, that requires `'static` lifetime bounds on both the thread closure
and its return type.
A number of 3rd party libraries sprung up to offer their implementations
for safe scoped threads implementations.
These work by essentially hiding the join guards from the user, thus
forcing them to join at the end of an (internal) function scope.
However, since these libraries have to use the maximally restrictive
thread spawning API, they have to resort to some very contrived manipulations
and subversions of Rust's type system to basically achieve what this commit does
with some minimal restructuring of the current code and exposing a new unsafe
function signature for spawning threads without lifetime restrictions.
Obviously this is unsafe, but its main use would be to allow library authors
to write safe abstractions with and around it.
To further illustrate my point, here's a quick summary of the hoops that,
for instance `crossbeam`, has to jump through to spawn a lifetime unrestricted
thread, all of which would not be necessary if an unsafe API existed as part
of the stdlib:
1. Allocate an `Arc<Option<T>>` on the heap where the result with type
`T: 'a` will go (in practice requires `Mutex` or `UnsafeCell` as well).
2. Wrap the desired thread closure with lifetime bound `'a` into another
closure (also `..: 'a`) that returns `()`, executes the inner closure and
writes its result into the pre-allocated `Option<T>`.
3. Box the wrapping closure, cast it to a trait object (`FnBox`) and
(unsafely) transmute its lifetime bound from `'a` to `'static`.
So while this new `spawn_unchecked` function is certainly not very relevant
for general use, since scoped threads are so common I think it makes sense
to expose an interface for libraries implementing these to build on.
The changes implemented are also very minimal: The current `spawn` function
(which internally contains unsafe code) is moved into an unsafe `spawn_unchecked`
function, which the safe function then wraps around.
# Issues
- ~~so far, no documentation for the new function (yet)~~
- the name of the function might be controversial, as `*_unchecked` more commonly
indicates that some sort of runtime check is omitted (`unrestricted` may be
more fitting)
- if accepted, it might make sense to add a freestanding `thread::spawn_unchecked`
function similar to the current `thread::spawn` for convenience.
Clarified code example in char primitive doc
The example was not as clear as it could be because it was making an assumption about the structure of the data in order to multiply the number of elements in the slice by the item size. This change demonstrates the idea more straightforwardly, without needing a calculation, by just comparing the size of the slices.
Documents `From` implementations for `Stdio`
This PR solves part of #51430 by adding a basic summary and an example to each `impl From` inside `process` module (`ChildStdin`, `ChildStdout`, `ChildStderr`, `File`).
It does not document if the conversions allocate memory and how expensive they are.
Gradually expanding libstd's keyword documentation
I'm working on adding new keywords to the documentation and refreshing the incomplete older ones, and I'm hoping that I can eventually add all the standalone-usable keywords after a bunch of incremental work. It would be cool to see the keywords section of std's docs be a definitive reference as to what each keyword means when you see it, and that's what I'm aiming towards with this work.
I'm far from a Rust expert so there will inevitably be things to fix in this, also I'm not sure if this should be a bunch of quickly-merged PRs or one gradually-updated PR that gets merged once it's done.
The example was not as clear as it could be because it was making an assumption about the structure of the data in order to multiply the number of collection elements by the item size. This change demonstrates the idea more straightforwardly, without the calculation.
Prefer unwrap_or_else to unwrap_or in case of function calls/allocations
The contents of `unwrap_or` are evaluated eagerly, so it's not a good pick in case of function calls and allocations. This PR also changes a few `unwrap_or`s with `unwrap_or_default`.
An added bonus is that in some cases this change also reveals if the object it's called on is an `Option` or a `Result` (based on whether the closure takes an argument).
Add a `copysign` function to f32 and f64
This patch adds a `copysign` function to the float primitive types. It is an exceptionally useful function for writing efficient numeric code, as it often avoids branches, is auto-vectorizable, and there are efficient intrinsics for most platforms.
I think this might work as-is, as the relevant `copysign` intrinsic is already used internally for the implementation of `signum`. It's possible that an implementation might be needed in japaric/libm for portability across all platforms, in which case I'll do that also.
Part of the work towards #55107
This patch adds a `copysign` function to the float primitive types.
It is an exceptionally useful function for writing efficient numeric
code, as it often avoids branches, is auto-vectorizable, and there
are efficient intrinsics for most platforms.
I think this might work as-is, as the relevant `copysign` intrinsic
is already used internally for the implementation of `signum`. It's
possible that an implementation might be needed in japaric/libm for
portability across all platforms, in which case I'll do that also.
Part of the work towards #55107
std: Implement TLS for wasm32-unknown-unknown
This adds an implementation of thread local storage for the
`wasm32-unknown-unknown` target when the `atomics` feature is
implemented. This, however, comes with a notable caveat of that it
requires a new feature of the standard library, `wasm-bindgen-threads`,
to be enabled.
Thread local storage for wasm (when `atomics` are enabled and there's
actually more than one thread) is powered by the assumption that an
external entity can fill in some information for us. It's not currently
clear who will fill in this information nor whose responsibility it
should be long-term. In the meantime there's a strategy being gamed out
in the `wasm-bindgen` project specifically, and the hope is that we can
continue to test and iterate on the standard library without committing
to a particular strategy yet.
As to the details of `wasm-bindgen`'s strategy, LLVM doesn't currently
have the ability to emit custom `global` values (thread locals in a
`WebAssembly.Module`) so we leverage the `wasm-bindgen` CLI tool to do
it for us. To that end we have a few intrinsics, assuming two global values:
* `__wbindgen_current_id` - gets the current thread id as a 32-bit
integer. It's `wasm-bindgen`'s responsibility to initialize this
per-thread and then inform libstd of the id. Currently `wasm-bindgen`
performs this initialization as part of the `start` function.
* `__wbindgen_tcb_{get,set}` - in addition to a thread id it's assumed
that there's a global available for simply storing a pointer's worth
of information (a thread control block, which currently only contains
thread local storage). This would ideally be a native `global`
injected by LLVM, but we don't have a great way to support that right
now.
To reiterate, this is all intended to be unstable and purely intended
for testing out Rust on the web with threads. The story is very likely
to change in the future and we want to make sure that we're able to do
that!
doc fix: it's auto traits that make for automatic implementations
Being a marker trait is not good enough (that just means "no items in the trait").
r? @alexcrichton who [originally wrote these docs](0a13f1abaf).
Documents reference equality by address (#54197)
Clarification of the use of `ptr::eq` to test equality of references via address by pointer coercion, regarding issue #54197 .
The same example as in `ptr::eq` docs is shown here to clarify that
`PartialEq` compares values pointed-to instead of via address (which can be desired in some cases)
This adds an implementation of thread local storage for the
`wasm32-unknown-unknown` target when the `atomics` feature is
implemented. This, however, comes with a notable caveat of that it
requires a new feature of the standard library, `wasm-bindgen-threads`,
to be enabled.
Thread local storage for wasm (when `atomics` are enabled and there's
actually more than one thread) is powered by the assumption that an
external entity can fill in some information for us. It's not currently
clear who will fill in this information nor whose responsibility it
should be long-term. In the meantime there's a strategy being gamed out
in the `wasm-bindgen` project specifically, and the hope is that we can
continue to test and iterate on the standard library without committing
to a particular strategy yet.
As to the details of `wasm-bindgen`'s strategy, LLVM doesn't currently
have the ability to emit custom `global` values (thread locals in a
`WebAssembly.Module`) so we leverage the `wasm-bindgen` CLI tool to do
it for us. To that end we have a few intrinsics, assuming two global values:
* `__wbindgen_current_id` - gets the current thread id as a 32-bit
integer. It's `wasm-bindgen`'s responsibility to initialize this
per-thread and then inform libstd of the id. Currently `wasm-bindgen`
performs this initialization as part of the `start` function.
* `__wbindgen_tcb_{get,set}` - in addition to a thread id it's assumed
that there's a global available for simply storing a pointer's worth
of information (a thread control block, which currently only contains
thread local storage). This would ideally be a native `global`
injected by LLVM, but we don't have a great way to support that right
now.
To reiterate, this is all intended to be unstable and purely intended
for testing out Rust on the web with threads. The story is very likely
to change in the future and we want to make sure that we're able to do
that!
rustc: Allow `#[no_mangle]` anywhere in a crate
This commit updates the compiler to allow the `#[no_mangle]` (and
`#[export_name]` attributes) to be located anywhere within a crate.
These attributes are unconditionally processed, causing the compiler to
always generate an exported symbol with the appropriate name.
After some discussion on #54135 it was found that not a great reason
this hasn't been allowed already, and it seems to match the behavior
that many expect! Previously the compiler would only export a
`#[no_mangle]` symbol if it were *publicly reachable*, meaning that it
itself is `pub` and it's otherwise publicly reachable from the root of
the crate. This new definition is that `#[no_mangle]` *is always
reachable*, no matter where it is in a crate or whether it has `pub` or
not.
This should make it much easier to declare an exported symbol with a
known and unique name, even when it's an internal implementation detail
of the crate itself. Note that these symbols will persist beyond LTO as
well, always making their way to the linker.
Along the way this commit removes the `private_no_mangle_functions` lint
(also for statics) as there's no longer any need to lint these
situations. Furthermore a good number of tests were updated now that
symbol visibility has been changed.
Closes#54135
This commit updates the compiler to allow the `#[no_mangle]` (and
`#[export_name]` attributes) to be located anywhere within a crate.
These attributes are unconditionally processed, causing the compiler to
always generate an exported symbol with the appropriate name.
After some discussion on #54135 it was found that not a great reason
this hasn't been allowed already, and it seems to match the behavior
that many expect! Previously the compiler would only export a
`#[no_mangle]` symbol if it were *publicly reachable*, meaning that it
itself is `pub` and it's otherwise publicly reachable from the root of
the crate. This new definition is that `#[no_mangle]` *is always
reachable*, no matter where it is in a crate or whether it has `pub` or
not.
This should make it much easier to declare an exported symbol with a
known and unique name, even when it's an internal implementation detail
of the crate itself. Note that these symbols will persist beyond LTO as
well, always making their way to the linker.
Along the way this commit removes the `private_no_mangle_functions` lint
(also for statics) as there's no longer any need to lint these
situations. Furthermore a good number of tests were updated now that
symbol visibility has been changed.
Closes#54135