Use unicode-xid crate instead of libcore
This PR proposes to remove `char::is_xid_start` and `char::is_xid_continue` functions from `libcore` and use `unicode_xid` crate from crates.io (note that this crate is already present in rust-lang/rust's Cargo.lock).
Reasons to do this:
* removing rustc-binary-specific stuff from libcore
* making sure that, across the ecosystem, there's a single definition of what rust identifier is (`unicode-xid` has almost 10 million downs, as a `proc_macro2` dependency)
* making it easier to share `rustc_lexer` crate with rust-analyzer: no need to `#[cfg]` if we are building as a part of the compiler
Reasons not to do this:
* increased maintenance burden: we'll need to upgrade unicode version both in libcore and in unicode-xid. However, this shouldn't be a too heavy burden: just running `./unicode.py` after new unicode version. I (@matklad) am ready to be a t-compiler side maintainer of unicode-xid. Moreover, given that xid-unicode is an important dependency of syn, *someone* needs to maintain it anyway.
* xid-unicode implementation is significantly slower. It uses a more compact table with binary search, instead of a trie. However, this shouldn't matter in practice, because we have fast-path for ascii anyway, and code size savings is a plus. Moreover, in #59706 not using libcore turned out to be *faster*, presumably beacause checking for whitespace with match is even faster.
<details>
<summary>old description</summary>
Followup to #59706
r? @eddyb
Note that this doesn't actually remove tables from libcore, to avoid conflict with https://github.com/rust-lang/rust/pull/62641.
cc https://github.com/unicode-rs/unicode-xid/pull/11
</details>
They are only used by rustc_lexer, and are not needed elsewhere.
So we move the relevant definitions into rustc_lexer (while the actual
unicode data comes from the unicode-xid crate) and make the rest of
the compiler use it.
Add Result::cloned{,_err} and Result::copied{,_err}
This is a little nice addition to `Result`.
1. I'm not sure how useful are `cloned_err` and `copied_err`, but for the sake of completeness they are here.
2. Naming is similar to `map`/`map_err`. I thought about naming `cloned` as `cloned_ok` and add another method called `cloned` that clones both Ok and Err, but `cloned_ok` should be more prevalent than `cloned_both`.
Because of a compiler bug that adding `Self: ExactSizeIterator` makes
the compiler forget `Self::Item` is `<I as Iterator>::Item`, we remove
this specialization for now.
This allows lints and other diagnostics to refer to items
by a unique ID instead of relying on whacky path
resolution schemes that may break when items are
relocated.
Removed a confusing FnOnce example
# Description
See #47091 for a discussion.
## Changes
- Removed an example that might suggest readers that square_x is (only) FnOnce.
closes#47091
Improve the documentation for std::hint::black_box.
The other day a colleague was reviewing some of my code which was using `black_box` to block constant propogation. There was a little confusion because the documentation kind of implies that `black_box` is only useful for dead code elimination, and only in benchmarking scenarios.
The docs currently say:
> A function that is opaque to the optimizer, to allow benchmarks to pretend to use outputs to assist in avoiding dead-code elimination.
Here is our discussion, in which I show (using godbolt) that a black box can also block constant propagation:
https://github.com/softdevteam/yk/pull/21#discussion_r302985038
This change makes the docstring for `black_box` a little more general, and while we are here, I've added an example (the same one from our discussion).

OK to go in?
Audit uses of `apply_mark` in built-in macros + Remove default macro transparencies
Every use of `apply_mark` in a built-in or procedural macro is supposed to look like this
```rust
location.with_ctxt(SyntaxContext::root().apply_mark(ecx.current_expansion.id))
```
where `SyntaxContext::root()` means that the built-in/procedural macro is defined directly, rather than expanded from some other macro.
However, few people understood what `apply_mark` does, so we had a lot of copy-pasted uses of it looking e.g. like
```rust
span = span.apply_mark(ecx.current_expansion.id);
```
, which doesn't really make sense for procedural macros, but at the same time is not too harmful, if the macros use the traditional `macro_rules` hygiene.
So, to fight this, we stop using `apply_mark` directly in built-in macro implementations, and follow the example of regular proc macros instead and use analogues of `Span::def_site()` and `Span::call_site()`, which are much more intuitive and less error-prone.
- `ecx.with_def_site_ctxt(span)` takes the `span`'s location and combines it with a def-site context.
- `ecx.with_call_site_ctxt(span)` takes the `span`'s location and combines it with a call-site context.
Even if called multiple times (which sometimes happens due to some historical messiness of the built-in macro code) these functions will produce the same result, unlike `apply_mark` which will grow the mark chain further in this case.
---
After `apply_mark`s in built-in macros are eliminated, the remaining `apply_mark`s are very few in number, so we can start passing the previously implicit `Transparency` argument to them explicitly, thus eliminating the need in `default_transparency` fields in hygiene structures and `#[rustc_macro_transparency]` annotations on built-in macros.
So, the task of making built-in macros opaque can now be formulated as "eliminate `with_legacy_ctxt` in favor of `with_def_site_ctxt`" rather than "replace `#[rustc_macro_transparency = "semitransparent"]` with `#[rustc_macro_transparency = "opaque"]`".
r? @matthewjasper
All transparancies are passed explicitly now.
Also remove `#[rustc_macro_transparency]` annotations from built-in macros, they are no longer used.
`#[rustc_macro_transparency]` only makes sense for declarative macros now.
Fix bug in iter::Chain::size_hint
`Chain::size_hint` currently ignores `self.state`, which means that the size hints of the underlying iterators are always combined regardless of the iteration state. This, of course, should only happen when the state is `ChainState::Both`.
Add APIs for uninitialized Box, Rc, and Arc. (Plus get_mut_unchecked)
Assigning `MaybeUninit::<Foo>::uninit()` to a local variable is usually free, even when `size_of::<Foo>()` is large. However, passing it for example to `Arc::new` [causes at least one copy](https://youtu.be/F1AquroPfcI?t=4116) (from the stack to the newly allocated heap memory) even though there is no meaningful data. It is theoretically possible that a Sufficiently Advanced Compiler could optimize this copy away, but this is [reportedly unlikely to happen soon in LLVM](https://youtu.be/F1AquroPfcI?t=5431).
This PR proposes two sets of features:
* Constructors for containers (`Box`, `Rc`, `Arc`) of `MaybeUninit<T>` or `[MaybeUninit<T>]` that do not initialized the data, and unsafe conversions to the known-initialized types (without `MaybeUninit`). The constructors are guaranteed not to make unnecessary copies.
* On `Rc` and `Arc`, an unsafe `get_mut_unchecked` method that provides `&mut T` access without checking the reference count. `Arc::get_mut` involves multiple atomic operations whose cost can be non-trivial. `Rc::get_mut` is less costly, but we add `Rc::get_mut_unchecked` anyway for symmetry with `Arc`.
These can be useful independently, but they will presumably be typical when the new constructors of `Rc` and `Arc` are used.
An alternative with a safe API would be to introduce `UniqueRc` and `UniqueArc` types that have the same memory layout as `Rc` and `Arc` (and so zero-cost conversion to them) but are guaranteed to have only one reference. But introducing entire new types feels “heavier” than new constructors on existing types, and initialization of `MaybeUninit<T>` typically requires unsafe code anyway.
Summary of new APIs (all unstable in this PR):
```rust
impl<T> Box<T> { pub fn new_uninit() -> Box<MaybeUninit<T>> {…} }
impl<T> Box<MaybeUninit<T>> { pub unsafe fn assume_init(self) -> Box<T> {…} }
impl<T> Box<[T]> { pub fn new_uninit_slice(len: usize) -> Box<[MaybeUninit<T>]> {…} }
impl<T> Box<[MaybeUninit<T>]> { pub unsafe fn assume_init(self) -> Box<[T]> {…} }
impl<T> Rc<T> { pub fn new_uninit() -> Rc<MaybeUninit<T>> {…} }
impl<T> Rc<MaybeUninit<T>> { pub unsafe fn assume_init(self) -> Rc<T> {…} }
impl<T> Rc<[T]> { pub fn new_uninit_slice(len: usize) -> Rc<[MaybeUninit<T>]> {…} }
impl<T> Rc<[MaybeUninit<T>]> { pub unsafe fn assume_init(self) -> Rc<[T]> {…} }
impl<T> Arc<T> { pub fn new_uninit() -> Arc<MaybeUninit<T>> {…} }
impl<T> Arc<MaybeUninit<T>> { pub unsafe fn assume_init(self) -> Arc<T> {…} }
impl<T> Arc<[T]> { pub fn new_uninit_slice(len: usize) -> Arc<[MaybeUninit<T>]> {…} }
impl<T> Arc<[MaybeUninit<T>]> { pub unsafe fn assume_init(self) -> Arc<[T]> {…} }
impl<T: ?Sized> Rc<T> { pub unsafe fn get_mut_unchecked(this: &mut Self) -> &mut T {…} }
impl<T: ?Sized> Arc<T> { pub unsafe fn get_mut_unchecked(this: &mut Self) -> &mut T {…} }
```
Opaque builtin derive macros
* Buiilt-in derives are now opaque macros
* This required limiting the visibility of some previously unexposed functions in `core`.
* This also required the change to `Ident` serialization.
* All gensyms are replaced with hygienic identifiers
* Use hygiene to avoid most other name-resolution issues with buiilt-in derives.
* As far as I know the only remaining case that breaks is an ADT that has the same name as one of its parameters. Fixing this completely seemed to be more effort than it's worth.
* Remove gensym in `Ident::decode`, which lead to linker errors due to `inline` being gensymmed.
* `Ident`now panics if incremental compilation tries to serialize it (it currently doesn't).
* `Ident` no longer uses `gensym` to emulate cross-crate hygiene. It only applied to reexports.
* `SyntaxContext` is no longer serializable.
* The long-term fix for this is to properly implement cross-crate hygiene, but this seemed to be acceptable for now.
* Move type/const parameter shadowing checks to `resolve`
* This was previously split between resolve and type checking. The type checking pass compared `InternedString`s, not Identifiers.
* Removed the `SyntaxContext` from `{ast, hir}::{InlineAsm, GlobalAsm}`
cc #60869
r? @petrochenkov
Override Cycle::try_fold
It's not very pretty, but I believe this is the simplest way to correctly implement `Cycle::try_fold`. The following may seem correct:
```rust
loop {
acc = self.iter.try_fold(acc, &mut f)?;
self.iter = self.orig.clone();
}
```
...but this loops infinitely in case `self.orig` is empty, as opposed to returning `acc`. So we first have to fully iterate `self.orig` to check whether it is empty or not, and before _that_, we have to iterate the remaining elements of `self.iter`.
This should always call `self.orig.clone()` the same amount of times as repeated `next()` calls would.
r? @scottmcm
Rename overflowing_{add,sub,mul} intrinsics to wrapping_{add,sub,mul}.
These confused @Gankra, and then, also me, especially since `overflowing_*` *methods* also exist, but they map to `*_with_overflow` intrinsics!
r? @oli-obk / @nikomatsakis cc @Mark-Simulacrum (on the rustbuild workaround)