Support for disabling PLT for better function call performance
This PR gives `rustc` the ability to skip the PLT when generating function calls into shared libraries. This can improve performance by reducing branch indirection.
AFAIK, the only advantage of using the PLT is to allow for ELF lazy binding. However, since Rust already [enables full relro for security](https://github.com/rust-lang/rust/pull/43170), lazy binding was disabled anyway.
This is a little known feature which is supported by [GCC](https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html) and [Clang](https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang-fplt) as `-fno-plt` (some Linux distros [enable it by default](https://git.archlinux.org/svntogit/packages.git/tree/trunk/makepkg.conf?h=packages/pacman#n40) for all builds).
Implementation inspired by [this patch](https://reviews.llvm.org/D39079#change-YvkpNDlMs_LT) which adds `-fno-plt` support to Clang.
## Performance
I didn't run a lot of benchmarks, but these are the results on my machine for a `clap` [benchmark](https://github.com/clap-rs/clap/blob/master/benches/05_ripgrep.rs):
```
name control ns/iter no-plt ns/iter diff ns/iter diff % speedup
build_app_long 11,097 10,733 -364 -3.28% x 1.03
build_app_short 11,089 10,742 -347 -3.13% x 1.03
build_help_long 186,835 182,713 -4,122 -2.21% x 1.02
build_help_short 80,949 78,455 -2,494 -3.08% x 1.03
parse_clean 12,385 12,044 -341 -2.75% x 1.03
parse_complex 19,438 19,017 -421 -2.17% x 1.02
parse_lots 431,493 421,421 -10,072 -2.33% x 1.02
```
A small performance improvement across the board, with no downsides. It's likely binaries which make a lot of function calls into dynamic libraries could see even more improvements. [This comment](https://patchwork.ozlabs.org/patch/468993/#1028255) suggests that, in some cases, `-fno-plt` could improve PIC/PIE code performance by 10%.
## Security benefits
**Bonus**: some of the speculative execution attacks rely on the PLT, by disabling it we reduce a big attack surface and reduce the need for [`retpoline`](https://reviews.llvm.org/D41723).
## Remaining PLT calls
The compiled binaries still have plenty of PLT calls, coming from C/C++ libraries. Building dependencies with `CFLAGS=-fno-plt CXXFLAGS=-fno-plt` removes them.
Disable the PLT where possible to improve performance
for indirect calls into shared libraries.
This optimization is enabled by default where possible.
- Add the `NonLazyBind` attribute to `rustllvm`:
This attribute informs LLVM to skip PLT calls in codegen.
- Disable PLT unconditionally:
Apply the `NonLazyBind` attribute on every function.
- Only enable no-plt when full relro is enabled:
Ensures we only enable it when we have linker support.
- Add `-Z plt` as a compiler option
Fix#54707 - parse_trait_item_ now handles interpolated blocks as function body decls
Fix#54707 - parse_trait_item_ now handles interpolated blocks as function body decls
Previously parsing trait items only handled opening brace token and semicolon, I added a branch to the match statement that will also handle interpolated blocks.
Better Diagnostic for Trait Object Capture
Part of #52663.
This commit enhances `LaterUseKind` detection to identify when a borrow
is captured by a trait object which helps explain why there is a borrow
error.
r? @nikomatsakis
cc @pnkfelix
codegen_llvm: verify that inline assembly operands are scalars
Another set of inline assembly fixes. This time let's emit an error message when the operand value cannot be coerced into the operand constraint.
Two questions:
1) Should I reuse `E0668` which was introduced in #54568 or just use `E0669` as it stands because they do mean different things, but maybe that's not too user-friendly. Just a thought.
2) The `try_fold` returns the operand which failed to be converted into a scalar value, any suggestions on how to use that in the error message?
Thanks!
NLL is missing struct field suggestion
Part of #52663.
This commit adds suggestions to change the definitions of fields in
struct definitions from immutable references to mutable references.
r? @nikomatsakis
cc @pnkfelix
Run debuginfo tests against rust-enabled lldb, when possible
If the rust-enabled lldb was built, then use it when running the
debuginfo tests. Updating the lldb submodule was necessary as this
needed a way to differentiate the rust-enabled lldb, so I added a line
to the --version output.
This adds compiletest commands to differentiate between the
rust-enabled and non-rust-enabled lldb, as is already done for gdb. A
new "rust-lldb" header directive is also added, but not used in this
patch; I plan to use it in #54004.
This updates all the tests.
Fix range literals borrowing suggestions
Fixes#54505. The compiler issued incorrect range borrowing suggestions (missing `()` around borrows of range literals). This was not correct syntax (see the issue for an example).
With changes in this PR, this is fixed for all types of `Range` literals.
Thanks again to @varkor and @estebank for their invaluable help and guidance.
r? @varkor
Prepare miri engine for enforcing validity invariant during execution
In particular, make recursive checking of references optional, and add a `const_mode` parameter that says whether `usize` is allowed to contain a pointer. Also refactor validation a bit to be type-driven at the "leafs" (primitive types), and separately validate scalar layout to catch `NonNull` violations (which it did not properly validate before).
Fixes https://github.com/rust-lang/rust/issues/53826
Also fixes https://github.com/rust-lang/rust/issues/54751
r? @oli-obk
Now when a `FnMut` closure is returning a closure that contains a
reference to a captured variable, we provide an error that makes it more
clear what is happening.
[NLL] Improve closure region bound errors
Previously, we would report free region errors that originate from closure with the span of the closure and a "closure body requires ..." message. This is now updated to use a reason and span from inside the closure.
If the rust-enabled lldb was built, then use it when running the
debuginfo tests. Updating the lldb submodule was necessary as this
needed a way to differentiate the rust-enabled lldb, so I added a line
to the --version output.
This adds compiletest commands to differentiate between the
rust-enabled and non-rust-enabled lldb, as is already done for gdb. A
new "rust-lldb" header directive is also added, but not used in this
patch; I plan to use it in #54004.
This updates all the tests.
Fix dead code lint for functions using impl Trait
Fixes https://github.com/rust-lang/rust/issues/54754
This is a minimal fix that doesn't add any new queries or touches unnecessary code. Please nominate for beta backport if wanted.
This commit updates the captured trait object search logic to look for
unsized casts to boxed types rather than for functions that returned
trait objects.
rustc: Allow `#[no_mangle]` anywhere in a crate
This commit updates the compiler to allow the `#[no_mangle]` (and
`#[export_name]` attributes) to be located anywhere within a crate.
These attributes are unconditionally processed, causing the compiler to
always generate an exported symbol with the appropriate name.
After some discussion on #54135 it was found that not a great reason
this hasn't been allowed already, and it seems to match the behavior
that many expect! Previously the compiler would only export a
`#[no_mangle]` symbol if it were *publicly reachable*, meaning that it
itself is `pub` and it's otherwise publicly reachable from the root of
the crate. This new definition is that `#[no_mangle]` *is always
reachable*, no matter where it is in a crate or whether it has `pub` or
not.
This should make it much easier to declare an exported symbol with a
known and unique name, even when it's an internal implementation detail
of the crate itself. Note that these symbols will persist beyond LTO as
well, always making their way to the linker.
Along the way this commit removes the `private_no_mangle_functions` lint
(also for statics) as there's no longer any need to lint these
situations. Furthermore a good number of tests were updated now that
symbol visibility has been changed.
Closes#54135