The strategy is this:
- we compute SCCs once all outlives constraints are known
- we allocate a set of values **per region** for storing liveness
- we allocate a set of values **per SCC** for storing the final values
- when we add a liveness constraint to the region R, we also add it
to the final value of the SCC to which R belongs
- then we can apply the constraints by just walking the DAG for the
SCCs and union'ing the children (which have their liveness
constraints within)
There are a few intermediate refactorings that I really ought to have
broken out into their own commits:
- reverse the constraint graph so that `R1: R2` means `R1 -> R2` and
not `R2 -> R1`. This fits better with the SCC computation and new
style of inference (`->` now means "take value from" and not "push
value into")
- this does affect some of the UI tests, since they traverse the
graph, but mostly the artificial ones and they don't necessarily
seem worse
- put some things (constraint set, etc) into `Rc`. This lets us root
them to permit mutation and iteration. It also guarantees they don't
change, which is critical to the correctness of the algorithm.
- Generalize various helpers that previously operated only on points
to work on any sort of region element.
The current situation is something of a mess.
- `IdxSetBuf` derefs to `IdxSet`.
- `IdxSetBuf` implements `Clone`, and therefore has a provided `clone_from`
method, which does allocation and so is expensive.
- `IdxSet` has a `clone_from` method that is non-allocating and therefore
cheap, but this method is not from the `Clone` trait.
As a result, if you have an `IdxSetBuf` called `b`, if you call
`b.clone_from(b2)` you'll get the expensive `IdxSetBuf` method, but if you call
`(*b).clone_from(b2)` you'll get the cheap `IdxSetBuf` method.
`liveness_of_locals()` does the former, presumably unintentionally, and
therefore does lots of unnecessary allocations.
Having a `clone_from` method that isn't from the `Clone` trait is a bad idea in
general, so this patch renames it as `overwrite`. This avoids the unnecessary
allocations in `liveness_of_locals()`, speeding up most NLL benchmarks, the
best by 1.5%. It also means that calls of the form `(*b).clone_from(b2)` can be
rewritten as `b.overwrite(b2)`.
Obligation forest cleanup
While looking at this code I was scratching my head about whether a node could appear in both `parent` and `dependents`. Turns out it can, but it's not useful to do so, so this PR cleans things up so it's no longer possible.
Improve memoization and refactor NLL type check
I have a big branch that is refactoring NLL type check with the goal of introducing canonicalization-based memoization for all of the operations it does. This PR contains an initial prefix of that branch which, I believe, stands alone. It does introduce a few smaller optimizations of its own:
- Skip operations that are trivially a no-op
- Cache the results of the dropck-outlives computations done by liveness
- Skip resetting unifications if nothing changed
r? @pnkfelix
Speed up obligation forest code
Here are the rustc-perf benchmarks that get at least a 1% speedup on one or more of their runs with these patches applied:
```
inflate-check
avg: -8.7% min: -12.1% max: 0.0%
inflate
avg: -5.9% min: -8.6% max: 1.1%
inflate-opt
avg: -1.5% min: -2.0% max: -0.3%
clap-rs-check
avg: -0.6% min: -1.9% max: 0.5%
coercions
avg: -0.2%? min: -1.3%? max: 0.6%?
serde-opt
avg: -0.6% min: -1.0% max: 0.1%
coercions-check
avg: -0.4%? min: -1.0%? max: -0.0%?
```
We used to accumulate a vector of type-check constraints, but now we
generate them as we go, and just store the NLL-style
`OutlivesConstraint` and `TypeTest` information.
Avoid useless Vec clones in pending_obligations().
The only instance of `ObligationForest` in use has an obligation type of
`PendingPredicateObligation`, which contains a `PredicateObligation` and a
`Vec<Ty>`.
`FulfillmentContext::pending_obligations()` calls
`ObligationForest::pending_obligations()`, which clones all the
`PendingPredicateObligation`s. But the `Vec<Ty>` field of those cloned
obligations is never touched.
This patch changes `ObligationForest::pending_obligations()` to
`map_pending_obligations` -- which gives callers control about which part
of the obligation to clone -- and takes advantage of the change to avoid
cloning the `Vec<Ty>`. The change speeds up runs of a few rustc-perf
benchmarks, the best by 1%.
The only instance of `ObligationForest` in use has an obligation type of
`PendingPredicateObligation`, which contains a `PredicateObligation` and a
`Vec<Ty>`.
`FulfillmentContext::pending_obligations()` calls
`ObligationForest::pending_obligations()`, which clones all the
`PendingPredicateObligation`s. But the `Vec<Ty>` field of those cloned
obligations is never touched.
This patch changes `ObligationForest::pending_obligations()` to
`map_pending_obligations` -- which gives callers control about which part
of the obligation to clone -- and takes advantage of the change to avoid
cloning the `Vec<Ty>`. The change speeds up runs of a few rustc-perf
benchmarks, the best by 1%.
stabilize RangeBounds collections_range #30877
The FCP for #30877 closed last month, with the decision to:
1. move from `collections::range::RangeArgument` to `ops::RangeBounds`, and
2. rename `start()` and `end()` to `start_bounds()` and `end_bounds()`.
Simon Sapin already moved it to `ops::RangeBounds` in #49163.
I renamed the functions, and removed the old `collections::range::RangeArgument` alias.
This is my first Rust PR, please let me know if I can improve anything. This passes all tests for me, except the `clippy` tool (which uses `RangeArgument::start()`).
I considered deprecating `start()` and `end()` instead of removing them, but the contribution guidelines indicate we can break `clippy` temporarily. I thought it was best to remove the functions, since we're worried about name collisions with `Range::start` and `end`.
Closes#30877.
implement the chalk-engine traits
Preliminary implementation for the Chalk traits in rustc. Lots of `panic!()` placeholders to be filled in later.
This is currently blocked on us landing https://github.com/rust-lang-nursery/chalk/pull/131 in chalk and issuing a new release, which should occur later today.
r? @scalexm
cc @leodasvacas
stop considering location when computing outlives relationships
This doesn't (yet?) use SEME regions, but it does ignore the location for outlives constraints. This makes (I believe) NLL significantly faster -- but we should do some benchmarks. It regresses the "get-default" family of use cases for NLL, which is a shame, but keeps the other benefits, and thus represents a decent step forward.
r? @pnkfelix
Speed up `opt_normalize_projection_type`
`opt_normalize_projection_type` is hot in the serde and futures benchmarks in rustc-perf. These two patches speed up the execution of most runs for them by 2--4%.
There is a hot path through `opt_normalize_projection_type`:
- `try_start` does a cache lookup (#1).
- The result is a `NormalizedTy`.
- There are no unresolved type vars, so we call `complete`.
- `complete` does *another* cache lookup (#2), then calls
`SnapshotMap::insert`.
- `insert` does *another* cache lookup (#3), inserting the same value
that's already in the cache.
This patch optimizes this hot path by introducing `complete_normalized`,
for use when the value is known in advance to be a `NormalizedTy`. It
always avoids lookup #2. Furthermore, if the `NormalizedTy`'s
obligations are empty (the common case), we know that lookup #3 would be
a no-op, so we avoid it, while inserting a Noop into the `SnapshotMap`'s
undo log.
It was introduced in #50240 to avoid an allocation when creating a new
BTreeMap, which gave some speed-ups. But then #50352 made that the
default behaviour for BTreeMap, so LazyBTreeMap is no longer necessary.