macos tlv workaround
fixes: #60141
Includes:
* remove dead code: `requires_move_before_drop`. This hasn't been needed for a while now (oops I should have removed it in #57655)
* redox had a copy of `fast::Key` (not sure why?). That has been removed.
* Perform a `read_volatile` on OSX to reduce `tlv_get_addr` calls per `__getit` from (4-2 depending on context) to 1.
`tlv_get_addr` is relatively expensive (~1.5ns on my machine).
Previously, in contexts where `__getit` was inlined, 4 calls to `tlv_get_addr` were performed per lookup. For some reason when `__getit` is not inlined this is reduced to 2x - and performance improves to match.
After this PR, I have only ever seen 1x call to `tlv_get_addr` per `__getit`, and macos now benefits from situations where `__getit` is inlined.
I'm not sure if the `read_volatile(&&__KEY)` trick is working around an LLVM bug, or a rustc bug, or neither.
r? @alexcrichton