implement `carryless_mul`
tracking issue: https://github.com/rust-lang/rust/issues/152080
ACP: https://github.com/rust-lang/libs-team/issues/738
This defers to LLVM's `llvm.clmul` when available, and otherwise falls back to a method from the `polyval` crate ([link](https://github.com/RustCrypto/universal-hashes/blob/master/polyval/src/field_element/soft/soft64.rs)).
Some things are missing, which I think we can defer:
- the ACP has some discussion about additional methods, but I'm not sure exactly what is wanted or how to implement it efficiently
- the SIMD intrinsic is not yet `const` (I think I ran into a bootstrapping issue). That is fine for now, I think in `stdarch` we can't really use this intrinsic at the moment, we'd only want the scalar version to replace some riscv intrinsics.
- the SIMD intrinsic is not implemented for the gcc and cranelift backends. That should be reasonably straightforward once we have a const eval implementation though.