_mm_cvtepi32_ps has been implemented, but _mm_cvtps_epi32 is missing.
Use the implementation of _mm_cvtepi32_ps as a guide for implementing
_mm_cvtps_epi32.
* Add vroundps, vceilps, vfloorps, vsqrtps, vsqrtpd
* Uninhibit assert_instr on non-expanded intrinsics
Also use the new simd_test macro
* Use simd_test where possible
* Add automated tests for vround*
* Add target_feature guards to automated tests
* Move automated tests below their functions
This commit switches the remaining "wrapper" tests to assert_instr with
constant parameters. This form of test is necessary when a vendor
intrinsic requires an immediate constant value to optimize properly into
the intended CPU instruction.
Some intrinsics need to be invoked with constant arguments to get the right
instruction to get generated, so this commit enhances the `assert_instr` macro
to enable this ability. Namely you pass constant arguments like:
#[assert_instr(foo, a = b)]
where this will assert that the intrinsic, when invoked with argument `a` equal
to the value `b` and all other arguments passed from the outside, will generate
the instruction `foo`.
Closes#49
This commit alters the test suite to unconditionally compile and run all tests,
regardless of the ambient target features enabled. This then uses a new
convenience macro, `#[simd_test]`, to guard all tests with the appropriate
`cfg_feature_enabled!` and also enable the `#[target_feature]` appropriately.
This commit adds CI for a few more targets:
* i686-unknown-linux-gnu
* arm-unknown-linux-gnueabihf
* armv7-unknown-linux-gnueabihf
* aarch64-unknown-linux-gnu
The CI here is structured around using a Docker container to set up a test
environment and then QEMU is used to actually execute code from these platforms.
QEMU's emulation actually makes it so we can continue to just use `cargo test`,
as processes can be spawned from QEMU like `objdump` and files can be read (for
libbacktrace). Ends up being a relatively seamless experience!
Note that a number of intrinsics were disabled on i686 because they were failing
tests, and otherwise a few ARM touch-ups were made to get tests passing.
This commit adds a procedural macro which can be used to test instruction
generation in a lightweight way. The intention is that all functions are
annotated with:
#[cfg_attr(test, assert_instr(maxps))]
fn foo(...) {
// ...
}
and then during `cargo test --release` it'll assert the function `foo` does
indeed generate the instruction `maxps`. This only activates tests in optimized
mode to avoid debug mode inefficiencies, and it uses a literal invocation of
`objdump` and some parsing to figure out what instructions are inside each
function. Finally it also uses the `backtrace` crate to figure out the symbol
name of the relevant function and hook that up to the output of `objdump`.
I added a few assertions in the `sse` module to get some feedback, but curious
what y'all think of this!