Commit graph

14780 commits

Author SHA1 Message Date
est31
c79aba71d5 40 -> 39, as ceil(log10(2^128)) == 39
just as ceil(log10(2^64)) == 20
2016-12-30 15:17:28 +01:00
est31
dc14a108ae Fix intrinsics and expand tests 2016-12-30 15:17:27 +01:00
Simonas Kazlauskas
7a3704c500 Fix rebase fallout
This commit includes manual merge conflict resolution changes from a rebase by @est31.
2016-12-30 15:17:27 +01:00
Simonas Kazlauskas
db2527add3 Fix parse-fail and compile-fail tests 2016-12-30 15:17:26 +01:00
Simonas Kazlauskas
64fbce6826 Tidy
This commit includes manual merge conflict resolution changes from a rebase by @est31.
2016-12-30 15:17:26 +01:00
Simonas Kazlauskas
d4d5be18b7 Feature gate the 128 bit types
Dangling a carrot in front of a donkey.

This commit includes manual merge conflict resolution changes from a rebase by @est31.
2016-12-30 15:17:25 +01:00
Simonas Kazlauskas
b5260644af Tests for the 128 bit integers 2016-12-30 15:15:44 +01:00
Alex Crichton
bcfd504744 Rollup merge of #38559 - japaric:ptx2, r=alexcrichton
PTX support, take 2

- You can generate PTX using `--emit=asm` and the right (custom) target. Which
  then you can run on a NVIDIA GPU.

- You can compile `core` to PTX. [Xargo] also works and it can compile some
  other crates like `collections` (but I doubt all of those make sense on a GPU)

[Xargo]: https://github.com/japaric/xargo

- You can create "global" functions, which can be "called" by the host, using
  the `"ptx-kernel"` ABI, e.g. `extern "ptx-kernel" fn kernel() { .. }`. Every
  other function is a "device" function and can only be called by the GPU.

- Intrinsics like `__syncthreads()` and `blockIdx.x` are available as
  `"platform-intrinsics"`. These intrinsics are *not* in the `core` crate but
  any Rust user can create "bindings" to them using an `extern
  "platform-intrinsics"` block. See example at the end.

- Trying to emit PTX with `-g` (debuginfo); you get an LLVM error. But I don't
  think PTX can contain debuginfo anyway so `-g` should be ignored and a warning
  should be printed ("`-g` doesn't work with this target" or something).

- "Single source" support. You *can't* write a single source file that contains
  both host and device code. I think that should be possible to implement that
  outside the compiler using compiler plugins / build scripts.

- The equivalent to CUDA `__shared__` which it's used to declare memory that's
  shared between the threads of the same block. This could be implemented using
  attributes: `#[shared] static mut SCRATCH_MEMORY: [f32; 64]` but hasn't been
  implemented yet.

- Built-in targets. This PR doesn't add targets to the compiler just yet but one
  can create custom targets to be able to emit PTX code (see the example at the
  end). The idea is to have people experiment with this feature before
  committing to it (built-in targets are "insta-stable")

- All functions must be "inlined". IOW, the `.rlib` must always contain the LLVM
  bitcode of all the functions of the crate it was produced from. Otherwise, you
  end with "undefined references" in the final PTX code but you won't get *any*
  linker error because no linker is involved. IOW, you'll hit a runtime error
  when loading the PTX into the GPU. The workaround is to use `#[inline]` on
  non-generic functions and to never use `#[inline(never)]` but this may not
  always be possible because e.g. you could be relying on third party code.

- Should `--emit=asm` generate a `.ptx` file instead of a `.s` file?

TL;DR Use Xargo to turn a crate into a PTX module (a `.s` file). Then pass that
PTX module, as a string, to the GPU and run it.

The full code is in [this repository]. This section gives an overview of how to
run Rust code on a NVIDIA GPU.

[this repository]: https://github.com/japaric/cuda

- Create a custom target. Here's the 64-bit NVPTX target (NOTE: the comments
  are not valid because this is supposed to be a JSON file; remove them before
  you use this file):

``` js
// nvptx64-nvidia-cuda.json
{
  "arch": "nvptx64",  // matches LLVM
  "cpu": "sm_20",  // "oldest" compute capability supported by LLVM
  "data-layout": "e-i64:64-v16:16-v32:32-n16:32:64",
  "llvm-target": "nvptx64-nvidia-cuda",
  "max-atomic-width": 0,  // LLVM errors with any other value :-(
  "os": "cuda",  // matches LLVM
  "panic-strategy": "abort",
  "target-endian": "little",
  "target-pointer-width": "64",
  "target-vendor": "nvidia",  // matches LLVM -- not required
}
```

(There's a 32-bit target specification in the linked repository)

- Write a kernel

``` rust

extern "platform-intrinsic" {
    fn nvptx_block_dim_x() -> i32;
    fn nvptx_block_idx_x() -> i32;
    fn nvptx_thread_idx_x() -> i32;
}

/// Copies an array of `n` floating point numbers from `src` to `dst`
pub unsafe extern "ptx-kernel" fn memcpy(dst: *mut f32,
                                         src: *const f32,
                                         n: usize) {
    let i = (nvptx_block_dim_x() as isize)
        .wrapping_mul(nvptx_block_idx_x() as isize)
        .wrapping_add(nvptx_thread_idx_x() as isize);

    if (i as usize) < n {
        *dst.offset(i) = *src.offset(i);
    }
}
```

- Emit PTX code

```
$ xargo rustc --target nvptx64-nvidia-cuda --release -- --emit=asm
   Compiling core v0.0.0 (file://..)
   (..)
   Compiling nvptx-builtins v0.1.0 (https://github.com/japaric/nvptx-builtins)
   Compiling kernel v0.1.0

$ cat target/nvptx64-nvidia-cuda/release/deps/kernel-*.s
//
// Generated by LLVM NVPTX Back-End
//

.version 3.2
.target sm_20
.address_size 64

        // .globl       memcpy

.visible .entry memcpy(
        .param .u64 memcpy_param_0,
        .param .u64 memcpy_param_1,
        .param .u64 memcpy_param_2
)
{
        .reg .pred      %p<2>;
        .reg .s32       %r<5>;
        .reg .s64       %rd<12>;

        ld.param.u64    %rd7, [memcpy_param_2];
        mov.u32 %r1, %ntid.x;
        mov.u32 %r2, %ctaid.x;
        mul.wide.s32    %rd8, %r2, %r1;
        mov.u32 %r3, %tid.x;
        cvt.s64.s32     %rd9, %r3;
        add.s64         %rd10, %rd9, %rd8;
        setp.ge.u64     %p1, %rd10, %rd7;
        @%p1 bra        LBB0_2;
        ld.param.u64    %rd3, [memcpy_param_0];
        ld.param.u64    %rd4, [memcpy_param_1];
        cvta.to.global.u64      %rd5, %rd4;
        cvta.to.global.u64      %rd6, %rd3;
        shl.b64         %rd11, %rd10, 2;
        add.s64         %rd1, %rd6, %rd11;
        add.s64         %rd2, %rd5, %rd11;
        ld.global.u32   %r4, [%rd2];
        st.global.u32   [%rd1], %r4;
LBB0_2:
        ret;
}
```

- Run it on the GPU

``` rust
// `kernel.ptx` is the `*.s` file we got in the previous step
const KERNEL: &'static str = include_str!("kernel.ptx");

driver::initialize()?;

let device = Device(0)?;
let ctx = device.create_context()?;
let module = ctx.load_module(KERNEL)?;
let kernel = module.function("memcpy")?;

let h_a: Vec<f32> = /* create some random data */;
let h_b = vec![0.; N];

let d_a = driver::allocate(bytes)?;
let d_b = driver::allocate(bytes)?;

// Copy from host to GPU
driver::copy(h_a, d_a)?;

// Run `memcpy` on the GPU
kernel.launch(d_b, d_a, N)?;

// Copy from GPU to host
driver::copy(d_b, h_b)?;

// Verify
assert_eq!(h_a, h_b);

// `d_a`, `d_b`, `h_a`, `h_b` are dropped/freed here
```

---

cc @alexcrichton @brson @rkruppe

> What has changed since #34195?

- `core` now can be compiled into PTX. Which makes it very easy to turn `no_std`
  crates into "kernels" with the help of Xargo.

- There's now a way, the `"ptx-kernel"` ABI, to generate "global" functions. The
  old PR required a manual step (it was hack) to "convert" "device" functions
  into "global" functions. (Only "global" functions can be launched by the host)

- Everything is unstable. There are not "insta stable" built-in targets this
  time (\*). The users have to use a custom target to experiment with this
  feature. Also, PTX instrinsics, like `__syncthreads` and `blockIdx.x`, are now
  implemented as `"platform-intrinsics"` so they no longer live in the `core`
  crate.

(\*) I'd actually like to have in-tree targets because that makes this target
more discoverable, removes the need to lug around .json files, etc.

However, bundling a target with the compiler immediately puts it in the path
towards stabilization. Which gives us just two cycles to find and fix any
problem with the target specification. Afterwards, it becomes hard to tweak
the specification because that could be a breaking change.

A possible solution could be "unstable built-in targets". Basically, to use an
unstable target, you'll have to also pass `-Z unstable-options` to the compiler.
And unstable targets, being unstable, wouldn't be available on stable.

> Why should this be merged?

- To let people experiment with the feature out of tree. Having easy access to
  the feature (in every nightly) allows this. I also think that, as it is, it
  should be possible to start prototyping type-safe single source support using
  build scripts, macros and/or plugins.

- It's a straightforward implementation. No different that adding support for
  any other architecture.
2016-12-29 17:26:15 -08:00
Nick Cameron
71f161c887 Add a test for #38273
Closes 38273
2016-12-29 13:24:46 +13:00
Nick Cameron
7720cf02e3 Change --crate-type metadata to --emit=metadata 2016-12-29 13:24:45 +13:00
Eduard-Mihai Burtescu
4aae835803 rustc: always print nested nodes where a HIR map is available. 2016-12-28 11:29:20 +02:00
Eduard-Mihai Burtescu
f89856be6c rustc: move function arguments into hir::Body. 2016-12-28 11:29:19 +02:00
Eduard-Mihai Burtescu
e64f64a2fc rustc: separate bodies for static/(associated)const and embedded constants. 2016-12-28 11:27:57 +02:00
Eduard-Mihai Burtescu
864928297d rustc: separate TraitItem from their parent Item, just like ImplItem. 2016-12-28 11:21:45 +02:00
Eduard-Mihai Burtescu
6ebb6fdbee hir: lower ImplicitSelf to resolved Self TyQPath's. 2016-12-28 11:21:45 +02:00
bors
a9ab778815 Auto merge of #38479 - michaelwoerister:extern_mod_ich, r=nikomatsakis
ICH: Fix and test foreign mod hashing.

r? @nikomatsakis
2016-12-28 01:30:31 +00:00
bors
82801b552e Auto merge of #38600 - arielb1:dead-drop, r=eddyb
clear discriminant drop flag at the bottom of a drop ladder

Fixes #38437.

Beta-nominating because serious I-wrong.

r? @eddyb
2016-12-27 18:25:22 +00:00
bors
9351c2c8e7 Auto merge of #38537 - jseyfried:fix_rustdoc_ice, r=nrc
Fix ICE in rustdoc

Fixes #38237.
r? @nrc
2016-12-27 05:07:34 +00:00
bors
c2ee32ab45 Auto merge of #38507 - alexcrichton:travis-lldb, r=brson
travis: Update the OSX image we run tests in

The current image is `xcode7.3`, Travis's current default. Unfortunately this
has a version of LLDB which doesn't support debuginfo-lldb tests (see #32520),
so we're not running LLDB tests on Travis yet.

This switches us to the newest image from Travis, `xcode8.2`, which should have
a newer version of LLDB we can run tests against.
2016-12-27 02:18:20 +00:00
Jorge Aparicio
aac5ff7664 fix ui test 2016-12-26 21:06:23 -05:00
Alex Crichton
dad0076569 travis: Update the OSX image we run tests in
The current image is `xcode7.3`, Travis's current default. Unfortunately this
has a version of LLDB which doesn't support debuginfo-lldb tests (see #32520),
so we're not running LLDB tests on Travis yet.

This switches us to the newest image from Travis, `xcode8.2`, which should have
a newer version of LLDB we can run tests against.
2016-12-26 15:40:22 -08:00
bors
f536d90c78 Auto merge of #38542 - YaLTeR:fastcall-fix, r=pnkfelix
Fix fastcall not applying inreg attributes to arguments

Fixes https://github.com/rust-lang/rust/issues/18086
2016-12-26 17:23:42 +00:00
Vadim Petrochenkov
09aba18e10 More systematic error reporting in path resolution 2016-12-26 15:01:49 +03:00
Vadim Petrochenkov
3fb676afb0 Move some compile-fail tests into UI directory 2016-12-26 12:58:33 +03:00
bors
20b6c160ca Auto merge of #38539 - jseyfried:fix_resolve_hang, r=eddyb
resolve: fix non-termination

Fixes #34324.
r? @eddyb
2016-12-25 18:13:54 +00:00
Ariel Ben-Yehuda
521b2eaf7b clear discriminant drop flag at the bottom of a ladder
Fixes #38437.
2016-12-25 18:44:19 +02:00
bors
c74ac6cb97 Auto merge of #38566 - jseyfried:fix_import_resolution_bug, r=eddyb
Fix bug in import resolution

Fixes #38535 and fixes #38556.
r? @nrc
2016-12-25 13:14:12 +00:00
Steve Klabnik
8836a9da72 Rollup merge of #38557 - michaelwoerister:inline-asm-ich, r=nikomatsakis
incr. comp.: Improve InlineAsm hashing and add test case

r? @nikomatsakis
2016-12-24 14:29:32 -05:00
Steve Klabnik
0f0e74e0ec Rollup merge of #38502 - michaelwoerister:ich-test-inherent-impls, r=nikomatsakis
ICH: Add test cases for inherent impls.

r? @nikomatsakis
2016-12-24 14:29:25 -05:00
bors
8d65c8d64e Auto merge of #38268 - withoutboats:parse_where_higher_rank_hack, r=eddyb
Prevent where < ident > from parsing.

In order to be forward compatible with `where<'a>` syntax for higher
rank parameters, prevent potential conflicts with UFCS from parsing
correctly for the near term.
2016-12-24 00:22:00 +00:00
bors
4d07320d01 Auto merge of #38523 - camlorn:disable_field_reordering, r=nikomatsakis
Disable field reordering

This was decided via IRC and needs a backport to beta.  Basically, #37429 broke servo, and probably needs an announcement and opt-in flag.  I didn't run all tests locally but think I've already reverted all the ones that need to be reverted.

r? @nikomatsakis
2016-12-23 21:36:59 +00:00
bors
467a7f049b Auto merge of #38533 - jseyfried:legacy_custom_derive_deprecation, r=nrc
Allow legacy custom derive authors to disable warnings in downstream crates

This PR allows legacy custom derive authors to use a pre-deprecated method `registry.register_custom_derive()` instead of `registry.register_syntax_extension()` to avoid downstream deprecation warnings.

r? @nrc
2016-12-23 18:43:12 +00:00
bors
82611a0224 Auto merge of #38232 - jseyfried:refactor_global_paths, r=nrc
Refactor global paths

This PR removes the field `global: bool` from `ast::Path` and `hir::Path`, instead representing a global path `::foo::bar` as `{{root}}::foo::bar`, where `{{root}}` is a virtual keyword `keywords::CrateRoot`.

Also, fixes #38016.

r? @nrc
2016-12-23 06:22:45 +00:00
Jeffrey Seyfried
c12fc66a9d Allow legacy custom derive authors to disable warnings in downstream crates. 2016-12-23 05:49:34 +00:00
Jeffrey Seyfried
1187e2173c Fix rustdoc ICE. 2016-12-23 02:32:53 +00:00
Jeffrey Seyfried
31d9cc3833 Fix import resolution bug and fold all idents in the AST. 2016-12-23 02:16:31 +00:00
bors
a173778d1d Auto merge of #38330 - ollie27:rustdoc_short_summaries, r=steveklabnik
rustdoc: Fix short summaries in search results

They should be run through a Markdown renderer in rustdoc to remove
links.

This also fixes the mouse over text for the Crates list on the sidebar.

[before](https://doc.rust-lang.org/nightly/std/index.html?search=ord) [after](https://ollie27.github.io/rust_doc_test/std/index.html?search=ord)
2016-12-22 22:28:41 +00:00
Michael Woerister
6a51d37490 ICH: Add test case for InlineAsm hashes. 2016-12-22 14:27:53 -05:00
Ivan Molodetskikh
5e2cea9a4e
Cleaned up the code and added tests. 2016-12-22 14:54:42 +03:00
Jeffrey Seyfried
098c9b69e1 Fix non-termination in resolve. 2016-12-22 08:58:10 +00:00
Jeffrey Seyfried
8a1acb2c69 Pretty-print $crate::foo::bar as ::foo::bar. 2016-12-22 06:14:36 +00:00
Jeffrey Seyfried
f10f50b426 Refactor how global paths are represented (for both ast and hir). 2016-12-22 06:14:35 +00:00
Austin Hicks
b6b630a490 Disable field reordering 2016-12-21 19:57:29 -05:00
Michael Woerister
d0c61ebc12 ICH: Add test case for extern mods. 2016-12-21 17:16:28 -05:00
bors
1b38776c1f Auto merge of #38302 - Mark-Simulacrum:trans-cleanup, r=eddyb
Cleanup old trans

This is a cleanup of old trans, with the following main points:
 - Remove the `build.rs` API (prefer using `Builder` directly, which is now passed where needed through `BlockAndBuilder`).
 - Remove `Block` (inlining it into `BlockAndBuilder`)
 - Remove `Callee::call`, primarily through inlining and simplification of code.
 - Thinned `FunctionContext`:
   - `mir`, `debug_scopes`, `scopes`, and `fn_ty` are moved to `MirContext`.
   - `param_env` is moved to `SharedCrateContext` and renamed to `empty_param_env`.
   - `llretslotptr` is removed, replaced with more careful management of the return values in calls.
   - `landingpad_alloca` is inlined into cleanup.
   - `param_substs` are moved to `MirContext`.
   - `span` is removed, it was never set to anything but `None`.
   - `block_arena` and `lpad_arena` are removed, since neither was necessary (landing pads and block are quite small, and neither needs arena allocation).
 - Fixed `drop_in_place` not running other destructors in the same function.

Fixes #35566 (thanks to @est31 for confirming).
2016-12-21 10:38:22 +00:00
bors
439c3128d7 Auto merge of #38099 - GuillaumeGomez:cast_suggestions, r=nikomatsakis
Cast suggestions

r? @nikomatsakis
2016-12-21 07:28:16 +00:00
Mark Simulacrum
6e3d8cda2c Fix and cleanup callee shims 2016-12-20 20:03:34 -07:00
Michael Woerister
78f630f854 ICH: Add test cases for inherent impls. 2016-12-20 16:54:22 -05:00
Alex Crichton
0cf7d5dcae Merge branch 'rfc_1560_warning_cycle' of https://github.com/jseyfried/rust into rollup
Conflicts:
	src/librustc_resolve/lib.rs
	src/librustc_resolve/resolve_imports.rs
2016-12-20 13:00:16 -08:00
Alex Crichton
10bb4a4801 Rollup merge of #38486 - est31:master, r=petrochenkov
Add regression test for #38458

Closes #38458
2016-12-20 12:59:11 -08:00