Stream: t-compiler

Topic: repr(simd) struct([T; N])


gnzlbg (Aug 07 2019 at 16:25, on Zulip):

ok so eval_usize appears to be a bad idea

gnzlbg (Aug 07 2019 at 16:25, on Zulip):

for some reason, the layout of a struct S<const N: usize>([f32; N]); is required before monomorphization

gnzlbg (Aug 07 2019 at 16:25, on Zulip):

e.g.

gnzlbg (Aug 07 2019 at 16:25, on Zulip):

a simple lib crate that exposes a pub S fails

oli (Aug 07 2019 at 16:39, on Zulip):

Told you ^^

oli (Aug 07 2019 at 16:39, on Zulip):

Can you backtrace the place where it's invoked before llvm?

gnzlbg (Aug 07 2019 at 16:45, on Zulip):

i'm making some progress with debugging

gnzlbg (Aug 07 2019 at 16:46, on Zulip):

i am going to make the simd_size function of sty take a ParamEnv environment

gnzlbg (Aug 07 2019 at 16:46, on Zulip):

and then do what is done for arrays in Layout

gnzlbg (Aug 07 2019 at 16:46, on Zulip):
if count.has_projections() {
    count = tcx.normalize_erasing_regions(param_env, count);
    if count.has_projections() {
        return Err(LayoutError::Unknown(ty));
    }
}

let count = count.try_eval_usize(tcx, param_env).ok_or(LayoutError::Unknown(ty))?;
oli (Aug 07 2019 at 16:49, on Zulip):

Yea that seems reasonable
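
The shape of that snippet (evaluate the lane count if it is already known, otherwise bail out with a layout error) can be sketched outside the compiler. Everything below is a hypothetical stand-in: `Count`, `LayoutError` and `simd_byte_size` are toy types and functions invented for illustration, not rustc's, and only the `ok_or(...)?` pattern is taken from the snippet above.

```rust
/// Hypothetical stand-in for an array length that may still be generic.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Count {
    /// Length already known, e.g. the `4` in `S<4>`.
    Known(usize),
    /// Length is still a const parameter, e.g. the `N` in `S<N>`.
    Param,
}

/// Toy error type; the real `LayoutError::Unknown` carries the type.
#[derive(Debug, PartialEq)]
pub enum LayoutError {
    Unknown,
}

impl Count {
    /// Same shape as `try_eval_usize`: `None` while the count is generic.
    pub fn try_eval_usize(self) -> Option<usize> {
        match self {
            Count::Known(n) => Some(n),
            Count::Param => None,
        }
    }
}

/// Byte size of a vector of `count` `f32` lanes, or an error if the
/// count cannot be evaluated yet.
pub fn simd_byte_size(count: Count) -> Result<usize, LayoutError> {
    // Turn the `Option` into a `Result` and propagate the error with `?`.
    let lanes = count.try_eval_usize().ok_or(LayoutError::Unknown)?;
    Ok(lanes * std::mem::size_of::<f32>())
}
```

With these toy types, `simd_byte_size(Count::Known(4))` yields `Ok(16)`, while `simd_byte_size(Count::Param)` reports `Err(LayoutError::Unknown)` instead of panicking, which is the behaviour wanted when a layout is requested before monomorphization.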

gnzlbg (Aug 08 2019 at 11:33, on Zulip):

so i'm stuck now

gnzlbg (Aug 08 2019 at 11:33, on Zulip):

i solved a couple of issues

gnzlbg (Aug 08 2019 at 11:33, on Zulip):

but I'm getting incorrect code generation in a couple of places

gnzlbg (Aug 08 2019 at 11:34, on Zulip):

the llvm-ir that i'm seeing is completely wrong, i'm gonna have to learn how to read mir

gnzlbg (Aug 08 2019 at 11:36, on Zulip):

that's the mir for the following program (https://gist.github.com/gnzlbg/479ba3ee6a563af17e0d517de21e241b):

// run-pass
#![allow(non_camel_case_types, incomplete_features)]
#![feature(repr_simd, platform_intrinsics, const_generics)]

use std::ops;

#[repr(simd)]
#[derive(Copy, Clone)]
struct S<const N: usize>([f32; N]);


extern "platform-intrinsic" {
    fn simd_add<T>(x: T, y: T) -> T;
    fn simd_extract<T, E>(v: T, idx: u32) -> E;
}

fn add<T: ops::Add<Output=T>>(lhs: T, rhs: T) -> T {
    lhs + rhs
}

impl ops::Add for S<4> {
    type Output = Self;

    fn add(self, rhs: Self) -> Self {
        unsafe {simd_add(self, rhs)}
    }
}


pub fn main() { unsafe {
    let lr2 = S::<4>([1.0f32, 2.0f32, 3.0f32, 4.0f32]);
    let a = add(lr2, lr2);
    let x: f32 = simd_extract(a, 0);
    let y: f32 = simd_extract(a, 1);
    assert_eq!(x, 2.0f32);
    assert_eq!(y, 4.0f32);
}}
gnzlbg (Aug 08 2019 at 11:43, on Zulip):

@oli can it be that this is being evaluated at compile-time ?

gnzlbg (Aug 08 2019 at 11:43, on Zulip):

and constant evaluation is doing something wrong ?

gnzlbg (Aug 08 2019 at 11:55, on Zulip):

For this:

#[inline(never)]
pub fn add_pair() -> (f32, f32) { unsafe {
    let lr2 = S::<4>([1.0f32, 2.0f32, 3.0f32, 4.0f32]);
    let a = simd_add(lr2, lr2);
    let x: f32 = simd_extract(a, 0);
    let y: f32 = simd_extract(a, 1);
    (x, y)
}}

I get the following MIR:

fn  add_pair() -> (f32, f32) {
    let mut _0: (f32, f32);              // return place in scope 0 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:18:22: 18:32
    let mut _2: [f32; 4];                // in scope 0 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:19:22: 19:54
    let mut _4: S<4usize>;               // in scope 0 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:20:22: 20:25
    let mut _5: S<4usize>;               // in scope 0 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:20:27: 20:30
    let mut _7: S<4usize>;               // in scope 0 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:21:31: 21:32
    let mut _9: S<4usize>;               // in scope 0 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:22:31: 22:32
    let mut _10: f32;                    // in scope 0 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:23:6: 23:7
    let mut _11: f32;                    // in scope 0 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:23:9: 23:10
    scope 1 {
        let _1: S<4usize>;               // "lr2" in scope 1 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:19:9: 19:12
        scope 2 {
            let _3: S<4usize>;           // "a" in scope 2 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:20:9: 20:10
            scope 3 {
                let _6: f32 as UserTypeProjection { base: UserType(1), projs: [] }; // "x" in scope 3 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:21:9: 21:10
                scope 4 {
                    let _8: f32 as UserTypeProjection { base: UserType(3), projs: [] }; // "y" in scope 4 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:22:9: 22:10
                    scope 5 {
                    }
                }
            }
        }
    }

    bb0: {
        StorageLive(_1);                 // bb0[0]: scope 1 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:19:9: 19:12
        StorageLive(_2);                 // bb0[1]: scope 1 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:19:22: 19:54
        _2 = [const 1f32, const 2f32, const 3f32, const 4f32]; // bb0[2]: scope 1 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:19:22: 19:54
                                         // ty::Const
                                         // + ty: f32
                                         // + val: Scalar(0x3f800000)
                                         // mir::Constant
                                         // + span: /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:19:23: 19:29
                                         // + ty: f32
                                         // + literal: Const { ty: f32, val: Scalar(0x3f800000) }
                                         // ty::Const
                                         // + ty: f32
                                         // + val: Scalar(0x40000000)
                                         // mir::Constant
                                         // + span: /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:19:31: 19:37
                                         // + ty: f32
                                         // + literal: Const { ty: f32, val: Scalar(0x40000000) }
                                         // ty::Const
                                         // + ty: f32
                                         // + val: Scalar(0x40400000)
                                         // mir::Constant
                                         // + span: /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:19:39: 19:45
                                         // + ty: f32
                                         // + literal: Const { ty: f32, val: Scalar(0x40400000) }
                                         // ty::Const
                                         // + ty: f32
                                         // + val: Scalar(0x40800000)
                                         // mir::Constant
                                         // + span: /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:19:47: 19:53
                                         // + ty: f32
                                         // + literal: Const { ty: f32, val: Scalar(0x40800000) }
        (_1.0: [f32; 4]) = move _2;      // bb0[3]: scope 1 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:19:15: 19:55
        StorageDead(_2);                 // bb0[4]: scope 1 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:19:54: 19:55
        StorageLive(_3);                 // bb0[5]: scope 2 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:20:9: 20:10
        StorageLive(_4);                 // bb0[6]: scope 2 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:20:22: 20:25
        _4 = _1;                         // bb0[7]: scope 2 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:20:22: 20:25
        StorageLive(_5);                 // bb0[8]: scope 2 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:20:27: 20:30
        _5 = _1;                         // bb0[9]: scope 2 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:20:27: 20:30
        _3 = const simd_add::<S<4usize>>(move _4, move _5) -> bb1; // bb0[10]: scope 2 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:20:13: 20:31
                                         // ty::Const
                                         // + ty: unsafe extern "platform-intrinsic" fn(S<4usize>, S<4usize>) -> S<4usize> {simd_add::<S<4usize>>}
                                         // + val: Scalar(<ZST>)
                                         // mir::Constant
                                         // + span: /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:20:13: 20:21
                                         // + ty: unsafe extern "platform-intrinsic" fn(S<4usize>, S<4usize>) -> S<4usize> {simd_add::<S<4usize>>}
                                         // + literal: Const { ty: unsafe extern "platform-intrinsic" fn(S<4usize>, S<4usize>) -> S<4usize> {simd_add::<S<4usize>>}, val: Scalar(<ZST>) }
    }

    bb1: {
        StorageDead(_5);                 // bb1[0]: scope 2 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:20:30: 20:31
        StorageDead(_4);                 // bb1[1]: scope 2 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:20:30: 20:31
        StorageLive(_6);                 // bb1[2]: scope 3 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:21:9: 21:10
        StorageLive(_7);                 // bb1[3]: scope 3 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:21:31: 21:32
        _7 = _3;                         // bb1[4]: scope 3 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:21:31: 21:32
        _6 = const simd_extract::<S<4usize>, f32>(move _7, const 0u32) -> bb2; // bb1[5]: scope 3 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:21:18: 21:36
                                         // ty::Const
                                         // + ty: unsafe extern "platform-intrinsic" fn(S<4usize>, u32) -> f32 {simd_extract::<S<4usize>, f32>}
                                         // + val: Scalar(<ZST>)
                                         // mir::Constant
                                         // + span: /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:21:18: 21:30
                                         // + ty: unsafe extern "platform-intrinsic" fn(S<4usize>, u32) -> f32 {simd_extract::<S<4usize>, f32>}
                                         // + literal: Const { ty: unsafe extern "platform-intrinsic" fn(S<4usize>, u32) -> f32 {simd_extract::<S<4usize>, f32>}, val: Scalar(<ZST>) }
                                         // ty::Const
                                         // + ty: u32
                                         // + val: Scalar(0x00000000)
                                         // mir::Constant
                                         // + span: /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:21:34: 21:35
                                         // + ty: u32
                                         // + literal: Const { ty: u32, val: Scalar(0x00000000) }
    }

    bb2: {
        StorageDead(_7);                 // bb2[0]: scope 3 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:21:35: 21:36
        StorageLive(_8);                 // bb2[1]: scope 4 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:22:9: 22:10
        StorageLive(_9);                 // bb2[2]: scope 4 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:22:31: 22:32
        _9 = _3;                         // bb2[3]: scope 4 at /Users/gnzlbg/projects/sideprojects/rust/src/test/ui/simd/simd-generics.rs:22:31: 22:32
        _8 = const simd_extract::<S<4usize>, f32>(move _9, const 1u32) -> bb3; // bb2[4]: scope 4 a
gnzlbg (Aug 08 2019 at 11:57, on Zulip):

What's this const simd_add::<S<4usize>>(move _4, move _5) doing ? Is this evaluating simd_add at compile-time ?

gnzlbg (Aug 08 2019 at 11:59, on Zulip):

cc @eddyb ^^^

gnzlbg (Aug 08 2019 at 12:10, on Zulip):

Smaller example, the LLVM-IR generated is broken: https://gcc.godbolt.org/z/3nuRpN

Wesley Wiser (Aug 08 2019 at 12:10, on Zulip):

What's this const simd_add::<S<4usize>>(move _4, move _5) doing ? Is this evaluating simd_add at compile-time ?

I'm pretty sure that just means that function call is to a ConstVal which is a function pointer to simd_add. It doesn't mean that the function call is being evaluated at compile time.
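
A small sketch of that point in ordinary Rust, assuming nothing beyond the standard library: naming a function produces a constant value of a zero-sized function item type (compare the `val: Scalar(<ZST>)` lines in the MIR dump), and the call through that value still happens at run time. The function names here are made up for the example.

```rust
use std::mem::size_of_val;

/// An ordinary function; its arguments are only known at run time.
pub fn add(x: f32, y: f32) -> f32 {
    x + y
}

/// Size of the function item value `add`: the callee is a constant,
/// and that constant occupies no bytes.
pub fn fn_item_size() -> usize {
    let f = add; // `f` has the unique, zero-sized type of `add`
    size_of_val(&f)
}

/// Size of the same function once coerced to a function pointer.
pub fn fn_pointer_size() -> usize {
    let p: fn(f32, f32) -> f32 = add;
    size_of_val(&p)
}

/// Calls through the constant callee with run-time arguments.
pub fn call_with(x: f32, y: f32) -> f32 {
    let f = add;
    f(x, y)
}
```

`fn_item_size()` returns 0 and `fn_pointer_size()` returns the size of a pointer, so a `const` in front of a callee in MIR says which function is called, not that the call is evaluated at compile time.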

gnzlbg (Aug 08 2019 at 12:13, on Zulip):

found the issue, the memcpy in the last example is incorrect

gnzlbg (Aug 08 2019 at 12:14, on Zulip):

for some reason, when copying the [f32; 4] array into the <4 x f32> vector, we call memcpy with a length of 4 bytes

gnzlbg (Aug 08 2019 at 12:14, on Zulip):
call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 16 %5, i8* align 4 %6, i64 4, i1 false)
gnzlbg (Aug 08 2019 at 12:14, on Zulip):

we should call it with a length of 4 * sizeof(f32) == 16 bytes
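
To make the effect of the wrong length concrete, here is a minimal sketch in stable Rust. `copy_prefix` is a hypothetical helper written for this example; it copies only the first `nbytes` bytes of a `[f32; 4]` into a zeroed array, the way a `memcpy` with that length would.

```rust
use std::mem::size_of;
use std::ptr;

/// Copies the first `nbytes` bytes of `src` into a zeroed `[f32; 4]`.
pub fn copy_prefix(src: &[f32; 4], nbytes: usize) -> [f32; 4] {
    // Never read or write past the 16 bytes of the arrays.
    assert!(nbytes <= size_of::<[f32; 4]>());
    let mut dst = [0.0f32; 4];
    unsafe {
        // SAFETY: both arrays are valid for `nbytes` bytes (checked above)
        // and `dst` is a fresh local, so the ranges do not overlap.
        ptr::copy_nonoverlapping(
            src.as_ptr() as *const u8,
            dst.as_mut_ptr() as *mut u8,
            nbytes,
        );
    }
    dst
}
```

`copy_prefix(&[1.0, 2.0, 3.0, 4.0], 4)` gives `[1.0, 0.0, 0.0, 0.0]`: a 4-byte copy moves only the first lane. Passing `4 * size_of::<f32>()`, i.e. 16, copies all four lanes.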

gnzlbg (Aug 08 2019 at 12:15, on Zulip):

that fixes the bug, now the question is, where is this coming from

oli (Aug 08 2019 at 13:12, on Zulip):

there's no memcpy in the MIR, is it being generated by the simd intrinsics?

gnzlbg (Aug 08 2019 at 13:27, on Zulip):

@oli no, it is maybe generated by move in the mir ?

gnzlbg (Aug 08 2019 at 13:27, on Zulip):
pub fn build_array(x: [f32; 4]) -> S<4> {  S::<4>(x) }
gnzlbg (Aug 08 2019 at 13:28, on Zulip):

that reproduces the issue

gnzlbg (Aug 08 2019 at 13:28, on Zulip):

the layout of a simd type is not queried anywhere

gnzlbg (Aug 08 2019 at 13:28, on Zulip):

when generating this incorrect LLVM-IR for it:

define void @build_array(<4 x float>* noalias nocapture sret dereferenceable(16), [4 x float]* noalias nocapture dereferenceable(16) %x) unnamed_addr #0 {
start:
  %_2 = alloca [4 x float], align 4
  %1 = bitcast [4 x float]* %_2 to i8*
  call void @llvm.lifetime.start.p0i8(i64 16, i8* %1)
  %2 = bitcast [4 x float]* %_2 to i8*
  %3 = bitcast [4 x float]* %x to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %2, i8* align 4 %3, i64 16, i1 false)
  %4 = bitcast <4 x float>* %0 to float*
  %5 = bitcast float* %4 to i8*
  %6 = bitcast [4 x float]* %_2 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 16 %5, i8* align 4 %6, i64 4, i1 false)
  %7 = bitcast [4 x float]* %_2 to i8*
  call void @llvm.lifetime.end.p0i8(i64 16, i8* %7)
  ret void
}
gnzlbg (Aug 08 2019 at 13:29, on Zulip):

(note that the second memcpy only copies 4 bytes into the vector, for whatever reason)
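
The `align 16` on the destination and `align 4` on the source of that second memcpy reflect the different alignments of the vector and the array. A rough way to see the same numbers on stable Rust, assuming a common target where `f32` is 4-byte aligned: `V4` below is a hypothetical stand-in that uses `repr(C, align(16))` instead of `repr(simd)`, which needs a nightly compiler.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical stand-in for a 4 x f32 vector: 16 bytes, 16-byte aligned.
#[repr(C, align(16))]
pub struct V4(pub [f32; 4]);

/// (size, alignment) of the plain array in bytes.
pub fn array_layout() -> (usize, usize) {
    (size_of::<[f32; 4]>(), align_of::<[f32; 4]>())
}

/// (size, alignment) of the over-aligned wrapper in bytes.
pub fn vector_layout() -> (usize, usize) {
    (size_of::<V4>(), align_of::<V4>())
}
```

Both types are 16 bytes large; only the alignment differs. The array has the alignment of `f32`, the wrapper is aligned to 16. Neither number explains a copy length of 4.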

gnzlbg (Aug 08 2019 at 13:32, on Zulip):

the mir i get is

fn  build_array(_1: [f32; 4]) -> S<4usize> {
    let mut _0: S<4usize>;               // return place in scope 0 at /Users/gnzlbg/projects/sideprojects/rust/src/test/codegen/simd-intrinsic/simd-intrinsic-transmute-array.rs:14:36: 14:40
    let mut _2: [f32; 4];                // in scope 0 at /Users/gnzlbg/projects/sideprojects/rust/src/test/codegen/simd-intrinsic/simd-intrinsic-transmute-array.rs:16:12: 16:13

    bb0: {
        StorageLive(_2);                 // bb0[0]: scope 0 at /Users/gnzlbg/projects/sideprojects/rust/src/test/codegen/simd-intrinsic/simd-intrinsic-transmute-array.rs:16:12: 16:13
        _2 = _1;                         // bb0[1]: scope 0 at /Users/gnzlbg/projects/sideprojects/rust/src/test/codegen/simd-intrinsic/simd-intrinsic-transmute-array.rs:16:12: 16:13
        (_0.0: [f32; 4]) = move _2;      // bb0[2]: scope 0 at /Users/gnzlbg/projects/sideprojects/rust/src/test/codegen/simd-intrinsic/simd-intrinsic-transmute-array.rs:16:5: 16:14
        StorageDead(_2);                 // bb0[3]: scope 0 at /Users/gnzlbg/projects/sideprojects/rust/src/test/codegen/simd-intrinsic/simd-intrinsic-transmute-array.rs:16:13: 16:14
        return;                          // bb0[4]: scope 0 at /Users/gnzlbg/projects/sideprojects/rust/src/test/codegen/simd-intrinsic/simd-intrinsic-transmute-array.rs:17:2: 17:2
    }
}
gnzlbg (Aug 08 2019 at 13:32, on Zulip):

So (_0.0: [f32; 4]) = move _2; is what I'd say is generating the memcpy

gnzlbg (Aug 08 2019 at 13:35, on Zulip):

What I don't understand is why the layout of S<4usize> is not computed anywhere

gnzlbg (Aug 08 2019 at 13:36, on Zulip):

I'd suppose that to lower that to LLVM one must compute the layout of S

oli (Aug 08 2019 at 13:38, on Zulip):

how are you checking that it's not being computed?

gnzlbg (Aug 08 2019 at 13:41, on Zulip):

I have an eprintln! in the layout code's match arm for it

gnzlbg (Aug 08 2019 at 13:42, on Zulip):

when building other code it shows, when building this particular test it does not

gnzlbg (Aug 08 2019 at 13:42, on Zulip):

maybe instead of fetching the layout, some code is calling simd_size and simd_type and doing some manual layout computation for this ?

gnzlbg (Aug 08 2019 at 13:43, on Zulip):

I'm trying to figure out where the memcpys are generated so that i can try to trace things back

gnzlbg (Aug 08 2019 at 13:43, on Zulip):

but grepping for "move" isn't super helpful

oli (Aug 08 2019 at 13:43, on Zulip):

well... Operand::Move

gnzlbg (Aug 08 2019 at 16:24, on Zulip):

ok, so the layout was called

gnzlbg (Aug 08 2019 at 16:24, on Zulip):

compiletest without --verbose swallows eprintln statements from rustc, and logs

gnzlbg (Aug 08 2019 at 16:25, on Zulip):

the layout returned is correct

gnzlbg (Aug 08 2019 at 16:26, on Zulip):

I have no idea where the incorrect memcpy size might be coming from, or how to debug it, Operand::Move is quite abstract

gnzlbg (Aug 08 2019 at 16:30, on Zulip):

the code in rustc_codegen_ssa does the obvious thing for Operand::Move

oli (Aug 08 2019 at 17:46, on Zulip):

have you checked whether std::mem::size_of still returns the right thing for your type?
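
A sketch of what such a check could look like, under the assumption that a plain wrapper is an acceptable stand-in: `repr(simd)` needs a nightly compiler, so `Wrapper` below uses `repr(transparent)` and only shares the `struct S<const N: usize>([f32; N])` shape with the type from this thread.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for `S<N>`; laid out exactly like `[f32; N]`.
#[repr(transparent)]
#[derive(Copy, Clone)]
pub struct Wrapper<const N: usize>(pub [f32; N]);

/// Size in bytes of `Wrapper<N>`, only known once `N` is fixed.
pub fn wrapper_size<const N: usize>() -> usize {
    size_of::<Wrapper<N>>()
}

/// Size in bytes the wrapper should have: `N` lanes of `f32`.
pub fn expected_size<const N: usize>() -> usize {
    N * size_of::<f32>()
}
```

For the real type the equivalent check would be comparing `size_of::<S<4>>()` against 16; here `wrapper_size::<4>()` and `expected_size::<4>()` agree.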

gnzlbg (Aug 08 2019 at 18:25, on Zulip):

checking that right now

gnzlbg (Aug 08 2019 at 18:25, on Zulip):

i published the branch here: https://github.com/gnzlbg/rust/tree/array_simd

Last update: Nov 16 2019 at 01:45UTC