Stream: general

Topic: exercising alignment


Jake Goulding (Jun 28 2019 at 13:57, on Zulip):

I have a type that probably needs to specify #[repr(align(…))], based on a bug report I got. This type is inside another type. How can I write some code that pushes this type around in memory to attempt to reproduce the error?

Jake Goulding (Jun 28 2019 at 13:59, on Zulip):
struct XxCore {
    v1: u64,
    v2: u64,
    v3: u64,
    v4: u64,
}

pub struct XxHash64 {
    total_len: u64,
    seed: u64,
    core: XxCore,
    buffer: Buffer,
}

struct Buffer {
    data: [u8; CHUNK_SIZE],
    len: usize,
}
Jake Goulding (Jun 28 2019 at 13:59, on Zulip):

Namely, I’m pretty sure that Buffer needs to be aligned for a u64

Jake Goulding (Jun 28 2019 at 14:00, on Zulip):

on a 32-bit machine, the usize is a u32 and I don’t get it “for free”

Jake Goulding (Jun 28 2019 at 14:02, on Zulip):

When testing similar things, I’ve just allocated a big buffer of u8 and added byte-by-byte offsets. Can I do a similar thing here? Except I need to control where a field is inside of another type, and XxHash64 is presumably already aligned to a bigger value.

RalfJ (Jun 28 2019 at 14:26, on Zulip):

I don't understand what you are trying to do. Do you want to test the alignment by casting the pointer to int at run-time and doing % 8? What are "byte-by-byte offsets"?

You can run it in Miri though, it will check alignment on every access... ;)

Jake Goulding (Jun 28 2019 at 14:36, on Zulip):

the problem is that I treat Buffer::data as a &[u64] and assume that there will never be unaligned data at the beginning. This should be fine once I add repr[align(…)] to Buffer, however it hasn’t tripped up any of my test assertions yet.

Jake Goulding (Jun 28 2019 at 14:36, on Zulip):

I want to write a test

Jake Goulding (Jun 28 2019 at 14:37, on Zulip):

I have similar tests for &[u8], where I make up a random Vec<u8> and a usize and index into it, producing a variety of starting offsets

Jake Goulding (Jun 28 2019 at 14:38, on Zulip):

My “ideal” case would be to somehow perturbate the layout created for XxHash so that the buffer field sometimes is placed on a non-64-bit point.

Jake Goulding (Jun 28 2019 at 14:41, on Zulip):

I think my practical workaround will be to make a Vec<u8> that is much larger than Buffer, then ptr::write a Buffer into the vec at multiples of align_of::Buffer, ensuring my debug assertions don’t get tripped

nagisa (Jun 28 2019 at 14:48, on Zulip):

#[repr(Rust)] will reorder fields to ensure that there is as little padding left inside as possible

nagisa (Jun 28 2019 at 14:49, on Zulip):

but the alignment of XxCore will still stay 64-bits in this case regardless, so you cannot just place it on non-64-bit boundary

nagisa (Jun 28 2019 at 14:50, on Zulip):

/me decides they don’t get what the intention here is either

Jake Goulding (Jun 28 2019 at 14:53, on Zulip):

I apologize for my poor description. This seems “obvious” to me so I’m having trouble figuring how to say it differently

Jake Goulding (Jun 28 2019 at 14:54, on Zulip):

This assertion is being hit by a user of my crate who is running on 32-bit x86

Jake Goulding (Jun 28 2019 at 14:55, on Zulip):

which makes perfect sense: the Buffer struct will be aligned to 32-bit (due to the usize), which means it might not be aligned to 64-bit

Jake Goulding (Jun 28 2019 at 14:56, on Zulip):

I want to write a test that will fail (cause the assertion to be triggered)

Jake Goulding (Jun 28 2019 at 14:58, on Zulip):

So that I can see the assertion failure and then see it go away when I add the right repr(align)

Jake Goulding (Jun 28 2019 at 14:59, on Zulip):

My current attempt is something like

    #[test]
    fn x() {
        use std::{mem, ptr};
        use super::{Buffer, CHUNK_SIZE};


        let buffer_size = mem::size_of::<Buffer>();
        let buffer_align = mem::align_of::<Buffer>();

    // Make it big enough that we don't need to worry
        let mut scratch: Vec<u8> = std::vec![0; buffer_size * 10];

        let alignment_steps = buffer_size / buffer_align;

        std::dbg!(buffer_size, buffer_align);
        for step in 0..=alignment_steps {
            // Buffer must be plain data, no `Drop` implementation
            unsafe {
                let start = scratch.as_mut_ptr().add(step * buffer_align);
        let start = start as *mut Buffer;
                ptr::write(start, Buffer::default());

                let buf = &mut *start;
                buf.len = CHUNK_SIZE;

                // Exercise assertions in method
                let _ = buf.as_u64_arrays();
            }
        }
        panic!();
    }
nagisa (Jun 28 2019 at 15:44, on Zulip):

@Jake Goulding the only thing you can do is to compare align_of .

nagisa (Jun 28 2019 at 15:44, on Zulip):

I think.

Jake Goulding (Jun 28 2019 at 15:46, on Zulip):

@nagisa you mean assert_eq(align_of::<u64>(), align_of::<Buffer>())?

nagisa (Jun 28 2019 at 15:47, on Zulip):

Yeah.

nagisa (Jun 28 2019 at 15:47, on Zulip):

fwiw you need your align(64) to be on Buffer::data, not the Buffer itself

Jake Goulding (Jun 28 2019 at 15:48, on Zulip):

you can put it on members? All the docs imply it needs to go on types

nagisa (Jun 28 2019 at 15:48, on Zulip):

You’ll need to wrap your array into another struct.

Jake Goulding (Jun 28 2019 at 15:49, on Zulip):

align(64) or align(8)? I haven’t yet found a definitive unit listed anywhere

nagisa (Jun 28 2019 at 15:49, on Zulip):

don’t remember either.

nagisa (Jun 28 2019 at 15:49, on Zulip):

if you have multiple fields in your struct, those fields can still be re-ordered, reducing the effective alignment of your array to a lower value

nagisa (Jun 28 2019 at 15:50, on Zulip):

Another good test you can do is to assert that your field has offset that matches your alignment requirements

Jake Goulding (Jun 28 2019 at 15:55, on Zulip):

The trouble I’m having with the last one is that ultimately I can’t test the offset effectively because the compiler can do whatever it wants

Jake Goulding (Jun 28 2019 at 15:56, on Zulip):

For example, in the “real” case, there’s a huge stack of types: hashbrown::HashMap<String, InnerStmt, BuildHasherDefault<XxHash64>>

Jake Goulding (Jun 28 2019 at 15:57, on Zulip):

And I think that’s the one and only use of it in the program, so the compiler can do whatever fun optimization it wants to

RalfJ (Jun 28 2019 at 16:20, on Zulip):

trying to rephrase what you are doing with your "testing at offsets", you basically have a closure fn test_me(x: &mut Buffer) that you want to call with Buffer being as unaligned as possible?

RalfJ (Jun 28 2019 at 16:21, on Zulip):

or maybe more like fn test_me(place_for_buf: *mut Buffer) because you need to initialize yourself or so

RalfJ (Jun 28 2019 at 16:23, on Zulip):

that sounds reasonable, if you get a memory area of size size_of<Buffer>() + 32 and then try all offsets less than 32 such that that you actually get a sufficiently aligned pointer, you should cover all your cases

RalfJ (Jun 28 2019 at 16:23, on Zulip):

why would that not find such a problem? of course this only works if you can control allocation

Jake Goulding (Jun 28 2019 at 19:37, on Zulip):

Yes, I think it’s accurate to say “I want to test placing my Buffer at every alignment it is valid for”, but as @nagisa points out, that’s not even enough.

Jake Goulding (Jun 28 2019 at 19:38, on Zulip):

Because what I really need to ensure is that a specific field of my Buffer will always be aligned correctly; I don’t care about the alignment of Buffer itself.

nagisa (Jun 28 2019 at 19:39, on Zulip):

@Jake Goulding yeah, that involves just checking that align_of::<Buffer>() >= whatyouwant and field_offset!(Buffer, data) % WHATYOUWANT == 0.

Jake Goulding (Jun 28 2019 at 19:40, on Zulip):

The problem is that there’s no test I can write that says “for every possible way that repr(Rust) might organize this struct”

nagisa (Jun 28 2019 at 19:40, on Zulip):

(field_offset! is not a built-in, you’ll need to figure out the incantation yourself or find a crate that does this)

Jake Goulding (Jun 28 2019 at 19:40, on Zulip):

@nagisa this is where @RalfJ tells me that no implementation of offsetof is non-UB because we can’t get raw pointers to fields without making references :wink:

nagisa (Jun 28 2019 at 19:40, on Zulip):

well, yeah, you cannot do that.

nagisa (Jun 28 2019 at 19:41, on Zulip):

@Jake Goulding well… I don’t see how let x = Buffer { ... }; let fieldptr = &x.data as *const _ is gonna be bad.

nagisa (Jun 28 2019 at 19:42, on Zulip):

that is, you don’t really need offsetof, you just need to get an offset of a field in some instantiation of type, which should be fine.

Jake Goulding (Jun 28 2019 at 19:42, on Zulip):

@nagisa which means I cannot directly write a test that triggers the same problem as my user; I can only test indirect things. At this point, wrapping it in a #[repr(align(8)] struct AlignToU64<T>(T) is equally as descriptive as this second-class of test.

Jake Goulding (Jun 28 2019 at 19:44, on Zulip):

That’s what really annoys me here; I cannot reproduce their bug, even though I have really strong suspicions.

RalfJ (Jun 28 2019 at 20:03, on Zulip):

The problem is that there’s no test I can write that says “for every possible way that repr(Rust) might organize this struct”

yeah that you cannot do :/

RalfJ (Jun 28 2019 at 20:03, on Zulip):

and Miri cannot, either

RalfJ (Jun 28 2019 at 20:04, on Zulip):

though given that your user ran into this, it seems like Rust actually does lay out out to your disadvantage, so running miri with --target <32bit target> should run into the problem?

nagisa (Jun 28 2019 at 20:07, on Zulip):

(We wanted to introduce randomization of field ordering as an option IIRC to weed out assumptions about order fields are laid out)

Jake Goulding (Jun 28 2019 at 20:09, on Zulip):

Is it likely that Linux and Windows would lay out a struct differently (both 32-bit x86 and the same version of the compiler)

Jake Goulding (Jun 28 2019 at 20:10, on Zulip):

They see this in multiple x86 Windows builds, but I cant see it in Linux

nagisa (Jun 28 2019 at 20:30, on Zulip):

It is possible, Windows' ABI mandates different alignments and stuff for various things.

RalfJ (Jun 28 2019 at 20:31, on Zulip):

ah dang... cross-miri is a thing on my list but currently it is hard due to https://github.com/rust-lang/rust/issues/56443

RalfJ (Jun 28 2019 at 20:32, on Zulip):

in principle miri is entirely cross-capable, there's just some silly build system stuff in the way

RalfJ (Jun 28 2019 at 20:32, on Zulip):

for the libstd

RalfJ (Jun 28 2019 at 20:32, on Zulip):

it insists on building an artifact we dont want nor need, and needs a target toolchain just for that :(

Jake Goulding (Jun 29 2019 at 13:13, on Zulip):

For those interested, I can reproduce, but so far only by running on a 32-bit Windows machine, running their full test suite which requires MySQL

Last update: Nov 21 2019 at 23:40UTC