Stream: general

Topic: miri question on raw pointer access for alignment


Brian Campbell (Mar 27 2019 at 18:30, on Zulip):

Given the recent release of miri via rustup, I figured I'd try it out on some popular crates to see if I found any issues.

Ran it on regex, and after making some changes to skip quickcheck, use non-randomized hashes, and avoid inline asm and SIMD in memchr, I ran into this. I'm not sure if this indicates actual UB or just something that miri doesn't support. If it's something that miri doesn't support, is there a supported way of doing what it's trying to do, which is to find the first usize-aligned pointer within the haystack:

error[E0080]: constant evaluation error: a raw memory access tried to access part of a pointer value as raw bytes
  --> /home/lambda/.cargo/registry/src/github.com-1ecc6299db9ec823/memchr-2.2.0/src/fallback.rs:64:42
   |
64 |         ptr = ptr_add(ptr, USIZE_BYTES - (start_ptr as usize & align));
   |                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ a raw memory access tried to access part of a pointer value as raw bytes
   |
   = note: inside call to `memchr::fallback::memchr` at /home/lambda/.cargo/registry/src/github.com-1ecc6299db9ec823/memchr-2.2.0/src/lib.rs:153:9
   = note: inside call to `memchr::memchr::imp` at /home/lambda/.cargo/registry/src/github.com-1ecc6299db9ec823/memchr-2.2.0/src/lib.rs:159:9
Jake Goulding (Mar 27 2019 at 23:06, on Zulip):

/cc @RalfJ ^

Jake Goulding (Mar 27 2019 at 23:07, on Zulip):

The text sounds like an actual error, but the code it points to (based on my own knowledge / guesses) seems off

Jake Goulding (Mar 27 2019 at 23:12, on Zulip):

Like, the highlighted expression is not accessing as raw bytes. It may just be poorly worded for me though.

Jake Goulding (Mar 27 2019 at 23:14, on Zulip):

Trying to make a related example:

fn main() {
    let a: u8 = 42;
    let a_ref = &a;
    let a_ptr = a_ref as *const u8;
    let a_aligned = a_ptr as usize % 16;
    unsafe { a_ptr.sub(a_aligned) };
}
error[E0080]: constant evaluation error: a raw memory access tried to access part of a pointer value as raw bytes
 --> src/main.rs:5:21
  |
5 |     let a_aligned = a_ptr as usize % 16;
  |                     ^^^^^^^^^^^^^^^^^^^ a raw memory access tried to access part of a pointer value as raw bytes
  |
  = note: inside call to `main` at /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/rt.rs:64:34
  = note: inside call to closure at /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/rt.rs:52:53
  = note: inside call to closure at /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/panicking.rs:293:40
  = note: inside call to `std::panicking::try::do_call::<[closure@DefId(1/1:1830 ~ std[82ff]::rt[0]::lang_start_internal[0]::{{closure}}[0]) 0:&dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe], i32>` at /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/panicking.rs:289:5
  = note: inside call to `std::panicking::try::<i32, [closure@DefId(1/1:1830 ~ std[82ff]::rt[0]::lang_start_internal[0]::{{closure}}[0]) 0:&dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe]>` at /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/panic.rs:388:9
  = note: inside call to `std::panic::catch_unwind::<[closure@DefId(1/1:1830 ~ std[82ff]::rt[0]::lang_start_internal[0]::{{closure}}[0]) 0:&dyn std::ops::Fn() -> i32 + std::marker::Sync + std::panic::RefUnwindSafe], i32>` at /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/rt.rs:52:25
  = note: inside call to `std::rt::lang_start_internal` at /root/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libstd/rt.rs:64:5
  = note: inside call to `std::rt::lang_start::<()>`
Jake Goulding (Mar 27 2019 at 23:14, on Zulip):

Now I'm curious, since I do the exact same thing in Jetscii (and for the same reason as well: to use the PCMPSTRx SIMD intrinsics)

Jake Goulding (Mar 27 2019 at 23:21, on Zulip):

@RalfJ we gotta get Miri to give some help: maybe you want to do X instead :-)

oli (Mar 28 2019 at 08:06, on Zulip):

what you usually want to do is to use https://doc.rust-lang.org/std/primitive.slice.html#method.align_to since that uses intrinsics internally to support all the use cases of miri. The low level operation for pointers is https://doc.rust-lang.org/std/primitive.pointer.html#method.align_offset
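
A minimal sketch of the slice::align_to approach suggested above, assuming a memchr-style search for a single byte. The helper name `find_byte` is hypothetical and this is not memchr's actual code; it only shows how the unaligned prefix, the usize-aligned middle, and the suffix can be handled without casting a pointer to an integer.

```rust
use std::mem::size_of;

/// Hypothetical helper: returns the index of the first `needle` in `haystack`.
/// The aligned middle is read one `usize` word at a time.
fn find_byte(haystack: &[u8], needle: u8) -> Option<usize> {
    // SAFETY: every bit pattern is a valid `usize`, so viewing a suitably
    // aligned run of `u8` as `usize` is sound.
    let (prefix, words, suffix) = unsafe { haystack.align_to::<usize>() };

    // Bytes before the first usize-aligned address.
    if let Some(i) = prefix.iter().position(|&b| b == needle) {
        return Some(i);
    }

    // Aligned middle. `to_ne_bytes` yields the bytes in memory order,
    // so the position inside a word maps directly to a haystack index.
    let mut offset = prefix.len();
    for word in words {
        if let Some(i) = word.to_ne_bytes().iter().position(|&b| b == needle) {
            return Some(offset + i);
        }
        offset += size_of::<usize>();
    }

    // Trailing bytes that do not fill a whole word.
    suffix.iter().position(|&b| b == needle).map(|i| offset + i)
}
```

The split is allowed to put everything in the prefix, so the code stays correct even if no aligned middle is produced. A real implementation would test a whole word at once instead of scanning its bytes; that part is left out here.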

RalfJ (Mar 28 2019 at 14:05, on Zulip):

yeah, this is because Miri doesn't pick an actual base address for an allocation

RalfJ (Mar 28 2019 at 14:05, on Zulip):

so if the code does anything that depends on what the base address would be, you get an error
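
To illustrate the distinction, here is a hedged sketch contrasting the address-dependent pattern with the pointer-level `align_offset` method linked earlier. The function name `unaligned_prefix_len` is made up for this example. The documentation allows `align_offset` to return `usize::MAX` when it cannot compute an offset, so the sketch caps the result at the slice length.

```rust
use std::mem::align_of;

/// Hypothetical helper: number of leading bytes of `bytes` that come before
/// the first usize-aligned address, capped at `bytes.len()`.
fn unaligned_prefix_len(bytes: &[u8]) -> usize {
    // Address-dependent form, similar in spirit to the code in the error:
    //     let addr = bytes.as_ptr() as usize;
    //     let skip = addr.wrapping_neg() & (align_of::<usize>() - 1);
    // That computation needs the concrete base address of the allocation.
    //
    // `align_offset` asks the same question without a pointer-to-integer cast.
    let skip = bytes.as_ptr().align_offset(align_of::<usize>());

    // `align_offset` may return `usize::MAX` if the pointer cannot be aligned;
    // in that case treat the whole slice as unaligned.
    skip.min(bytes.len())
}
```

Callers would process the first `unaligned_prefix_len(bytes)` bytes one at a time and switch to word-sized reads only if any bytes remain after that.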

RalfJ (Mar 28 2019 at 14:05, on Zulip):

we have a long-standing open issue to better classify errors into "UB found" and "hit engine limitation"...

Brian Campbell (Mar 28 2019 at 17:24, on Zulip):

OK, so that would be an instance of https://github.com/rust-lang/miri/issues/417, and it looks like the right way forward for memchr would be to switch to slice::align_to. Thanks for the info!

Brian Campbell (Mar 28 2019 at 17:26, on Zulip):

A "maybe you want to do X instead" message, like the ones the compiler gives, would definitely be helpful, in addition to distinguishing UB from unsupported operations. Not sure if I'll have time, but if I do have some cycles to look at contributing to miri, I might try to work on one of those.

oli (Mar 28 2019 at 17:29, on Zulip):

There's also https://github.com/rust-rfcs/const-eval/issues/4 tracking the enum split

Jake Goulding (Mar 30 2019 at 15:25, on Zulip):

"This is a nightly-only experimental API."

Well, that's kind of bunk, now innit.

Last update: Nov 21 2019 at 23:25UTC