Stream: t-lang/wg-unsafe-code-guidelines

Topic: Miri: supporting more code vs detecting more bugs


RalfJ (Jun 28 2019 at 07:10, on Zulip):

We hit a spot in Miri where we have the choice between either supporting more (real) code or finding more (real) bugs. My plan is to offer both modes with a flag, but realistically most people are going to use the default mode, so what should that be?

The following program is fine:

fn main() {
    let x = &mut [0u8; 3];
    let base_addr = x as *mut _ as usize;
    // Make sure we cast an even address to &mut u16
    let u16_ref = unsafe { if base_addr % 2 == 0 {
        &mut *(base_addr as *mut u16)
    } else {
        &mut *((base_addr+1) as *mut u16)
    } };
    *u16_ref = 2;
    println!("{:?}", x);
}

But the following program is UB if you have "bad luck" and the alignment is not what you guessed:

fn main() {
    let x = &mut [0u8; 3];
    let base_addr = x as *mut _ as usize;
    // Just hope that base_addr is even.
    let u16_ref = unsafe { &mut *(base_addr as *mut u16) };
    *u16_ref = 2;
    println!("{:?}", x);
}

The problem is, Miri cannot really know if the user did their job and checked alignment before casting, or if they were just lucky.

What Miri can do is disallow exploiting "extra alignment" that arises just because of where the base addresses have been. This is weird, but the upshot is that Miri can have two modes, where the "support more code mode" will always accept the first program and sometimes accept the second program; and the "find more bugs mode" will always reject both programs.

Which mode, do you think, should be the default? Cc @Jake Goulding @oli @centril @gnzlbg and anyone else

gnzlbg (Jun 28 2019 at 07:14, on Zulip):

miri is a dynamic checker, and every dynamic checker has this problem

RalfJ (Jun 28 2019 at 07:14, on Zulip):

Sure, but how does that help answer this question?

gnzlbg (Jun 28 2019 at 07:15, on Zulip):

All other dynamic checkers accept the code if the access is correct, and reject it if it isn't (e.g. because the alignment doesn't hold)

gnzlbg (Jun 28 2019 at 07:15, on Zulip):

running the checker twice, can accept the code the first run, and reject it the second run

gnzlbg (Jun 28 2019 at 07:15, on Zulip):

(that can happen under ASan, for example)

gnzlbg (Jun 28 2019 at 07:16, on Zulip):

For the defaults, we have to balance having false positives vs not catching all potential bugs

gnzlbg (Jun 28 2019 at 07:16, on Zulip):

or at least not catching them in one single run

gnzlbg (Jun 28 2019 at 07:17, on Zulip):

I prefer having zero false positives, so that if miri says you have a bug, then you are always sure you have a bug

gnzlbg (Jun 28 2019 at 07:18, on Zulip):

The option with adding a flag that's more "strict" but might come with false positives sounds good as an extension

gnzlbg (Jun 28 2019 at 07:18, on Zulip):

A different extension would be for miri to randomize address of memory allocations (of both the stack and the heap)

gnzlbg (Jun 28 2019 at 07:19, on Zulip):

so that you can "fuzz" your program with miri, getting a different memory address each time, hopefully revealing the bug

gnzlbg (Jun 28 2019 at 07:20, on Zulip):

e.g. in the heap miri could give you random access, and for the stack, it could insert random padding between allocations

RalfJ (Jun 28 2019 at 07:51, on Zulip):

I prefer having zero false positives, so that if miri says you have a bug, then you are always sure you have a bug

the message would not say "this is a bug", it would say "this operation is not supported in your current config, enable this flag to support it but notice that that might introduce false negatives"

RalfJ (Jun 28 2019 at 07:52, on Zulip):

If we only had to consider miri being run locally I'd lean fairly strongly towards "default mode should avoid false negatives", because we can always tell people to enable more flags to avoid false positives -- but not the other way around

RalfJ (Jun 28 2019 at 07:52, on Zulip):

however, that doesn't work in playground...

RalfJ (Jun 28 2019 at 07:52, on Zulip):

A different extension would be for miri to randomize address of memory allocations (of both the stack and the heap)

oh it will totally do that

RalfJ (Jun 28 2019 at 07:53, on Zulip):

also miri doesnt do "stack" allocation. all allocations are treated equal.

centril (Jun 28 2019 at 15:59, on Zulip):

I would favor detecting more code with the appropriate message

centril (Jun 28 2019 at 15:59, on Zulip):

or at least emit a warning

centril (Jun 28 2019 at 15:59, on Zulip):

supporting more code with a warning is also a reasonable compromise imo

centril (Jun 28 2019 at 16:00, on Zulip):

the point of miri being not to blindly rubber stamp but to make you think more

Shnatsel (Jun 28 2019 at 16:14, on Zulip):

I would go for MIRI never producing false positives. That's what sets ASAN apart from the rest: if it complains, you know for sure you have a problem. Most of its alternatives (Valgrind, DUMA) have false positives and that makes them unusable in projects where your dependencies that are outside your direct control trigger false positives.

Shnatsel (Jun 28 2019 at 16:15, on Zulip):

Treating that as a warning by default sounds like a good idea

RalfJ (Jun 28 2019 at 16:40, on Zulip):

supporting more code with a warning is also a reasonable compromise imo

we don't have a good way to warn during execution. maybe we should, but I am not sure what that would look like.

RalfJ (Jun 28 2019 at 16:41, on Zulip):

maybe we can even re-use the existing linting stuff for that?^^

RalfJ (Jun 28 2019 at 16:41, on Zulip):

in general though, that output gets mixed with the program output, which seems messy

RalfJ (Jun 28 2019 at 16:41, on Zulip):

and the warning would be one that lots of false positives, that's the entire problem

Shnatsel (Jun 28 2019 at 16:41, on Zulip):

Just put it on stderr and prefix it with [MIRI] and you're good as far as un-mixing it goes

RalfJ (Jun 28 2019 at 16:42, on Zulip):

not sure if we can make the rustc diagnostic printing stuff do that

RalfJ (Jun 28 2019 at 16:42, on Zulip):

(which would let us point at spans etc for free)

Shnatsel (Jun 28 2019 at 16:42, on Zulip):

In that case print only once that miri has more to say and provide a key to print the entire thing

RalfJ (Jun 28 2019 at 16:43, on Zulip):

but anyway this is useful feedback, thanks! unfortunately it's also conflicting feedback. ;)

Shnatsel (Jun 28 2019 at 16:44, on Zulip):

I think the information about the situation presented to the user can be summarized as "MIRI has detected behavior that might depend on memory addresses chosen at runtime. You may need to re-run it several times to detect an error." which is a single print and should only be done once

RalfJ (Jun 28 2019 at 16:47, on Zulip):

so you think pointing to a span (where in the code this happened) is not useful?

centril (Jun 28 2019 at 16:48, on Zulip):

@RalfJ struct_span_warn instead of struct_span_err basically

Shnatsel (Jun 28 2019 at 16:48, on Zulip):

Pointing to a span is useful, definitely.

RalfJ (Jun 28 2019 at 16:49, on Zulip):

I think the information about the situation presented to the user can be summarized as "MIRI has detected behavior that might depend on memory addresses chosen at runtime. You may need to re-run it several times to detect an error." which is a single print and should only be done once

that's a very nice way to frame this. not sure how to apply this to other things one might want to downgrade to a warning though, such as validation failures -- those are definitely UB but downgrading can allow execution to continue

centril (Jun 28 2019 at 16:49, on Zulip):

The rustc diagnostics framework handles errors and warnings very symmetrically

RalfJ (Jun 28 2019 at 16:53, on Zulip):

I guess from everything I know about @centril and @Shnatsel I shouldnt be surprised that I got quick, definite and contradicting answers to which mode (fewer false positives or fewer false negatives) should be the default. ;)

centril (Jun 28 2019 at 16:55, on Zulip):

Warnings with weasel words sounds like a way to breach the impasse ^^

centril (Jun 28 2019 at 16:56, on Zulip):

lots of "possibly"

RalfJ (Jun 28 2019 at 16:56, on Zulip):

true that

Shnatsel (Jun 28 2019 at 16:58, on Zulip):

ASAN aborts as soon as it finds something is definitely wrong. I have only wanted to circumvent that behavior once, and my use case might have been a bit crazy.

Shnatsel (Jun 28 2019 at 16:58, on Zulip):

This makes for a very simple user interface, and chains nicely with other tools.

RalfJ (Jun 28 2019 at 17:04, on Zulip):

hm. other people did ask for "suppression" support in Miri though. but maybe that's a niche use case.

Shnatsel (Jun 28 2019 at 17:05, on Zulip):

FWIW asan lets you suppress certain checks, like "one definition rule violation" check. That seems to be a C++ only thing, and sometimes complains about Rust code for whatever reason, so I just suppress it.

RalfJ (Jun 28 2019 at 17:13, on Zulip):

@Shnatsel you mentioned above "run miri multiple times", so you think it is reasonable that this might not be deterministic? we had some discussion about this in https://github.com/rust-lang/miri/issues/785

RalfJ (Jun 28 2019 at 17:13, on Zulip):

my reaction to that was "maybe we should make cargo miri test without extra flags have a fixed seed"

RalfJ (Jun 28 2019 at 17:13, on Zulip):

but then the warning above would have to be reworded to "run Miri multiple times with different seeds"

Shnatsel (Jun 28 2019 at 17:15, on Zulip):

I'd prefer to have it deterministic if possible. Using different seeds sounds like a way better idea.

RalfJ (Jun 28 2019 at 17:16, on Zulip):

kk

Shnatsel (Jun 28 2019 at 17:17, on Zulip):

I'm kind of used to the fact that deterministic execution is not really achievable, but for an interpreter such as MIRI it could actually be possible, and that's great

Shnatsel (Jun 28 2019 at 17:17, on Zulip):

I'd take deterministic execution over non-deterministic any day, all other things being equal

RalfJ (Jun 28 2019 at 17:17, on Zulip):

currently Miri doesnt support any communication with the outside world so its even fairly easy

RalfJ (Jun 28 2019 at 17:19, on Zulip):

my longer-term plan is that we will eventually allow opening files and such, and then of course its not deterministic any more. then when the program does getrandom we can also use OS randomness. (we'll still use a seed for our own non-determinism though, such as allocations.) but we can have a flag like -Zmiri-isolate that would turn off all of that communication and re-route getrandom to our own pseudo-random RNG.

RalfJ (Jul 05 2019 at 19:01, on Zulip):

@Shnatsel @centril see https://github.com/rust-lang/rust/issues/62420 for an issue closely related to this discussion (about being able to reliably detect misaligned pointer accesses)

Shnatsel (Jul 05 2019 at 19:29, on Zulip):

We can also take a third option: symbolic execution in MIRI

Ehsan M. Kermani (Jul 19 2019 at 15:42, on Zulip):

(deleted)

Ehsan M. Kermani (Jul 19 2019 at 15:45, on Zulip):

(deleted)

Last update: Nov 19 2019 at 18:15UTC