Stream: t-lang/wg-unsafe-code-guidelines

Topic: Generator optimization for locals that may not become live


Taylor Cramer (Jun 12 2019 at 21:10, on Zulip):

@RalfJ @tmandry

Taylor Cramer (Jun 12 2019 at 21:10, on Zulip):

context is allowing something like this optimization: https://github.com/rust-lang/rust/issues/59123#issuecomment-501089032

Taylor Cramer (Jun 12 2019 at 21:11, on Zulip):

and carving out a rule that could ban https://github.com/rust-lang/rust/issues/59123#issuecomment-501437202

Taylor Cramer (Jun 12 2019 at 21:11, on Zulip):

I think @tmandry is interested in looking at steps in the direction of a memory model that we could take that would allow this without committing to a full model

Taylor Cramer (Jun 12 2019 at 21:11, on Zulip):

obv. this would have to go through the RFC process and get aired thoroughly in this group

Taylor Cramer (Jun 12 2019 at 21:12, on Zulip):

but I figured I'd start the topic here for discussion

tmandry (Jun 12 2019 at 21:16, on Zulip):

One question on my mind is how/if we could find crates impacted by this

tmandry (Jun 12 2019 at 21:17, on Zulip):

e.g. is there a way to do a crater run with miri validation enabled? Have we ever tried?

simulacrum (Jun 12 2019 at 21:39, on Zulip):

not really currently but if it's strongly desired perhaps feasible in about 2-4 weeks (cc @Pietro Albini)

Pietro Albini (Jun 12 2019 at 21:40, on Zulip):

wasn't miri really slow?

Pietro Albini (Jun 12 2019 at 21:41, on Zulip):

like, a test run now takes 4-5 days, if we run it with miri how long is it going to take :/

centril (Jun 12 2019 at 22:45, on Zulip):

@Pietro Albini about 1000 times that maybe :slight_smile:

Pietro Albini (Jun 12 2019 at 22:53, on Zulip):

eeek no

tmandry (Jun 12 2019 at 22:58, on Zulip):

yeah, that's unfortunate. what kind of machine are we running crater on @Pietro Albini?

tmandry (Jun 12 2019 at 22:59, on Zulip):

is this doc still accurate? https://github.com/rust-lang-nursery/crater/blob/master/docs/agent-machine-setup.md

simulacrum (Jun 12 2019 at 23:03, on Zulip):

@tmandry mostly, yes

simulacrum (Jun 12 2019 at 23:03, on Zulip):

machine-spec wise anyway

simulacrum (Jun 12 2019 at 23:03, on Zulip):

although we might've downgraded to 1 terabyte recently, not sure

Pietro Albini (Jun 12 2019 at 23:09, on Zulip):

we downgraded from 4tb to 2tb, so that page is fully accurate

Pietro Albini (Jun 12 2019 at 23:12, on Zulip):

@tmandry a c5.2xlarge is 8 cores at ~3ghz

RalfJ (Jun 13 2019 at 08:20, on Zulip):

well also the vast majority of crates probably won't work in Miri for various reasons

RalfJ (Jun 13 2019 at 08:20, on Zulip):

most of them for trying to communicate with the host system in any way (network, file system, system clock, C FFI, ...)

RalfJ (Jun 13 2019 at 08:26, on Zulip):

> I think tmandry is interested in looking at steps in the direction of a memory model that we could take that would allow this without committing to a full model

the good news is that (I think) "thou shalt not write through (frozen) shared references" is pretty much the least controversial part of stacked borrows. every violation of that rule that we found so far was a blatant typo.

However, this does interact with https://github.com/rust-lang/rust/issues/56604: is it allowed to write through a raw pointer obtained by turning an &mut T into a *const T? Current Miri says no.
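
To make the two points above concrete, here is a minimal Rust sketch. The function names are made up for illustration. The commented-out lines show the pattern that issue #56604 asks about; they are deliberately not executed, because whether that write is allowed is exactly the open question.

```rust
/// Writes through a raw pointer derived directly from a mutable reference.
/// The raw pointer inherits write permission from `x`, so this is fine.
fn write_via_mut_raw(x: &mut i32) {
    let p = x as *mut i32;
    unsafe { *p += 1 };
}

/// Reads through a raw pointer derived from a shared reference.
/// Reading is fine; writing through `p` would break the
/// "no writes through shared references" rule quoted above.
fn read_via_shared_raw(x: &i32) -> i32 {
    let p = x as *const i32;
    unsafe { *p }
}

// The contested pattern from rust-lang/rust#56604 (shown, not executed):
//
//     let p = x as *const i32 as *mut i32; // with x: &mut i32
//     unsafe { *p += 1 };                  // Miri said no at the time of this thread
```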

Jake Goulding (Jun 13 2019 at 14:28, on Zulip):

Is there a possibility for a "fast" mode with Miri that could tell you if a program run would exceed Miri's abilities? Valgrind has a no-op mode (only 50% of the original program's speed!) in a similar vein.

If so, you could run "miri check-effectiveness" on all the crates, establish a blacklist for those which don't pass, and then make a decision on whether to run on the remainder.

Jake Goulding (Jun 13 2019 at 14:29, on Zulip):

You could also sample crates to get some amount of testing

RalfJ (Jun 13 2019 at 14:40, on Zulip):

Miri's "fast mode" is -Zmiri-disable-validation. The overhead there is just around 300x-500x, not 1000x...

RalfJ (Jun 13 2019 at 14:41, on Zulip):

there is also probably a lot of optimization potential there. Like, I have not even profiled Miri. I don't have the resources for that so it is not something I focus on.

RalfJ (Jun 13 2019 at 14:42, on Zulip):

so, this needs someone who is willing to dedicate some time to look into how to make Miri faster. Without sacrificing readability or code organization, because correctness is still crucial. But even then, I would not expect it to become more than 10x faster -- so still way too slow. okay, much of crater's time is probably spent building, not running tests, so that part does not get multiplied.
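
The back-of-the-envelope reasoning here can be sketched as a small calculation. This is only an illustration: `estimated_days` is a hypothetical helper, it assumes build time is unaffected by Miri, and the share of a crater run spent running tests is not stated anywhere in this thread, so any value passed for `test_fraction` is a guess.

```rust
/// Estimated wall-clock days for a crater run when tests run under an
/// interpreter. Only the test-running share of the run is multiplied by
/// the slowdown factor; the build share is assumed to stay the same.
fn estimated_days(base_days: f64, test_fraction: f64, slowdown: f64) -> f64 {
    let build_days = base_days * (1.0 - test_fraction);
    let test_days = base_days * test_fraction;
    build_days + test_days * slowdown
}
```

For example, with a 4-day run, a guessed test share of 25%, and the 300x overhead mentioned above, `estimated_days(4.0, 0.25, 300.0)` gives 303 days, so even the optimistic end of the range is far too slow for a routine run.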

RalfJ (Jun 13 2019 at 14:43, on Zulip):

more realistically, I actually talked with Julian about having a valgrind module for Stacked Borrows. We think it's possible. But I won't have the time to do it.

Taylor Cramer (Jun 13 2019 at 15:18, on Zulip):

Any particular reason for the valgrind preference over sanitizers? I remember asking this before, but I don't remember the answer

gnzlbg (Jun 13 2019 at 15:28, on Zulip):

If we want to make UB detection only 2x-10x slower, we probably have to be able to build "self-instrumented" binaries directly, e.g., have MIR passes that insert assert!s for UB violations into the MIR, and then lower the "self-instrumented" MIR to LLVM-IR and optimize it properly.

This could allow people to build "fortified" binaries that panic! before invoking undefined behavior with minimal performance impact. I don't think we can make the "evaluator" approach that miri uses fast, unless we essentially transform it into a tool that does the above.
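
As a rough picture of what such an inserted check could look like, here is a hand-written stand-in. No such MIR pass is described in this thread, so `checked_read` is purely hypothetical, and it only covers two easy cases (null and misaligned pointers); aliasing rules like Stacked Borrows would need extra run-time state that is not modeled here.

```rust
use std::mem::align_of;

/// Hand-written stand-in for a load that a hypothetical instrumentation
/// pass might emit: check the pointer first, then do the real read.
///
/// # Safety
/// If `ptr` is non-null and aligned, it must still point to a live,
/// initialized `T`; the assertions below cannot check that part.
unsafe fn checked_read<T: Copy>(ptr: *const T) -> T {
    // Panic instead of invoking undefined behavior on a null pointer.
    assert!(!ptr.is_null(), "UB check: null pointer read");
    // Panic instead of invoking undefined behavior on a misaligned pointer.
    assert!(
        (ptr as usize) % align_of::<T>() == 0,
        "UB check: misaligned pointer read"
    );
    unsafe { *ptr }
}
```

Because the checks are ordinary code, the optimizer can remove the ones it can prove redundant, which is where the hoped-for speed advantage over an interpreter would come from.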

RalfJ (Jun 13 2019 at 15:38, on Zulip):

> Any particular reason for the valgrind preference over sanitizers? I remember asking this before, but I don't remember the answer

valgrind is something I have used before, sanitizers not. valgrind could also work when linking with unmodified C code.

RalfJ (Jun 13 2019 at 15:38, on Zulip):

and anyway the Rust code would need some instrumentation so it's more of a valgrind-sanitizer hybrid

gnzlbg (Jun 13 2019 at 15:39, on Zulip):

> valgrind could also work when linking with unmodified C code.

It would, but you probably don't want to apply Rust validation to it.

RalfJ (Jun 13 2019 at 15:39, on Zulip):

@gnzlbg aren't you describing a sanitizer?

RalfJ (Jun 13 2019 at 15:39, on Zulip):

> > valgrind could also work when linking with unmodified C code.

> It would, but you probably don't want to apply Rust validation to it.

I think you do! When doing FFI from Rust, memory accesses by C are important to consider for Stacked Borrows.

RalfJ (Jun 13 2019 at 15:40, on Zulip):

they would all be "raw ptr accesses", so if the Rust code passes in raw ptrs like it should, the C code would have no restrictions
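
A minimal sketch of the calling pattern described here, assuming a foreign function that only receives a raw pointer. `c_style_increment` is a stand-in written in Rust so the example is self-contained; a real C function would be declared in an `extern "C"` block and linked in. Both names are hypothetical.

```rust
/// Stand-in for a C function. It only ever sees a raw pointer, so every
/// access it performs is a raw-pointer access.
extern "C" fn c_style_increment(p: *mut i32) {
    unsafe { *p += 1 };
}

/// Rust caller that hands the foreign code a raw pointer derived from a
/// mutable reference, rather than passing the reference itself.
fn call_foreign(x: &mut i32) {
    let p: *mut i32 = x;
    c_style_increment(p);
}
```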

gnzlbg (Jun 13 2019 at 15:40, on Zulip):

I don't know, internally, the C code might do a lot of things that don't satisfy our model, but are ok in C, and that's ok as long as those don't leak to Rust.

RalfJ (Jun 13 2019 at 15:40, on Zulip):

oh sure, the valgrind mode wouldn't do anything for locations not allocated by or passed to Rust

gnzlbg (Jun 13 2019 at 15:40, on Zulip):

I think I would prefer a hybrid only instrumenting Rust code, and then using annotations on FFI to express assumptions about what the FFI code does

RalfJ (Jun 13 2019 at 15:41, on Zulip):

but inherently when doing FFI, Rust and C run on the same abstract machine.

RalfJ (Jun 13 2019 at 15:41, on Zulip):

we don't have a spec for that abstract machine and probably never will^^

gnzlbg (Jun 13 2019 at 15:42, on Zulip):

MemorySanitizer goes a different way, instrumenting the code you compile, and then letting you use run-time instrumentation on dynamically linked code (e.g. selectively using tools similar to valgrind only for non-instrumented code, and interfacing their results with the instrumented code)

RalfJ (Jun 13 2019 at 15:42, on Zulip):

but I foresee no problems with just doing stacked borrows in C, where everything is a raw ptr, no retagging ever happens -- then (UB-free) C code trivially satisfies Stacked Borrows

RalfJ (Jun 13 2019 at 15:42, on Zulip):

okay I admit I have no idea how sanitizers are implemented^^

RalfJ (Jun 13 2019 at 15:43, on Zulip):

but isn't the idea basically "compile to a self-checking binary"? the details of how much is done by static assertions vs an extra runtime in that binary don't really matter IMO, probably you need a hybrid anyway

gnzlbg (Jun 13 2019 at 15:43, on Zulip):

http://releases.llvm.org/3.3/tools/clang/docs/MemorySanitizer.html#id7 - DynamoRio is what they use

gnzlbg (Jun 13 2019 at 15:44, on Zulip):

@RalfJ yes the idea of compiling to a self-checked binary is the same

gnzlbg (Jun 13 2019 at 15:45, on Zulip):

at some point you need to interface with dynamically linked C libraries, and whether those are instrumented or not, and whether the Rust instrumentation can handle either case, will depend on what we do

gnzlbg (Jun 13 2019 at 15:47, on Zulip):

I think that if we can get by without having to instrument external libraries, that would be a big pro

gnzlbg (Jun 13 2019 at 15:48, on Zulip):

E.g. annotating FFI functions with their semantics "somehow", but I don't know if this can work

gnzlbg (Jun 13 2019 at 15:49, on Zulip):

We are going to need such annotations for syscalls anyways (or an alternative approach, like somehow intercepting those)

Last update: Nov 20 2019 at 11:35UTC