Stream: t-lang/wg-unsafe-code-guidelines

Topic: Can we replace the stack in "stacked borrows" by Drop?


Alan Jeffrey (May 13 2019 at 15:01, on Zulip):

Moving a conversation with @RalfJ into its own thread...

Alan Jeffrey (May 13 2019 at 15:01, on Zulip):

The conversation was in https://rust-lang.zulipchat.com/#narrow/stream/136281-t-lang.2Fwg-unsafe-code-guidelines/topic/Slides.20for.20talk.20on.20unsafe.20rust

Alan Jeffrey (May 13 2019 at 15:02, on Zulip):

and was about whether we could use the drop mechanism that already exists in Rust to replace the per-address stacks that are in stacked borrows

Alan Jeffrey (May 13 2019 at 15:03, on Zulip):

A draft implementation (missing a lot of Rust) is at https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=90773fba2b71073841015190bc8c1fdb

Alan Jeffrey (May 13 2019 at 15:06, on Zulip):

in particular, as @RalfJ pointed out, it's missing interior mutability, and in particular that interior mutability allows the same memory address to have multiple simultanous access modes, e.g. a &RefCell<u32> may have the same address as a &mut u32.

Alan Jeffrey (May 13 2019 at 15:09, on Zulip):

There's also the point of ergonomics and integration with Miri, since this introduces "shadow drop code" to types that don't currently have Drop, e.g. when a borrow or reborrow finishes, the metadata is updated, and this is something that Miri doesn't track.

Alan Jeffrey (May 13 2019 at 15:09, on Zulip):

@RalfJ does that seem like a fair summary of where we are?

RalfJ (May 13 2019 at 18:51, on Zulip):

:+1:

RalfJ (May 13 2019 at 18:52, on Zulip):

there's also the question about &T which is Copy and hence cannot be Drop. you proposed some non-Copy variant of shared references but I am unconvinced of that^^

Alan Jeffrey (May 13 2019 at 20:43, on Zulip):

@RalfJ yes, we'd need some non-copy marker for updating the metadata of memory addresses when it is dropped.

Alan Jeffrey (May 13 2019 at 20:44, on Zulip):

I'm looking to see how difficult it'll be to get your example with RefCell into the code I wrote yesterday. We'll see.

Jake Goulding (May 13 2019 at 21:07, on Zulip):

Clearly this code is all wrong: Colour... :stuck_out_tongue_wink:

Alan Jeffrey (May 14 2019 at 03:09, on Zulip):

@RalfJ OK, I got your example with aliased RefCell to go through: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=4007d27c87e5b03dd1c3ff6a4fd501a6

Alan Jeffrey (May 14 2019 at 03:10, on Zulip):

The trick here is that a RecCell<T> has at least one more byte of memory than a T, and we can use that byte to keep the metadata for the RefCell<T> separate from the metadata for the T.

Alan Jeffrey (May 14 2019 at 03:11, on Zulip):

This trick wouldn't work for UnsafeCell<T>.

Alan Jeffrey (May 14 2019 at 03:12, on Zulip):

Small edit to remove some dead code: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=8597244d346a2e82a45a6a731f74136b

RalfJ (May 14 2019 at 07:12, on Zulip):

The trick here is that a RecCell<T> has at least one more byte of memory than a T, and we can use that byte to keep the metadata for the RefCell<T> separate from the metadata for the T.

so, the model needs to specifically know RefCell and color it differently?

RalfJ (May 14 2019 at 07:13, on Zulip):

Stacked Borrows does need to know (for shared references only) whether something is inside an UnsafeCell or not, that is inevitable -- but the model cannot have any idea about the fact that RefCell's borrow counter has anything to do with that. in fact, that one is in an UnsafeCell as well, so it should get the exact same treatment.

RalfJ (May 14 2019 at 07:15, on Zulip):

the goal of this model is to enable the compiler to do better program analysis (alias analysis specifically). I dont think that can work if the model is customizable with user code -- then the compiler has to understand that user code to know what information it can even deduce.

Alan Jeffrey (May 14 2019 at 14:12, on Zulip):

OK, so to treat this properly, I'd need to implement a MyRefCell<T> in order to get at the private fields of the ref cell, one of which would be an UnsafeCell<T>.

Alan Jeffrey (May 14 2019 at 14:15, on Zulip):

My guess is that this is doable, using the same sort of model as before, that cell.get() will put down a marker whose drop code will update the metadata.

Alan Jeffrey (May 14 2019 at 14:22, on Zulip):

It might not even need that, the markers for casting *mut T to &T or &mut T might be enough, we'll see.

Alan Jeffrey (May 14 2019 at 14:22, on Zulip):

@RalfJ I'll have a shot at this tonight.

Alan Jeffrey (May 14 2019 at 14:23, on Zulip):

@RalfJ BTW, thanks for walking this through with me, it's really helping me to understand stacked borrows!

RalfJ (May 14 2019 at 14:39, on Zulip):

I'm curious where it'll lead. :D And maybe there is some kind of equivalence or so, maybe some way to justify some of my arbitrary decisions.

But I also have to say that I personally don't like "ghost code"-style stuff. I've held this opinion mostly as a matter of principle for some time, and then recently in my research we did "ghost code"-style stuff for the first time... and it's awful, I see myself 100% confirmed. So I think if a more complex "instrumented state" like a stack lets us avoid having to insert code in complex ways, I think that alone would be totally worth it.

Alan Jeffrey (May 14 2019 at 15:30, on Zulip):

@RalfJ yeah, there may be a trade-off where the metadata can be made simpler but at the cost of introducing ghost code.

RalfJ (May 14 2019 at 15:32, on Zulip):

basically, you are encoding the stack implicitly in the well-nested control flow / RAII you are introducing

Alan Jeffrey (May 14 2019 at 15:36, on Zulip):

@RalfJ Miri does have StorageLive / Storage

Alan Jeffrey (May 14 2019 at 15:36, on Zulip):

Dead

RalfJ (May 14 2019 at 15:36, on Zulip):

it does

Alan Jeffrey (May 14 2019 at 15:36, on Zulip):

We could hang metadata updates on those.

RalfJ (May 14 2019 at 15:36, on Zulip):

that doesnt really help for heap-allocated stuff though

RalfJ (May 14 2019 at 15:37, on Zulip):

also, not all variables have these annotations

Alan Jeffrey (May 14 2019 at 15:38, on Zulip):

@RalfJ yeah, there's probably cases of, e.g. &mut to &, conversions that don't end up being tracked by Miri :/

RalfJ (May 14 2019 at 15:39, on Zulip):

and even then, I feel it is better to tell people "your reference is considered live until it is used the last time" rather than "your reference is considered live until a point which the compiler magically determines and which you cannot see". I am also worried about how this would permanently fix where we put these "end of life"-annotations; moving them either way could break code.

Alan Jeffrey (May 14 2019 at 15:40, on Zulip):

@RalfJ stacked borrows has that a reference is considered live until a use of an earlier reference?

Alan Jeffrey (May 14 2019 at 15:42, on Zulip):

e.g. in let x = &mut 37; { let y = &*x; ... } *x += 1; it's the use of x that ends the lifetime of y?

RalfJ (May 14 2019 at 15:42, on Zulip):

I'd say it is considered "live" until it is used the last time

RalfJ (May 14 2019 at 15:42, on Zulip):

and then using an earlier reference while the child is still live is UB

RalfJ (May 14 2019 at 15:42, on Zulip):

but there are many ways to say the same thing here

RalfJ (May 14 2019 at 15:43, on Zulip):

and to be fair, my characterization misplaces where the UB happens -- it happens when the child gets used again, as only then we really know that it was live all along

RalfJ (May 14 2019 at 15:44, on Zulip):

so in this view, the stack tracks "pointers that are allowed to still be live"

Alan Jeffrey (May 14 2019 at 15:44, on Zulip):

yes, the question is where that stack lives,

Alan Jeffrey (May 14 2019 at 15:44, on Zulip):

in stacked borrows each memory address has its own stack,

RalfJ (May 14 2019 at 15:45, on Zulip):

I prefer this view when explaining conflicts to people: "see, you have this pointer here that you used again there, so clearly it was live all the time, but then there is this other conflicting access that happened while the pointer was live"

Alan Jeffrey (May 14 2019 at 15:45, on Zulip):

it would be nice if we could use the call stack for this.

RalfJ (May 14 2019 at 15:45, on Zulip):

it would be nice if we could use the call stack for this.

fair

RalfJ (May 14 2019 at 15:46, on Zulip):

but given that it's not even really a stack any more, I doubt that

RalfJ (May 14 2019 at 15:46, on Zulip):

I am inserting and removing "in the middle" on a few occasions and found no way around that

Alan Jeffrey (May 14 2019 at 15:47, on Zulip):

Is it interior mutability that causes those updates "in the middle"?

RalfJ (May 14 2019 at 15:49, on Zulip):

the example I posted with aliasing mutable/shared refs is one case

RalfJ (May 14 2019 at 15:49, on Zulip):

two-phase borrows is another

RalfJ (May 14 2019 at 15:49, on Zulip):

there are probably more but I am not sure

Alan Jeffrey (May 14 2019 at 15:52, on Zulip):

OK, I'll have a look at 2-phase borrows.

Alan Jeffrey (May 14 2019 at 15:52, on Zulip):

I'll get a chance to think about this tonight.

Alan Jeffrey (May 14 2019 at 15:53, on Zulip):

Hopefully.

RalfJ (May 14 2019 at 16:06, on Zulip):

what would help me in understanding this is: (a) some kind of trace of what is going on in that RefCell thing -- like, annotating the code with how the colour changes, or so. I find that had to extract from the implementation. (b) some kind of argument along the lines of https://www.ralfj.de/blog/2018/11/16/stacked-borrows-implementation.html#5-key-properties: why do the checks that you are inserting guarantee that some unknown code does not mess with our memory?

Alan Jeffrey (May 14 2019 at 17:00, on Zulip):

@RalfJ a trace should be pretty easy. A proof of correctness might be a bit trickier :)

RalfJ (May 14 2019 at 17:08, on Zulip):

well, I was more thinking of a reasonable argument than a full formal proof. ;)

RalfJ (May 14 2019 at 17:09, on Zulip):

I wouldn't call what I got in that blog post a proof, that's a proof sketch at best.^^

Alan Jeffrey (May 14 2019 at 17:14, on Zulip):

@RalfJ fair enough.

Alan Jeffrey (May 15 2019 at 18:20, on Zulip):

I reliazed that if we think of &T as being a refcounting type where the refcount is metadata then we get a very simple model (the metadata for each byte is the same as for Rc<u8>). https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=0bc4a04c1186ee641450be1b03aaa031

Alan Jeffrey (May 15 2019 at 18:22, on Zulip):

Does suffer from the same problem as before, which is that we have to think of &T as a type which implements Clone and Drop, even though those only update the metadata.

Alan Jeffrey (May 15 2019 at 18:22, on Zulip):

But the model is really really simple.

Alan Jeffrey (May 15 2019 at 18:23, on Zulip):

Oh, and I've not looked at UnsafeCell yet, I don't thiiiiiink there's anything more needed, but devil in the details.

Alan Jeffrey (May 15 2019 at 18:24, on Zulip):

@RalfJ ^

Alan Jeffrey (May 15 2019 at 18:31, on Zulip):

Hmm, in fact there may be a way to implement all of this in safe rust (except for the casts from *T to *U which are blatantly unsafe).

Alan Jeffrey (May 15 2019 at 18:34, on Zulip):

This reminds me of ECSs, in that it implements refcounting, but keeps all the refcounts together. I wonder if there's an implementation of an arena that already does this?

RalfJ (May 15 2019 at 20:46, on Zulip):

I reliazed that if we think of &T as being a refcounting type where the refcount is metadata then we get a very simple model (the metadata for each byte is the same as for Rc<u8>).

this reminds me of some of the things I did back in 2017 with my first attempt at defining what references may and may not do. I had RwLocks back then. But unfortunately that turned out not enough when you have RefCell in RefCell in RefCell... each of them is basically a new layer of RwLock.

Alan Jeffrey (May 15 2019 at 20:58, on Zulip):

The good news is that each level of RefCell has at least one byte of its own data, which can be coloured independently of the contents.

Alan Jeffrey (May 15 2019 at 20:59, on Zulip):

I should look to see whether I can code up a MyRefCell<T> using UnsafeCell<T> and get it to fly.

RalfJ (May 15 2019 at 21:01, on Zulip):

The good news is that each level of RefCell has at least one byte of its own data, which can be coloured independently of the contents.

RefCell is just an example, this can happen without any extra data in unsafe code

RalfJ (May 15 2019 at 21:01, on Zulip):

the model cannot know where the synchronization is coming from

RalfJ (May 15 2019 at 21:03, on Zulip):

and in code like this, there is no extra byte:

use std::cell::UnsafeCell;

// Two-phase borrows of the pointer returned by UnsafeCell::get() should not
// invalidate aliases.
fn main() { unsafe {
    let x = &UnsafeCell::new(0);
    let x2 = &*x;
    // We can have a unique reference
    let uniq_ref = &mut *x.get();
    *uniq_ref = 2;
    // and then use our shared references again
    let _val = *x2.get();
} }
Alan Jeffrey (May 15 2019 at 22:04, on Zulip):

@RalfJ I coded up your example ^ and got it to run: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=8d190468b1d048261cedce9ac9b2f87c

Alan Jeffrey (May 15 2019 at 22:05, on Zulip):

There is an explicit drop(uniq_ref), without that it UBs.

Alan Jeffrey (May 15 2019 at 22:06, on Zulip):

Same trade-off, simpler model, but uses clone and drop.

RalfJ (May 16 2019 at 07:26, on Zulip):

I am afraid without a tracce I have no idea what is going on :/ it would take forever to dig through that code. But I am convinced that your model inherently does not have enough state to properly model nested UnsafeCell

RalfJ (May 16 2019 at 07:26, on Zulip):

I can have N shared refs to the outer cell and M to the inner, (and K to the 3rd nested UnsafeCell and so on) and these two "refcounts" must be tracked separately

RalfJ (May 16 2019 at 07:26, on Zulip):

I don't see that happen, so something must be missing somewhere.

RalfJ (May 16 2019 at 07:28, on Zulip):

and then what about raw pointers, do you plan to have refcounts for those as well?

RalfJ (May 16 2019 at 07:29, on Zulip):

Also I see no checks that happen on accesses? That does not seem right, sometimes the only difference between okay and UB is whether some access is a read or a write.

RalfJ (May 16 2019 at 07:31, on Zulip):

@Alan Jeffrey like this:

fn main() { unsafe {
   let x = &mut 0;
   let y = &*(&*x as *const _); // lifetime laundering
   let _val = *x; // Reading from x...
   let _val = *y; // ...keeps y valid.
   *x = 1; // Writing to x...
   let _val = *y; // ...kills y. This line is UB.
} }
Alan Jeffrey (May 16 2019 at 15:03, on Zulip):

@RalfJ the example does have a trace showing the colourings, are you looking for more detail?

Alan Jeffrey (May 16 2019 at 15:04, on Zulip):

the refcounts are on memory addresses, not on values, so raw pointers don't carry refcounts, but the memory pointed to by the raw pointer does.

Alan Jeffrey (May 16 2019 at 15:06, on Zulip):

one thing that's interesting about the refcount model is that references are just integers, all the metadata is tracked in the memory being referred to.

Alan Jeffrey (May 16 2019 at 15:06, on Zulip):

Making the drop explicit, your example is:

Alan Jeffrey (May 16 2019 at 15:06, on Zulip):

Alan Jeffrey (May 16 2019 at 15:07, on Zulip):

sigh, Enter == send message, sigh

Alan Jeffrey (May 16 2019 at 15:08, on Zulip):
fn main() { unsafe {
   let x = &mut 0;
   let y = &*(&*x as *const _); // lifetime laundering
   let _val = *x; // Reading from x...
   let _val = *y; // ...keeps y valid.
   *x = 1; // Writing to x...
   let _val = *y; // ...kills y. This line is UB.
  drop(y); // At this point we drop y
  drop(x);
} }
Alan Jeffrey (May 16 2019 at 15:09, on Zulip):

whereas the version that doesn't UB is:

fn main() { unsafe {
   let x = &mut 0;
   let y = &*(&*x as *const _); // lifetime laundering
   let _val = *x; // Reading from x...
   let _val = *y; // ...keeps y valid.
  drop(y); // At this point we drop y
   *x = 1; // Writing to x...
  drop(x);
} }
Alan Jeffrey (May 16 2019 at 15:10, on Zulip):

That is, it's when the drop code runs that determines whether the behaviour is UB or not.

Alan Jeffrey (May 16 2019 at 15:10, on Zulip):

(which brings us to the core trade-off, simpler model vs UB based on invisible shadow code)

RalfJ (May 16 2019 at 15:16, on Zulip):

RalfJ the example does have a trace showing the colourings, are you looking for more detail?

d'oh, I never thought of running it^^

Alan Jeffrey (May 16 2019 at 15:17, on Zulip):

:)

RalfJ (May 16 2019 at 15:19, on Zulip):

one thing that's interesting about the refcount model is that references are just integers, all the metadata is tracked in the memory being referred to.

I think that can inherently not protected against "misbehaving" unsafe code though... like, in

fn foo(x: &mut i32) -> i32 {
  *x = 4;
  bar();
  return *x;
}
RalfJ (May 16 2019 at 15:20, on Zulip):

we want to be sure that bar cannot write to this memory -- but if it has a raw pointer to that memory, without having metadata in the pointer, how would the model distinguish that from using x to access the same memory (which is legitimate)?

RalfJ (May 16 2019 at 15:23, on Zulip):

(which brings us to the core trade-off, simpler model vs UB based on invisible shadow code)

well at this point you also have consider the pass adding the shadow code to be part of the model, in terms of complexity. and it seems that has to be quite complex. like, how would it know when exactly to drop y here, when lifetimes are unreliable?

RalfJ (May 16 2019 at 15:25, on Zulip):

so re: the example you posted, why does it not colour the memory unique when I create a mutable reference? seems like uniq_ref is "less unique" than 'cell'?

Alan Jeffrey (May 16 2019 at 15:25, on Zulip):

(which brings us to the core trade-off, simpler model vs UB based on invisible shadow code)

well at this point you also have consider the pass adding the shadow code to be part of the model, in terms of complexity. and it seems that has to be quite complex. like, how would it know when exactly to drop y here, when lifetimes are unreliable?

True, though that complexity is a lot like the existing complexity of when does Drop code get inserted.

RalfJ (May 16 2019 at 15:26, on Zulip):

TBH I doubt that, drop only has to work with moves, not borrows, and it is specifically not trying to catch misbehaving unsafe code

RalfJ (May 16 2019 at 15:26, on Zulip):

like, if you drop(ptr::read(&mut local)), local will get double-dropped

Alan Jeffrey (May 16 2019 at 15:28, on Zulip):

I said "a lot like" not "exactly the same as" :)

RalfJ (May 16 2019 at 15:28, on Zulip):

I know :)

RalfJ (May 16 2019 at 15:29, on Zulip):

and btw if you are looking for more testcases, I got a few at https://github.com/rust-lang/miri/tree/master/tests/run-pass/stacked-borrows and https://github.com/rust-lang/miri/tree/master/tests/compile-fail/stacked_borrows

Alan Jeffrey (May 16 2019 at 15:30, on Zulip):

Ta.

RalfJ (May 16 2019 at 15:30, on Zulip):

"Ta."?

Alan Jeffrey (May 16 2019 at 15:31, on Zulip):

Thanks for the testcases.

RalfJ (May 16 2019 at 15:31, on Zulip):

TIL: https://www.dict.cc/?s=Ta

RalfJ (May 16 2019 at 15:31, on Zulip):

never saw that before^^ but it says "Br/Aus", so I guess that explains why ;)

Alan Jeffrey (May 16 2019 at 15:31, on Zulip):

I am so unaware of what words are colloquial.

Alan Jeffrey (May 16 2019 at 15:35, on Zulip):

I'll have another think about the refcounted memory model, hopefully tonight.

Alan Jeffrey (May 16 2019 at 15:36, on Zulip):

I have GL errors from VR browsers to sort out atm :)

Alan Jeffrey (May 17 2019 at 15:22, on Zulip):

A simplified version, which keeps track of how many unique refs and shared refs exist: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=7bb59a62c3713efc40b6b9d6e90e6b20

Alan Jeffrey (May 17 2019 at 15:23, on Zulip):

There's a count of unique references due to reborrowing, which increments the count of unique refs.

Alan Jeffrey (May 17 2019 at 15:25, on Zulip):

If they're both non-zero, that's because a &mut T got reborrowed as a &T (there's a Frozen marker put on the stack to check that there's no live &Ts when the &mut T becomes live again).

Alan Jeffrey (May 17 2019 at 15:27, on Zulip):

Same objections as before... this needs to keep track of when references are cloned, reborrowed and dropped, which Miri may not be very happy about :/

Alan Jeffrey (May 17 2019 at 15:29, on Zulip):

But the model is very simple, and references are just integers, all the metadata is attached to memory addresses.

Alan Jeffrey (May 17 2019 at 15:30, on Zulip):

Next stage is probably trying more examples.

Alan Jeffrey (May 17 2019 at 15:31, on Zulip):

I'm going to be in SF next week for S&P (hearing about Spectre and RIDL and talking about https://github.com/chicago-relaxed-memory/spec-eval/blob/master/doc/paper.pdf) so the next step won't be for a bit.

Alan Jeffrey (May 17 2019 at 15:31, on Zulip):

@RalfJ ^

RalfJ (May 17 2019 at 17:47, on Zulip):

@Alan Jeffrey thanks!

RalfJ (May 17 2019 at 17:48, on Zulip):

but this means that with nested UnsafeCell you do not distinguish shared references from the different levels. there's a lot of complexity there that you are just not implementing -- or maybe moving into the pass that inserts the drops? not sure.

RalfJ (May 17 2019 at 17:49, on Zulip):

also what if the position of the drop cannot be predicted statically? let me revive one of my earlier examples where I said you have to distinguish pointers

RalfJ (May 17 2019 at 17:50, on Zulip):
fn main() { unsafe {
   let x = &mut 0;
   let y = &*(&*x as *const _); // lifetime laundering
   let _val = *x; // Reading from x...
   let _val = *y; // ...keeps y valid.
   *x = 1; // Writing to x...
   if some_undecidable_function_always_returning_false() {
     let _val = *y; // ...kills y. This line is UB.
   }
} }

you'd have to insert drop(y) after the if, right? but that means there's an error because x gets used while y is still "live"

RalfJ (May 17 2019 at 17:54, on Zulip):

I'm going to be in SF next week for S&P (hearing about Spectre and RIDL and talking about https://github.com/chicago-relaxed-memory/spec-eval/blob/master/doc/paper.pdf) so the next step won't be for a bit.

@Alan Jeffrey oh that's interesting. is that related to the "RFUB" (read from untaken branch) example that came up in relaxed memory discussions recently? you mention the hardware attacks but then the directory says sth about relaxed memory^^

RalfJ (May 17 2019 at 17:55, on Zulip):

hm, no, seems unrelated. I wonder if there is a connection though...

RalfJ (May 17 2019 at 17:56, on Zulip):

RFUB is described here: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1217r0.html

Alan Jeffrey (May 17 2019 at 20:07, on Zulip):

Hmm, RFUB looks interesting. There's a lot of work on speculative evaluation atm it's very hard to keep up with all of it!

Alan Jeffrey (May 17 2019 at 20:10, on Zulip):

The position that the refcounted borrows model takes (at least at the moment) is that &UnsafeCell<T> is not tracked at all, in the same way that &() isn't tracked.

Alan Jeffrey (May 17 2019 at 20:12, on Zulip):

I am sort of thinking that "As T is to &T so UnsafeCell<T> is to *T.

Alan Jeffrey (May 17 2019 at 20:18, on Zulip):

Hmm, interesting example, that would indeed be a case that refcounting as it currently stands can't handle.

Alan Jeffrey (May 17 2019 at 20:18, on Zulip):

I wonder if there's an optimization that refcounting borrows validates that stacked borrows doesn't?

Alan Jeffrey (May 17 2019 at 20:21, on Zulip):

Something like replacing:

if (c) { *x = v; }

by:

*x = if (c) { v } else { *x };
Alan Jeffrey (May 17 2019 at 20:21, on Zulip):

@RalfJ ^

RalfJ (May 17 2019 at 20:25, on Zulip):

assuming *x: &mut, Stacked Borrows allows this

RalfJ (May 17 2019 at 20:28, on Zulip):

Hmm, RFUB looks interesting. There's a lot of work on speculative evaluation atm it's very hard to keep up with all of it!

I wonder why that is... ;)

RalfJ (May 17 2019 at 20:29, on Zulip):

The position that the refcounted borrows model takes (at least at the moment) is that &UnsafeCell<T> is not tracked at all, in the same way that &() isn't tracked.

@Alan Jeffrey I see. But doesn't it need tracking insofar as it gets in the way of aliasing &mut being unique? the &UnsafeCell itself allows all sorts of aliasing (same in Stacked Borrows), it doesnt care, but others care that &UnsafeCell is not used to break their properties.

Alan Jeffrey (May 17 2019 at 20:49, on Zulip):

assuming *x: &mut, Stacked Borrows allows this

Er does it? Consider putting it in parallel with

unsafe { *(p as &mut u8) = 5; }
Alan Jeffrey (May 17 2019 at 20:50, on Zulip):

argh hit return rather than shift-return!

Alan Jeffrey (May 17 2019 at 20:52, on Zulip):

Consider putting it in parallel with:

unsafe  {  if(!c) { *(p as &mut u8) = 5; } }

where p = (x as *mut u8).

Alan Jeffrey (May 17 2019 at 20:53, on Zulip):

The first is UB-free, the second has UB, so we can't consider them to be the same as far as optimization is concerned?

Alan Jeffrey (May 17 2019 at 20:54, on Zulip):

Or are you thinking of a different equivalence relation for "allowed optimization"?

Alan Jeffrey (May 17 2019 at 20:54, on Zulip):

@RalfJ ^

lqd (May 17 2019 at 20:57, on Zulip):

(@Alan Jeffrey there's a checkbox next to the Send button, "Press Enter to send", which you can uncheck and should help with the shift-return annoyance. Also you can edit a message, there's a menu entry in the chevron "message actions" thingy to the right of a message when you hover it, and also a little pencil icon which directly allows editing — I'm assuming here you're not on mobile, that is)

RalfJ (May 19 2019 at 12:24, on Zulip):

(you can also edit the last message you sent by hitting the "left" arrow key)

RalfJ (May 19 2019 at 20:31, on Zulip):

Consider putting it in parallel with:

unsafe  {  if(!c) { *(p as &mut u8) = 5; } }

where p = (x as *mut u8).

@Alan Jeffrey WDYM "in parallel"? in a different thread?
I assumed x is a mutable ref. Having a mutable ref in scope is sufficient to permit dereferencing it, we even set an attribute in LLVM for that

RalfJ (May 19 2019 at 20:31, on Zulip):

when x gets in scope, it gets retagged; if another thread can write to taht location we have a data race -> UB

Alan Jeffrey (May 28 2019 at 13:26, on Zulip):

Back after S&P and Memorial Day...

Alan Jeffrey (May 28 2019 at 13:29, on Zulip):

Back to the example, which was is rustc allowed to to optimize the safe code (where x: &mut u8):

if (c) { *x =v; }

as:

*x = if (c) { v } else { *x }:

?

Alan Jeffrey (May 28 2019 at 13:29, on Zulip):

The problem being if there is another thread with access to p = (x as *mut u8) and does:

Alan Jeffrey (May 28 2019 at 13:31, on Zulip):
unsafe  {  if(!c) { *(p as &mut u8) = 5; } }
Alan Jeffrey (May 28 2019 at 13:33, on Zulip):

The original has no UB, since if c is true, the safe code writes to x and the unsafe code does nothing, and vice versa if c is false,

Alan Jeffrey (May 28 2019 at 13:33, on Zulip):

but the "optimized" version has UB since there's a race condition.

RalfJ (May 28 2019 at 13:33, on Zulip):

it is UB to write to memory that other threads have an &mut to

RalfJ (May 28 2019 at 13:33, on Zulip):

that violates the exclusive nature of &mut

RalfJ (May 28 2019 at 13:34, on Zulip):

this is just the same as having

fn foo(x: &mut i32) {
  bar(); // it is UB if bar writes to `x`
}
Alan Jeffrey (May 28 2019 at 13:34, on Zulip):

@RalfJ even if the &mut isn't written to?

RalfJ (May 28 2019 at 13:34, on Zulip):

@Alan Jeffrey so first, let's get concurrency out of the way. do you agree that my example is the same?

RalfJ (May 28 2019 at 13:34, on Zulip):

all the issues arise sequentially as well

RalfJ (May 28 2019 at 13:35, on Zulip):

imagine bar(); if (c) { *x =v; } bar();

Alan Jeffrey (May 28 2019 at 13:35, on Zulip):

@RalfJ yes, the example is essentially the same,

RalfJ (May 28 2019 at 13:36, on Zulip):

okay. and yes this is UB under current stacked borrows, precisely because we ant to be able to introduce spurious reads and to reorder around unknown function calls.

Alan Jeffrey (May 28 2019 at 13:36, on Zulip):

the question is whether it's x being in scope that triggers the UB, or an access to x that does.

RalfJ (May 28 2019 at 13:36, on Zulip):

like, the goal is to make reordering the if (c) { *x =v; } up or down both legal

RalfJ (May 28 2019 at 13:36, on Zulip):

(assuming c and v have no side-effects etc.)

RalfJ (May 28 2019 at 13:36, on Zulip):

and for that we need "x is in scope" to be enough for the UB

RalfJ (May 28 2019 at 13:37, on Zulip):

and that's why Stacked Borrows currently says it is

RalfJ (May 28 2019 at 13:37, on Zulip):

(this is the "retagging" stuff from my earlier blogposts)

RalfJ (May 28 2019 at 13:37, on Zulip):

specifically, §3 of https://www.ralfj.de/blog/2018/11/16/stacked-borrows-implementation.html

Alan Jeffrey (May 28 2019 at 13:40, on Zulip):

Do we also have

fn foo(x: &mut i32) {
  bar(); // it is UB if bar reads  from`x`
}
RalfJ (May 28 2019 at 13:47, on Zulip):

yes

RalfJ (May 28 2019 at 13:50, on Zulip):

here's some code for you to play with: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=0660fc6baf5827ae3ff94ffa80dec42a

Alan Jeffrey (May 28 2019 at 13:50, on Zulip):

@RalfJ so back to your example:

fn main() { unsafe {
   let x = &mut 0;
   let y = &*(&*x as *const _); // lifetime laundering
   let _val = *x; // Reading from x...
   let _val = *y; // ...keeps y valid.
   *x = 1; // Writing to x...
   if some_undecidable_function_always_returning_false() {
     let _val = *y; // ...kills y. This line is UB.
   }
} }

isn't rustc allowed to hoist the read of y in the UB line out of the conditional?

RalfJ (May 28 2019 at 13:51, on Zulip):

no because it cannot prove that it will happen

RalfJ (May 28 2019 at 13:51, on Zulip):

and y is not an argument to the function so rust can't know how big its "scope" is

RalfJ (May 28 2019 at 13:52, on Zulip):

the optimizations above relied on the fact that these were fn arguments, which are assumed to have a "scope" that's at least the fn call

RalfJ (May 28 2019 at 13:52, on Zulip):

(that's the "barriers"/"protectors" from stacked borrows)

Alan Jeffrey (May 28 2019 at 13:52, on Zulip):

@RalfJ really? We're banning hoisting loads out of conditionals? This seems quite drastic!

RalfJ (May 28 2019 at 13:52, on Zulip):

well in C you cannot do that either unless you can prove that the load is okay

Alan Jeffrey (May 28 2019 at 13:54, on Zulip):

@RalfJ Can't you? I suspect a lot of C compilers do :(

RalfJ (May 28 2019 at 13:54, on Zulip):

though in this specific case C might be able to rule that some_undecidable_function_always_returning_false cannot possibly deallocate y and hence the load is okay. which is interesting. this will come up when figuring out how to formalize LLVM's noalias inside functions

RalfJ (May 28 2019 at 13:54, on Zulip):

@Alan Jeffrey I'd be surprised if for something like

int foo(bool b, int *i) {
  if (b) { int x = *i; }
}

any C compiler would hoist the load

RalfJ (May 28 2019 at 13:55, on Zulip):

it is easy to construct miscompilations from that

Alan Jeffrey (May 28 2019 at 13:56, on Zulip):

@RalfJ won't most compilers hoist loads out of a loop?

RalfJ (May 28 2019 at 13:58, on Zulip):

@Alan Jeffrey they may only do that if they can either prove that the loop iterates at least once, or have another argument for why that pointer is okay

RalfJ (May 28 2019 at 13:59, on Zulip):

this is one of the reasons why I want the aliasing stuff for all arguments even if they dont get used -- so that Rust can do such hoisting without such concerns

Alan Jeffrey (May 28 2019 at 14:06, on Zulip):

@RalfJ yes, I was thinking of hoisting out of do-loops rather than while loops.

RalfJ (May 28 2019 at 14:12, on Zulip):

yes that's a different situation

Last update: Nov 19 2019 at 18:55UTC