Stream: t-lang/wg-unsafe-code-guidelines

Topic: stacked borrows: barriers for non-frrozen shared refs?


RalfJ (Nov 21 2018 at 15:55, on Zulip):

Right now, my idea of barriers is that we make sure that in a fn foo(x: &mut i32), the item for this reference cannot get popped off the stack while the function is running. However, I cannot do that for &UnsafeCell, because I am pushing the barrier on top of the stack, but an &UnsafeCell can be fairly deep down the stack when it is passed to a function (like when there is also an &mut pointing into the same RefCell/Mutex). But that means that an &UnsafeCell could actually become invalid, and even deallocated, during the execution of a function -- in contradiction to the dereferencable attribute we set for LLVM. I see two options:

Cc @nikomatsakis @Gankro

Gankro (Nov 21 2018 at 15:55, on Zulip):

by barriers do we mean C11 concurrency stuff?

RalfJ (Nov 21 2018 at 15:56, on Zulip):

no, I just picked a bad name and cannot find a better one

RalfJ (Nov 21 2018 at 15:56, on Zulip):

this has nothing to do with concurrency

Gankro (Nov 21 2018 at 15:56, on Zulip):

cool

RalfJ (Nov 21 2018 at 15:56, on Zulip):

I described it in my first stacked borrows post

RalfJ (Nov 21 2018 at 15:56, on Zulip):

if you can come up with a better name, please let me know :)

Gankro (Nov 21 2018 at 16:12, on Zulip):

Oh interesting I had mentally filed the raw state as a strictly leaf one

RalfJ (Nov 21 2018 at 16:13, on Zulip):

are you sure you are replying in the right topic...?

Gankro (Nov 21 2018 at 16:13, on Zulip):

from your link "This could happen if the stack contained, somewhere, a Raw. "

RalfJ (Nov 21 2018 at 16:14, on Zulip):

ah

RalfJ (Nov 21 2018 at 16:14, on Zulip):

well you can reborrow a raw ptr to a mutable ref

Gankro (Nov 21 2018 at 16:14, on Zulip):

upon reflection it makes sense that raws can be sprinkled through the stack as we upgrade pointers to refs temporarily

RalfJ (Nov 21 2018 at 16:14, on Zulip):

not even so uncommon

Gankro (Nov 21 2018 at 16:15, on Zulip):

it's just a bit weird to think raw pointer accesses are always valid if there's a raw somewhere in the stack, it will just wildly invalidate a bunch of refs :)

RalfJ (Nov 21 2018 at 16:15, on Zulip):

true

RalfJ (Nov 21 2018 at 16:15, on Zulip):

that's why barriers restrict this :)

RalfJ (Nov 21 2018 at 16:16, on Zulip):

but generally, this kind of stuff happens in an access-based model

RalfJ (Nov 21 2018 at 16:16, on Zulip):

you get exclusivity guarantees for what happened between two accesses with your reference

RalfJ (Nov 21 2018 at 16:17, on Zulip):

if there was a raw ptr access in the mean time, even if it worked, it would have invalidated your ref, so the 2nd access would have been UB, so you can assume that does not happen

Gankro (Nov 21 2018 at 16:18, on Zulip):

Can you elaborate on how "an &UnsafeCell could actually become invalid, and even deallocated, during the execution of a function"

RalfJ (Nov 21 2018 at 16:21, on Zulip):

well e.g. someone could deallocate that memory

Gankro (Nov 21 2018 at 16:25, on Zulip):

Can you give some sample code? I can't picture the situation you're describing

RalfJ (Nov 21 2018 at 16:29, on Zulip):

here's a gross example:

fn inner(x: &UnsafeCell<i32>, f: fn()) {
    // `f` may mutate x because it is shared +  non-frozen - but is it okay to deallocate?
    f()
}

fn main() {
    let x = Box::into_raw(Box::new(UnsafeCell::new(0)));
    inner(&*x, |x| {
        drop(unsafe { Box::from_raw(x) });
    });
}
RalfJ (Nov 21 2018 at 16:30, on Zulip):

a realistic example is actually Arc, where we call fetch_sub to decrement the refcount. immediately after that decrement happened, and before fetch_sub returns, another thread could deallocate the ArcInner

Gankro (Nov 21 2018 at 16:32, on Zulip):

For the UnsafeCell example, I definitely had a mental model that function arguments had a phantom use at the start and end of the function

Gankro (Nov 21 2018 at 16:32, on Zulip):

(and therefore that code is super busted)

RalfJ (Nov 21 2018 at 16:32, on Zulip):

@Gankro oh well then let me make it even more gross

RalfJ (Nov 21 2018 at 16:32, on Zulip):
fn inner(x: &UnsafeCell<i32>, f: fn()) {
    // `f` may mutate x because it is shared +  non-frozen - but is it okay to deallocate?
    f()
   // we never get to the end
   loop {}
}

fn main() {
    let x = Box::into_raw(Box::new(UnsafeCell::new(0)));
    inner(&*x, |x| {
        drop(unsafe { Box::from_raw(x) });
    });
}
RalfJ (Nov 21 2018 at 16:33, on Zulip):

barriers "kick in" even if the end of the function is never reached

Gankro (Nov 21 2018 at 16:33, on Zulip):

Right sure

RalfJ (Nov 21 2018 at 16:33, on Zulip):

we have "phantom use at the beginning" in the form of retagging

Gankro (Nov 21 2018 at 16:34, on Zulip):

right

RalfJ (Nov 21 2018 at 16:34, on Zulip):

but not at the end

RalfJ (Nov 21 2018 at 16:34, on Zulip):

I found barriers more elegant than actually inserting Retag everywhere, I guess

RalfJ (Nov 21 2018 at 16:34, on Zulip):

hm no actually this doesnt work

Gankro (Nov 21 2018 at 16:34, on Zulip):

So are barriers not reified in mir like retag is?

RalfJ (Nov 21 2018 at 16:35, on Zulip):

consider

fn foo(x: &mut (u8, u8)) -> &mut u8 { &mut *x }

surely we dont want a phantom use of x at the end, that would invalidate the return value!

RalfJ (Nov 21 2018 at 16:35, on Zulip):

So are barriers not reified in mir like retag is?

retag has a bool parameter saying whether a barrier is pushed

Gankro (Nov 21 2018 at 16:35, on Zulip):

But barriers do handle foo properly?

RalfJ (Nov 21 2018 at 16:35, on Zulip):

then they are no the stack and affect accesses (but currently not derefs!)

RalfJ (Nov 21 2018 at 16:36, on Zulip):

so yes they are kind-of reified

RalfJ (Nov 21 2018 at 16:36, on Zulip):

but their effect is different from "retag in a few places"

RalfJ (Nov 21 2018 at 16:36, on Zulip):

But barriers do handle foo properly?

yes

RalfJ (Nov 21 2018 at 16:36, on Zulip):

(will be back in a couple minutes, need to catch a bus)

Gankro (Nov 21 2018 at 16:44, on Zulip):

Ok so if this is just asking "is it fine to set barriers on items that aren't the top of the stack", it seems perfectly fine to me?

RalfJ (Nov 21 2018 at 16:45, on Zulip):

that's one of the questions, yes -- going a bit (more) away from a strict stack discipline

RalfJ (Nov 21 2018 at 16:45, on Zulip):

and other question is: what do we do about Arc::drop?

Gankro (Nov 21 2018 at 16:46, on Zulip):

Although I'm not clear how/why the &UnsafeCell could both not be the top of the stack at the time of a barrier and stuff above it would be reasonable to pop?

RalfJ (Nov 21 2018 at 16:46, on Zulip):

if we indeed want &UnsafeCell to come with a barrier, to ensure that there is no deallocation, we must do something there

RalfJ (Nov 21 2018 at 16:46, on Zulip):

like, have a fetch_sub that takes a raw ptr

RalfJ (Nov 21 2018 at 16:47, on Zulip):

Although I'm not clear how/why the &UnsafeCell could both not be the top of the stack at the time of a barrier and stuff above it would be reasonable to pop?

"reasonable to pop"? my favourite RefCell example shows we must not pop stuff on top of a &UnsafeCell barrier:

fn aliasing_mut_and_shr() {
    fn inner(rc: &RefCell<i32>, aliasing: &mut i32) {
        *aliasing += 4;
        let _escape_to_raw = rc as *const _;
        *aliasing += 4;
        let _shr = &*rc;
        *aliasing += 4;
    }

    let rc = RefCell::new(23);
    let mut bmut = rc.borrow_mut();
    inner(&rc, &mut *bmut);
    drop(bmut);
    assert_eq!(*rc.borrow(), 23+12);
}
Gankro (Nov 21 2018 at 16:51, on Zulip):

It seems like that doesn't cover what I'm saying, but not sure

Gankro (Nov 21 2018 at 16:51, on Zulip):

need to sketch it

Gankro (Nov 21 2018 at 16:57, on Zulip):

ok let's put it this way: when would it be valid for calling a function to result in a stack reduction?

Gankro (Nov 21 2018 at 16:58, on Zulip):

naively it would seem that functions have to preserver the caller's stack, since they're opaque..?

Gankro (Nov 21 2018 at 16:58, on Zulip):

maybe i'm thinking too lifetimey?

RalfJ (Nov 21 2018 at 17:02, on Zulip):

well that's what barriers ensure

RalfJ (Nov 21 2018 at 17:02, on Zulip):

that during a function call, we only pop stuff that was pushed during the call

RalfJ (Nov 21 2018 at 17:03, on Zulip):

it's basically an operational reification of the principle that the lifetimes of references passed to a function outlive that function

RalfJ (Nov 21 2018 at 17:04, on Zulip):

but as the Arc example shows, there are use-cases for other functions in other threads to do stuff that makes my reference invalid

Gankro (Nov 21 2018 at 17:05, on Zulip):

Is Arc potentially out of scope since you're not "handling" concurrency?

Gankro (Nov 21 2018 at 17:05, on Zulip):

or alternatively, are atomic ops just special?

RalfJ (Nov 21 2018 at 17:05, on Zulip):

I wouldn't say so

RalfJ (Nov 21 2018 at 17:05, on Zulip):

I have not thought much about concurrency, but all atomic ops happen in non-frozen data

RalfJ (Nov 21 2018 at 17:06, on Zulip):

so they should all be essentially raw accesses with a Shr(None) tag

RalfJ (Nov 21 2018 at 17:06, on Zulip):

which, I think, should work just fine -- except for this detail about deallocation

RalfJ (Nov 21 2018 at 17:07, on Zulip):

and no matter what my model says, it is quite clear that Arc is in conflict with dereferencable as we send it to LLVM

RalfJ (Nov 21 2018 at 17:07, on Zulip):

Stacked Borrows can just tell us whether the fault is in Arc, or in rustc

Gankro (Nov 21 2018 at 17:09, on Zulip):

Is this only possible because ArcInner has an UnsafeCell?

RalfJ (Nov 21 2018 at 17:10, on Zulip):

yes

RalfJ (Nov 21 2018 at 17:10, on Zulip):

races are generally only possible on UnsafeCell

RalfJ (Nov 21 2018 at 17:10, on Zulip):

as they require shared accesses where at least one is a write

Gankro (Nov 21 2018 at 17:10, on Zulip):

ok now I see why you brought up this freeing closure example

RalfJ (Nov 21 2018 at 17:12, on Zulip):

:)

Gankro (Nov 21 2018 at 17:37, on Zulip):

I am increasingly convincing myself that &UnsafeCell should not be considered dereferenceable, as they are all candidates for atomic accesses, which seems fundamentally racey?

RalfJ (Nov 21 2018 at 17:41, on Zulip):

I've been back-and-forth between both possibilities

RalfJ (Nov 21 2018 at 17:41, on Zulip):

but it definitely seems footgun-y to make it dereferencable

Gankro (Nov 21 2018 at 17:44, on Zulip):

Also I will blindly assert optimizing UnsafeCell memory accesses is low impact?

Gankro (Nov 21 2018 at 17:48, on Zulip):

I do not like that I am leaning towards the freeing-closure being valid

RalfJ (Nov 21 2018 at 18:04, on Zulip):

Also I will blindly assert optimizing UnsafeCell memory accesses is low impact?

I guess we should measure that?

Gankro (Nov 21 2018 at 18:56, on Zulip):

I mean, intuitively either every access is carefully managed and so there's not much to do (Arc) or you escape to internal references ASAP (RefCell)

RalfJ (Nov 21 2018 at 19:05, on Zulip):

"internal references"?

Gankro (Nov 21 2018 at 19:10, on Zulip):

Like with RefCell/Mutex you make a guard and then promote that to a reference inside (and thus unaffected by) the UnsafeCell

RalfJ (Nov 21 2018 at 19:13, on Zulip):

ah I see. yes.

RalfJ (Nov 21 2018 at 22:08, on Zulip):

For the UnsafeCell example, I definitely had a mental model that function arguments had a phantom use at the start and end of the function

btw, this immediately implies that BTreeMap as of now is incorrect, because that phantom use is out-of-bounds if the reference points to an allocation that is "too short" compared to its type ;)

rkruppe (Nov 22 2018 at 10:49, on Zulip):

@Gankro

I mean, intuitively either every access is carefully managed and so there's not much to do (Arc) or you escape to internal references ASAP (RefCell)

This sorta makes sense for Arc and RefCell, but UnsafeCell also powers Cell, which is supposed to be a cheap way to get shared mutable memory -- much like any T & in C++, for example. I can easily see dereferenceable mattering for that kind of usage. I am actually coming around to UnsafeCell allowing deallocation under one's feet (Clang's usage of dereferenceable is unsound for the same reasons) but it might very well result in a measurable loss of optimization power.

RalfJ (Nov 22 2018 at 12:01, on Zulip):

Ideally Cell could opt back into that, but that seems hard to do practically

RalfJ (Nov 22 2018 at 12:01, on Zulip):

FWIW, all the complications around barriers for UnsafeCell do not apply to Cell because one cannot get a reference into a Cell

Nicole Mazzuca (Nov 22 2018 at 16:25, on Zulip):

wait, question - why can't we mark it as dereferenceable, given that C++ has the same issue?

Nicole Mazzuca (Nov 22 2018 at 16:25, on Zulip):

(which would imply it's not UB to deallocate from under the feet of a reference)

Nicole Mazzuca (Nov 22 2018 at 16:25, on Zulip):

unless this is a known bug in LLVM

RalfJ (Nov 22 2018 at 16:30, on Zulip):

unless this is a known bug in LLVM

it is

RalfJ (Nov 22 2018 at 16:30, on Zulip):

see https://lists.llvm.org/pipermail/llvm-dev/2018-July/124555.html

Nicole Mazzuca (Nov 22 2018 at 19:28, on Zulip):

mmh

Nicole Mazzuca (Nov 22 2018 at 19:28, on Zulip):

does this cause an issue in practice?

Nicole Mazzuca (Nov 22 2018 at 19:29, on Zulip):

I'd be tempted to use it as long as it doesn't, since if it does, then C++ also becomes wrong

Nicole Mazzuca (Nov 22 2018 at 19:29, on Zulip):

and we can yell at LLVM

RalfJ (Nov 22 2018 at 19:34, on Zulip):

well I'd say we should first decide what guarantee we want -- we could also declare Arc::drop to be wrong

RalfJ (Nov 22 2018 at 19:35, on Zulip):

if we decide that Arc::drop is right and rustc is wrong, we should at least open an I-unsound issue to track this

RalfJ (Nov 22 2018 at 19:37, on Zulip):

(a bit like https://github.com/rust-lang/rust/issues/28728: fix required but blocked on someone getting LLVM in shape)

Last update: Nov 20 2019 at 11:30UTC