Stream: t-lang/wg-unsafe-code-guidelines

Topic: non-local goto and Pin / https://github.com/rust-lang/rfc...


gnzlbg (Feb 18 2019 at 13:34, on Zulip):

@RalfJ I mentioned Pin in the last commit, but I guess my question was whether not jumping over Pin is a safety or validity invariant of Pin

gnzlbg (Feb 18 2019 at 13:35, on Zulip):

*jumping, and deallocating Pin's memory without dropping the Pin

gnzlbg (Feb 18 2019 at 13:37, on Zulip):

IIUC the issue is that safe Rust guarantees that freeing the memory of a Drop type will run its destructor

RalfJ (Feb 18 2019 at 13:38, on Zulip):

it's a safety thing

gnzlbg (Feb 18 2019 at 13:38, on Zulip):

an incorrect non-local jump can break this guarantee, and if it does, the behavior is undefined

gnzlbg (Feb 18 2019 at 13:39, on Zulip):

but that's the case independently of whether you jump over a Pin or some other type

RalfJ (Feb 18 2019 at 13:39, on Zulip):

> IIUC the issue is that safe Rust guarantees that freeing the memory of a Drop type will run its destructor

well, not really. You can do e.g. fn free_box_without_drop(x: Box<T>) { mem::forget(*x) }
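
The one-liner above can be tried out directly. The following is a minimal sketch, assuming a hypothetical `Noisy` type that counts its own drops; only `free_box_without_drop` comes from the chat (with the generic parameter spelled out so it compiles).

```rust
use std::cell::Cell;
use std::mem;
use std::rc::Rc;

// Hypothetical payload type: bumps a shared counter when its destructor runs.
struct Noisy {
    drops: Rc<Cell<u32>>,
    _payload: u64, // non-zero size, so the Box really owns a heap allocation
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// From the chat: move the contents out of the Box, then forget them.
// The Box's heap allocation is freed, but T's destructor never runs.
fn free_box_without_drop<T>(x: Box<T>) {
    mem::forget(*x)
}

// Returns how many times the destructor ran for one boxed value.
fn drops_seen(forget: bool) -> u32 {
    let drops = Rc::new(Cell::new(0));
    let boxed = Box::new(Noisy { drops: Rc::clone(&drops), _payload: 7 });
    if forget {
        free_box_without_drop(boxed);
    } else {
        drop(boxed);
    }
    drops.get()
}

fn main() {
    assert_eq!(drops_seen(true), 0); // freed without running Drop, in safe code
    assert_eq!(drops_seen(false), 1); // ordinary drop runs the destructor once
}
```

Everything here is safe code, which is the point of the counterexample: "memory of a Drop type is freed" does not imply "its destructor ran".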

gnzlbg (Feb 18 2019 at 13:39, on Zulip):

the problem with Pin is that if drop is not called you can get a use-after-free, right?

gnzlbg (Feb 18 2019 at 13:40, on Zulip):

e.g. if some future is running in an executor in a different thread, and tries to read the memory behind the Pin which was deallocated

RalfJ (Feb 18 2019 at 13:41, on Zulip):

uh, no. futures don't need this guarantee.

gnzlbg (Feb 18 2019 at 13:41, on Zulip):

> well, not really. You can do e.g. fn free_box_without_drop(x: Box<T>) { mem::forget(*x) }

gnzlbg (Feb 18 2019 at 13:41, on Zulip):

so what's the fundamental thing that non-local goto's break that Pin relies on ?

RalfJ (Feb 18 2019 at 13:41, on Zulip):

the guarantee is relevant for intrusive collections

gnzlbg (Feb 18 2019 at 13:41, on Zulip):

and that my Pin-like library couldn't make use of ?

gnzlbg (Feb 18 2019 at 13:41, on Zulip):

(sorry i'm in an ICE and WLAN sucks)

RalfJ (Feb 18 2019 at 13:41, on Zulip):

> so what's the fundamental thing that non-local goto's break that Pin relies on ?

https://www.ralfj.de/blog/2018/04/10/safe-intrusive-collections-with-pinning.html

RalfJ (Feb 18 2019 at 13:42, on Zulip):

> and that my Pin-like library couldn't make use of ?

well I suppose now that we blessed Pin other libs can do similar things

RalfJ (Feb 18 2019 at 13:43, on Zulip):

but before Pin there just was no (blessed) type that relied on things like this

gnzlbg (Feb 18 2019 at 13:43, on Zulip):

But Pin's safety invariant is just a doc comment AFAICT

RalfJ (Feb 18 2019 at 13:43, on Zulip):

so what?

RalfJ (Feb 18 2019 at 13:43, on Zulip):

this is about the semantic meaning of types

RalfJ (Feb 18 2019 at 13:43, on Zulip):

the compiler doesn't understand these

RalfJ (Feb 18 2019 at 13:43, on Zulip):

but still they are relevant when building safe abstractions

gnzlbg (Feb 18 2019 at 13:44, on Zulip):

it assumes that safe Rust code can't do non-local jumps and that is still true

gnzlbg (Feb 18 2019 at 13:44, on Zulip):

sure, my point is just that Pin isn't special

RalfJ (Feb 18 2019 at 13:44, on Zulip):

it's just like leakpocalypse. the safety invariants involved here were all just doc comments, and yet we got UB -- from composition of several libraries that were safe in isolation.

RalfJ (Feb 18 2019 at 13:45, on Zulip):

> sure, my point is just that Pin isn't special

well, it is and it isn't. when two libraries are mutually incompatible, which one is special?

gnzlbg (Feb 18 2019 at 13:45, on Zulip):

a non-local jump over a Drop type is UB, independently of whether Pin exists or not

RalfJ (Feb 18 2019 at 13:45, on Zulip):

like, you could kick out Pin and have some kind of safe closure-based interface for setjmp/longjmp

RalfJ (Feb 18 2019 at 13:45, on Zulip):

no it isnt

RalfJ (Feb 18 2019 at 13:45, on Zulip):

that's just mem::forget of a stack frame, basically

RalfJ (Feb 18 2019 at 13:46, on Zulip):

the thing is that you cannot mem::forget the content of a Pin<Box<T>>

RalfJ (Feb 18 2019 at 13:46, on Zulip):

and you cannot mem::forget something that got pinned to the stack using pin-utils either
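
A minimal sketch of the heap-pinned case, assuming a hypothetical `Node` type: forgetting the `Pin<Box<Node>>` handle itself is allowed and merely leaks the allocation, whereas moving the contents out (the step `free_box_without_drop` relies on) is rejected by the compiler.

```rust
use std::cell::Cell;
use std::marker::PhantomPinned;
use std::mem;
use std::pin::Pin;
use std::rc::Rc;

// Hypothetical pinned payload; PhantomPinned makes it !Unpin.
struct Node {
    drops: Rc<Cell<u32>>,
    _pin: PhantomPinned,
}

impl Drop for Node {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Returns how many times Node's destructor ran after forgetting the handle.
fn drops_after_forgetting_pinned_box() -> u32 {
    let drops = Rc::new(Cell::new(0));
    let pinned: Pin<Box<Node>> = Box::pin(Node {
        drops: Rc::clone(&drops),
        _pin: PhantomPinned,
    });

    // Neither of these compiles, so the contents cannot be moved out and forgotten:
    // let node = *pinned;                // cannot move out of a dereference of Pin
    // let boxed = Pin::into_inner(pinned); // requires Node: Unpin

    // Forgetting the handle is fine: the heap memory is leaked, never freed,
    // so anything still pointing at the Node keeps pointing at valid memory.
    mem::forget(pinned);
    drops.get()
}

fn main() {
    assert_eq!(drops_after_forgetting_pinned_box(), 0);
}
```

The destructor did not run, but the storage was not freed or reused either, so the guarantee discussed here is not violated.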

gnzlbg (Feb 18 2019 at 13:47, on Zulip):

ah because you can move out of the Box but you can't move out of the Pin<Box>

gnzlbg (Feb 18 2019 at 13:48, on Zulip):

but you can do that with unsafe code, and the non-local jump allows you to do that with unsafe code as well

RalfJ (Feb 18 2019 at 13:48, on Zulip):

so in a sense, Pin is special -- just as special as Rc

RalfJ (Feb 18 2019 at 13:48, on Zulip):

they both are blessed safe APIs that restrict what other safe APIs may do

RalfJ (Feb 18 2019 at 13:48, on Zulip):

even though the compiler knows nothing about either of them

RalfJ (Feb 18 2019 at 13:49, on Zulip):

> they both are blessed safe APIs that restrict what other safe APIs may do

sorry, "do" was the wrong term here.
they restrict what other safe APIs may assume about unknown safe code.

RalfJ (Feb 18 2019 at 13:49, on Zulip):

and that is the key point

gnzlbg (Feb 18 2019 at 13:49, on Zulip):

so the question is whether you can create a safe wrapper over non-local jumps that works even if Pin exists?

RalfJ (Feb 18 2019 at 13:49, on Zulip):

I don't think you can, as you have to rule out stack-pinned variables

RalfJ (Feb 18 2019 at 13:49, on Zulip):

on the stack frames you are killing

RalfJ (Feb 18 2019 at 13:50, on Zulip):

but if e.g. we decided that stack pinning wasnt possible, we could have a safe wrapper for such non-local jumps

gnzlbg (Feb 18 2019 at 13:50, on Zulip):

yeah, a stack frame with pinned variables is "pinned" as well, in some sense

RalfJ (Feb 18 2019 at 13:50, on Zulip):

exactly

gnzlbg (Feb 18 2019 at 13:50, on Zulip):

there are many other issues that make writing a safe wrapper for such non-local jumps probably impossible

gnzlbg (Feb 18 2019 at 13:51, on Zulip):

so even if Pin would not exist, that would at least be very hard

RalfJ (Feb 18 2019 at 13:51, on Zulip):

are you sure? the rest you mention seem solvable to me

RalfJ (Feb 18 2019 at 13:51, on Zulip):

with a closure-based API in the style of call/cc + lifetimes to control scoping

gnzlbg (Feb 18 2019 at 13:51, on Zulip):

you can't allow non-volatile reads and stores between the setjmp and longjmp, you can't call longjmp from a different thread

RalfJ (Feb 18 2019 at 13:51, on Zulip):

and non-Send types to make sure the "handle" stays in the same thread

RalfJ (Feb 18 2019 at 13:51, on Zulip):

ah right you mentioned those stores, I dont understand why they are necessary

gnzlbg (Feb 18 2019 at 13:52, on Zulip):

that would mean that you can't create new send types within the closure

RalfJ (Feb 18 2019 at 13:52, on Zulip):

nono the handle gets passed in as an extra argument

RalfJ (Feb 18 2019 at 13:52, on Zulip):

just make that non-Send

gnzlbg (Feb 18 2019 at 13:53, on Zulip):

ah just like crossbeam scope ?

RalfJ (Feb 18 2019 at 13:53, on Zulip):

yeah sth like that

gnzlbg (Feb 18 2019 at 13:53, on Zulip):

if you can only longjmp back from the handle, then that problem is solvable
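
The shape of such a closure-based API can be sketched as follows. This is only an illustration of the interface, not a setjmp/longjmp binding: the names `with_escape`, `Escape` and `Jump` are hypothetical, and the jump is implemented with unwinding (`resume_unwind` / `catch_unwind`) as a stand-in. Unlike longjmp, unwinding does run the destructors of the frames it leaves, which is exactly why this stand-in has no conflict with Pin. It assumes the default `panic = "unwind"` strategy.

```rust
use std::marker::PhantomData;
use std::panic::{self, AssertUnwindSafe};
use std::sync::atomic::{AtomicU64, Ordering};

static NEXT_ID: AtomicU64 = AtomicU64::new(0);

// Handle passed to the closure. The raw-pointer PhantomData makes it
// !Send and !Sync, so it cannot be used from another thread.
pub struct Escape<'scope, R> {
    id: u64,
    _not_send: PhantomData<*mut ()>,
    _scope: PhantomData<fn(&'scope ()) -> &'scope ()>, // invariant in 'scope
    _result: PhantomData<fn(R)>,
}

// Unwind payload carrying the result back to the matching `with_escape`.
struct Jump<R> {
    id: u64,
    value: R,
}

impl<'scope, R: Send + 'static> Escape<'scope, R> {
    // Leave the closure immediately, making `with_escape` return `value`.
    pub fn jump(&self, value: R) -> ! {
        panic::resume_unwind(Box::new(Jump { id: self.id, value }))
    }
}

pub fn with_escape<R, F>(body: F) -> R
where
    R: Send + 'static,
    F: for<'scope> FnOnce(&Escape<'scope, R>) -> R,
{
    let handle = Escape {
        id: NEXT_ID.fetch_add(1, Ordering::Relaxed),
        _not_send: PhantomData,
        _scope: PhantomData,
        _result: PhantomData,
    };
    let id = handle.id;
    match panic::catch_unwind(AssertUnwindSafe(|| body(&handle))) {
        Ok(value) => value,
        Err(payload) => match payload.downcast::<Jump<R>>() {
            Ok(jump) => {
                if jump.id == id {
                    jump.value
                } else {
                    // A jump aimed at an enclosing scope: keep unwinding.
                    panic::resume_unwind(jump)
                }
            }
            // An ordinary panic: propagate it unchanged.
            Err(other) => panic::resume_unwind(other),
        },
    }
}

// Usage: escape from a loop as soon as n * n exceeds `limit`.
fn first_root_above(limit: u32) -> u32 {
    with_escape(|esc| {
        for n in 1u32.. {
            if n * n > limit {
                esc.jump(n);
            }
        }
        0
    })
}

fn main() {
    assert_eq!(first_root_above(50), 8);
}
```

The handle is only reachable inside the closure and is not `Send`, so the jump can only happen on the thread that set up the scope and only while that scope is still live.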

RalfJ (Feb 18 2019 at 13:54, on Zulip):

so why do all loads/stores need to be volatile? that's weird

gnzlbg (Feb 18 2019 at 13:54, on Zulip):

basically because the C standard says so and that's what LLVM implements

RalfJ (Feb 18 2019 at 13:54, on Zulip):

I'd intuitively (without any idea what I am talking about :P ) expect that compiler barriers around setjmp and longjmp should suffice

RalfJ (Feb 18 2019 at 13:55, on Zulip):

well there's probably a reason they are saying this, some transformations that would otherwise break stuff or so

gnzlbg (Feb 18 2019 at 13:55, on Zulip):

#[returns_twice] elides certain optimizations, but not all of them

gnzlbg (Feb 18 2019 at 13:55, on Zulip):

i have one example, sec:

gnzlbg (Feb 18 2019 at 13:57, on Zulip):

This is UB even if setjmp is returns_twice:

unsafe fn foo() -> i32 {
    let mut buf: jmp_buf = [0; 8];
    let mut x = 42;
    if setjmp(&mut buf) != 0 {  // Step 0: setjmp returns 0
        // Step 3: when setjmp returns 1, x has always been
        // modified to be 13, so this should always return 13:
        return x;
    }
    x = 13; // Step 1: x is modified
    longjmp(&mut buf, 1); // Step 2: jumps to Step 0 returning 1
    x // this will never be reached
}

gnzlbg (Feb 18 2019 at 14:00, on Zulip):

That's because: http://port70.net/~nsz/c/c11/n1570.html#7.13.2.1p3

gnzlbg (Feb 18 2019 at 14:03, on Zulip):

Basically, longjmp states that, because x was modified between the setjmp and the longjmp, the behavior of performing a non-volatile read of it is undefined.

gnzlbg (Feb 18 2019 at 14:07, on Zulip):

We could allow that in Rust, but the API docs of the C FFI function would still say that this is UB.

RalfJ (Feb 18 2019 at 14:10, on Zulip):

so the problem is the compiler will move the store to after the longjmp?

RalfJ (Feb 18 2019 at 14:11, on Zulip):

hm, I see... compiler barriers don't help because of the "did the address escape" analysis compilers are doing.

RalfJ (Feb 18 2019 at 14:11, on Zulip):

that analysis is causing so much trouble :(

gnzlbg (Feb 18 2019 at 14:11, on Zulip):

the problem is that the compiler might const propagate x = 42 to return x because it assumes that if some code changed it, the user needs to perform a volatile read, but it is not doing that here

gnzlbg (Feb 18 2019 at 14:12, on Zulip):

basically, LLVM returns_twice maps to C's semantics, where if a volatile read is not used, LLVM assumes that the value was not modified between the setjmp and the longjmp.

gnzlbg (Feb 18 2019 at 14:13, on Zulip):

the LLVM returns_twice attribute disables other optimizations, but not this one

gnzlbg (Feb 18 2019 at 14:13, on Zulip):

since for C, that optimization is correct

gnzlbg (Feb 18 2019 at 14:14, on Zulip):

(if the compiler cannot prove that the value was not modified, it would need to insert a volatile load)
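
For reference, the C rule cited above (7.13.2.1p3) only makes the values of non-volatile locals indeterminate when they were changed between the setjmp and the longjmp. Rust has no volatile-qualified locals; the closest tool is a volatile access through a raw pointer. The sketch below shows only what such accesses look like, with no setjmp or longjmp involved. Whether using them would make the `foo` example well-defined in Rust is an assumption the chat does not settle.

```rust
use std::ptr;

// Write and read a local through volatile accesses. The compiler may not
// elide these accesses or replace the load with a previously known value.
fn volatile_roundtrip() -> i32 {
    let mut x = 42;
    let p: *mut i32 = &mut x;
    // SAFETY: `p` points to a live, properly aligned, initialized i32.
    unsafe {
        ptr::write_volatile(p, 13);
        ptr::read_volatile(p)
    }
}

fn main() {
    assert_eq!(volatile_roundtrip(), 13);
}
```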

nagisa (Feb 18 2019 at 14:56, on Zulip):

setjmp/longjmp = unsafe & problems solved.

nagisa (Feb 18 2019 at 14:56, on Zulip):

:)

gnzlbg (Feb 18 2019 at 15:38, on Zulip):

@RalfJ so I was trying to add a summary comment of the discussion before modifying the RFC, and I am not sure I fully understood: fn free_box_without_drop(x: Box<T>) { mem::forget(*x) }. That function moves the content of the box out of the box into stack storage, then drops the empty Box, right? That's not the exact same thing as freeing the box without dropping its contents and without moving its contents into the stack.

gnzlbg (Feb 18 2019 at 15:39, on Zulip):

Is it possible to free the box memory without dropping its contents and without using extra stack memory in safe Rust? (that would be more similar to what non-local goto does). I am not sure if the difference matters though.

RalfJ (Feb 18 2019 at 15:41, on Zulip):

> @RalfJ so I was trying to add a summary comment of the discussion before modifying the RFC, and I am not sure I fully understood: fn free_box_without_drop(x: Box<T>) { mem::forget(*x) }. That function moves the content of the box out of the box into stack storage, then drops the empty Box, right? That's not the exact same thing as freeing the box without dropping its contents and without moving its contents into the stack.

the memcpy of the contents to the stack is not observable (ignoring stack overflow^^)

RalfJ (Feb 18 2019 at 15:41, on Zulip):

so I'd argue this is exactly dropping the Box without dropping its contents

RalfJ (Feb 18 2019 at 15:41, on Zulip):

actually I'd expect LLVM will optimize away the useless memcpy

gnzlbg (Feb 18 2019 at 15:44, on Zulip):

@RalfJ i think the issue with the longjmp is that it takes ownership of all the stack frames

gnzlbg (Feb 18 2019 at 15:45, on Zulip):

moves them out, and frees them

RalfJ (Feb 18 2019 at 15:46, on Zulip):

there's many ways to describe the same issue ;)

gnzlbg (Feb 18 2019 at 15:46, on Zulip):

but you can't move a Pin (that's the whole point), so the problem with longjmp is a move of a !Move type, or however we would express Pin in the type system if we could

gnzlbg (Feb 18 2019 at 15:48, on Zulip):

that is, even though Rust does not have Move (or !Move), by blessing Pin, we have actually added them to the language

gnzlbg (Feb 18 2019 at 15:56, on Zulip):

So this is what I've written as a summarizing comment. I don't know if there is a more succinct or clear way to put this.


The RFC is incorrect, because deallocating memory without calling destructors is ok in safe Rust, e.g., fn free_box_without_drop(x: Box<T>) { mem::forget(*x) }.

Notice how, to deallocate the Box without dropping its contents, you have to move the contents of the box out first.

The issue triggered by non-local goto's and Pin, is deallocating pinned memory. Doing that would require moving the object out of the Pin first, forgetting it to avoid running destructors, and then deallocating the memory, but one cannot move out of a Pin.

So what non-local goto can introduce, is a move out of a "!Move" type, which violates the type-system invariants, and is undefined behavior.

RalfJ (Feb 18 2019 at 19:04, on Zulip):

that's not wrong, but IMO you are mixing up cause and effect here

RalfJ (Feb 18 2019 at 19:05, on Zulip):

really what we have added to the language by blessing Pin is that you cant just deallocate stuff, even if you are never going to look at it again

RalfJ (Feb 18 2019 at 19:05, on Zulip):

in the general case, you might have to "ask for permission"

RalfJ (Feb 18 2019 at 19:05, on Zulip):

I wouldn't call that !Move

RalfJ (Feb 18 2019 at 19:06, on Zulip):

but it does have somewhat similar effects, I agree with that

gnzlbg (May 08 2019 at 13:59, on Zulip):

@Nemo157 pointed out that deallocating a Pin without calling its destructor is not unsafe, because a Pin can only contain a reference to the value being allocated, but it cannot contain the value itself in line. So if the Pin is freed, that's ok, because no other code can have references to that Pin, only to the value it refers to, and that value would have been leaked.

gnzlbg (May 08 2019 at 14:00, on Zulip):

e.g. if you have a Pin<Box<T>> jumping over the pin frees the Box memory, but does not free the memory the Box refers to

gnzlbg (May 08 2019 at 14:03, on Zulip):

the problem would be to have a Pin<&mut T> that refers to a T, where the T is freed without calling its destructor (e.g. because the T is jumped over)

RalfJ (May 08 2019 at 14:26, on Zulip):

> e.g. if you have a Pin<Box<T>> jumping over the pin frees the Box memory, but does not free the memory the Box refers to

why would it free the Box memory? the destructor for all of this never gets called.

gnzlbg (May 08 2019 at 14:28, on Zulip):

i mean the memory of the Box value itself, not the memory it manages

RalfJ (May 08 2019 at 14:46, on Zulip):

but then where is the problem? that's the same as doing mem::forget on a Pin<Box<T>>

gnzlbg (May 08 2019 at 14:51, on Zulip):

no problem, just clarifying that the reason stack-pinning is unsafe with setjmp, is not because one can deallocate Pin<&mut T> without calling Pin's destructor, but because that &mut T can refer to the stack, and that T can be deallocated without calling its destructor

RalfJ (May 08 2019 at 14:53, on Zulip):

that is accurate
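
A minimal sketch of why the stack-pinned case matters, assuming a hypothetical `Registry` / `Node` pair that mimics an intrusive collection by recording node addresses. It uses `std::pin::pin!`, which was added to the standard library after this discussion took place (at the time, stack pinning went through the pin-utils crate).

```rust
use std::cell::RefCell;
use std::marker::PhantomPinned;
use std::pin::{pin, Pin};

// Hypothetical registry that remembers the addresses of its members.
struct Registry {
    members: RefCell<Vec<usize>>,
}

// Hypothetical member; once pinned, its address is handed to the registry.
struct Node<'r> {
    registry: &'r Registry,
    _pin: PhantomPinned,
}

impl<'r> Node<'r> {
    fn new(registry: &'r Registry) -> Self {
        Node { registry, _pin: PhantomPinned }
    }

    // Requires a pinned reference: the recorded address must stay valid
    // until the destructor has removed it again.
    fn register(self: Pin<&Self>) {
        let this: &Self = self.get_ref();
        this.registry.members.borrow_mut().push(this as *const Self as usize);
    }
}

impl Drop for Node<'_> {
    fn drop(&mut self) {
        // Unregister before the storage goes away.
        let addr = &*self as *const Self as usize;
        self.registry.members.borrow_mut().retain(|a| *a != addr);
    }
}

// Returns the number of registered addresses left after the node's scope ends.
fn members_after_scope() -> usize {
    let registry = Registry { members: RefCell::new(Vec::new()) };
    {
        let node = pin!(Node::new(&registry)); // pinned to this stack frame
        node.as_ref().register();
        assert_eq!(registry.members.borrow().len(), 1);
    } // normal scope exit: Drop runs and removes the address
    let remaining = registry.members.borrow().len();
    remaining
}

fn main() {
    assert_eq!(members_after_scope(), 0);
}
```

If the inner block were left via a longjmp-style jump instead of a normal exit, the destructor would be skipped and the registry would keep the address of a dead stack slot. That is the situation the drop guarantee rules out.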

gnzlbg (May 08 2019 at 14:55, on Zulip):

i don't know if deallocating the storage for that T "moves it", which is what the Pin docs say should not happen, but it certainly invalidates any references to the type

gnzlbg (May 08 2019 at 14:56, on Zulip):

which is what the Pin docs might want to, at least, note

gnzlbg (May 08 2019 at 14:56, on Zulip):

https://doc.rust-lang.org/nightly/std/pin/struct.Pin.html

RalfJ (May 08 2019 at 14:57, on Zulip):

@gnzlbg https://doc.rust-lang.org/nightly/std/pin/index.html#drop-guarantee

RalfJ (May 08 2019 at 14:57, on Zulip):

in there it talks about deallocating the storage

gnzlbg (May 08 2019 at 14:57, on Zulip):

lol, i've never seen that page

gnzlbg (May 08 2019 at 14:57, on Zulip):

I've always just looked at pin::Pin haha

RalfJ (May 08 2019 at 14:58, on Zulip):

you have seen that the only thing it says there is "go to module docs please"?^^

RalfJ (May 08 2019 at 14:58, on Zulip):

as someone who spent lots of time writing that particular doc, I'd appreciate feedback for making people actually go there ;)

gnzlbg (May 08 2019 at 15:02, on Zulip):

Make it bold

RalfJ (May 08 2019 at 15:02, on Zulip):

and 20pt? :P

gnzlbg (May 08 2019 at 15:02, on Zulip):

I saw that, but I supposed that was just to show which other types are in the module or something

gnzlbg (May 08 2019 at 15:02, on Zulip):

i was totally wrong

gnzlbg (May 08 2019 at 15:03, on Zulip):

well played

RalfJ (May 08 2019 at 15:03, on Zulip):

there's precedent for making it italic though: https://doc.rust-lang.org/nightly/std/primitive.i32.html

gnzlbg (May 08 2019 at 15:23, on Zulip):

Maybe one could add a **Note**: the documentation for this type lives in the module documentation and not here or something like that, that makes it more clear that this page is not what the type documentation is intended to be

RalfJ (May 08 2019 at 15:26, on Zulip):

I disagree with the statement that this is how it is "intended to be"

RalfJ (May 08 2019 at 15:26, on Zulip):

pinning-the-concept != pin-the-type

RalfJ (May 08 2019 at 15:27, on Zulip):

so module-level docs seem like the right place to me. that's why I put it there ;)

gnzlbg (May 08 2019 at 15:28, on Zulip):

i mean, the type documentation is intended to just be the "brief" comment that IDEs show (the first line), the actual documentation (about how to use the type, or the context in which the type is intended to be used) lives in the module

Last update: Nov 20 2019 at 13:05UTC