Stream: wg-async-foundations

Topic: drop unwind soundness


tmandry (May 13 2019 at 22:56, on Zulip):

Hey @RalfJ, I have a UB question in regard to the optimization for #52924

tmandry (May 13 2019 at 22:56, on Zulip):

Not sure if you're the one to ask, but you seem most likely to know :)

tmandry (May 13 2019 at 22:57, on Zulip):

If a drop panics, is it safe to assume that the value is StorageDead?

tmandry (May 13 2019 at 22:57, on Zulip):

or could an unwind handler actually inspect the value?

tmandry (May 13 2019 at 22:58, on Zulip):

I know we certainly cannot drop the value _again_ (see #14875)

tmandry (May 13 2019 at 22:59, on Zulip):

but that issue doesn't seem to say anything about anything else we could do with the value

Taylor Cramer (May 14 2019 at 00:31, on Zulip):

@tmandry well fwiw this works today, but MIRI complains because of the panic so I'm not sure that's enough to say anything: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=92f97259d8b64bc5cc0c3ec97076cab9

tmandry (May 14 2019 at 00:41, on Zulip):

hmm ok

tmandry (May 14 2019 at 00:41, on Zulip):

well I found a different way to avoid regressing the size of that one test case, so maybe it's not that important ;)

tmandry (May 14 2019 at 00:42, on Zulip):

but if we can assume it, it still makes some of the other futures even smaller

RalfJ (May 14 2019 at 07:16, on Zulip):

If a drop panics, is it safe to assume that the value is StorageDead?

StorageDead is for locals only, if the value was in a Box there's no such thing

RalfJ (May 14 2019 at 07:16, on Zulip):

I know we certainly cannot drop the value _again_ (see #14875)

you cannot drop it again in general, but specific code knowing the specific drop impl might do that

RalfJ (May 14 2019 at 07:17, on Zulip):

like, I can drop an i32 as often as I want, knowing that that is a NOP

RalfJ (May 14 2019 at 07:18, on Zulip):

but in general I think there are many open questions around drop and panics, also see https://github.com/rust-lang/rust/issues/60611

RalfJ (May 14 2019 at 07:19, on Zulip):

what is the optimization you have in mind?

tmandry (May 14 2019 at 17:07, on Zulip):

@RalfJ yes, I mean is it safe to assume the local itself is StorageDead

tmandry (May 14 2019 at 17:08, on Zulip):

To ask another way: Is it UB to inspect the contents of a local, after drop on that local panics?

Taylor Cramer (May 14 2019 at 17:23, on Zulip):

I guess TL;DR is also "is my code example above UB"

RalfJ (May 14 2019 at 17:26, on Zulip):

To ask another way: Is it UB to inspect the contents of a local, after drop on that local panics?

I don't think we can say that in general -- I am not sure through which mechanism this UB would arise.
I guess you also want writes to that memory to be UB, so just making it uninitialized is not sufficient for your purpose?

RalfJ (May 14 2019 at 17:27, on Zulip):

"killing the storage and invalidating all pointers" is what StorageDead is for.

RalfJ (May 14 2019 at 17:27, on Zulip):

why do you want to tie this analysis to drop instead of StorageDead?

tmandry (May 14 2019 at 17:30, on Zulip):

@RalfJ yeah, writes would also need to be UB

tmandry (May 14 2019 at 17:30, on Zulip):

mainly, I want it to simplify the MIR generation

RalfJ (May 14 2019 at 17:30, on Zulip):

you are exactly asking for StorageDead semantics then

RalfJ (May 14 2019 at 17:30, on Zulip):

I wouldn't know how to make drop have that -- in particular if the drop didnt even complete

Taylor Cramer (May 14 2019 at 17:31, on Zulip):

FWIW I also need a way to mark things storagedead in code manually without moving them-- TLDR i want to make drop more magic ;)

RalfJ (May 14 2019 at 17:31, on Zulip):

I don't like magic ;)

RalfJ (May 14 2019 at 17:32, on Zulip):

but I have actually brought up a non-moving version of mem::forget in a discussion once^^

tmandry (May 14 2019 at 17:32, on Zulip):

actually, I need to change the MIR generation regardless, and I'm not 100% sure what changes I'm allowed to make...

RalfJ (May 14 2019 at 17:32, on Zulip):

some kind of intrinsic

tmandry (May 14 2019 at 17:33, on Zulip):

in particular it isn't clear in which parts we decide not to generate StorageDead purely as an optimization, and which parts are due to semantics

RalfJ (May 14 2019 at 17:34, on Zulip):

I'm afraid I don't know either :/ as far as I am concerned, where we insert StorageDead is part of our spec -- we get to make that choice, and that defines which code has UB. but we better document our choice well.

tmandry (May 14 2019 at 17:34, on Zulip):

fair enough

Taylor Cramer (May 14 2019 at 17:35, on Zulip):

having it be part of our spec that the semantics are different for generators would probably be bad XD

tmandry (May 14 2019 at 17:35, on Zulip):

agreed :)

RalfJ (May 14 2019 at 17:35, on Zulip):

yeah sounds like it would be nice if functions and generators would be consistent about this :D

tmandry (May 14 2019 at 17:35, on Zulip):

I wouldn't know how to make drop have that

@RalfJ can you elaborate on your concern here?

RalfJ (May 14 2019 at 17:36, on Zulip):

it's less a concern and more literally "I dont know how to define a semantics that does this". drop is almost a normal function call for Miri, just invoked by a dedicated MIR terminator instead of Call.

RalfJ (May 14 2019 at 17:36, on Zulip):

there's no magic for drop and I quite like that :D keeps the spec simpler and easier to reason about (for compiler and unsafe code authors alike)

tmandry (May 14 2019 at 17:37, on Zulip):

hmm, yes, but eliding StorageDead in some cases certainly makes this harder to reason about

tmandry (May 14 2019 at 17:38, on Zulip):

..let me try and remember why this matters :)

RalfJ (May 14 2019 at 17:38, on Zulip):

I guess TL;DR is also "is my code example above UB"

I think if Miri supported panics it would reject this code based on Stacked Borrows considerations -- the call to drop involves a mutable borrow of the local, which kills all previously created pointers.

RalfJ (May 14 2019 at 17:39, on Zulip):

Whether that is a happy accident or something one can derive a guarantee from, I am not sure.^^

RalfJ (May 14 2019 at 17:40, on Zulip):

also the mutable borrow fordrop is a bit special, so it might or might not do what you want here

RalfJ (May 14 2019 at 17:40, on Zulip):

but we could reasonably define it to act like an &mut would

Taylor Cramer (May 14 2019 at 17:40, on Zulip):

That sounds like the guarantee I would want here

RalfJ (May 14 2019 at 17:41, on Zulip):

you can simulate this effect as follows: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=bb17a99761dc14aecb6aed29ff62c5bc

RalfJ (May 14 2019 at 17:41, on Zulip):

(where the let _val = &mut st; emulates the borrow that would happen for the drop)

tmandry (May 14 2019 at 17:42, on Zulip):

Right, so today if we drop a local and the drop panics, then along the cleanup path (where we continue dropping all other locals in scope) we never generate StorageDead for any local on that path

tmandry (May 14 2019 at 17:42, on Zulip):

and then we resume (the unwinding)

RalfJ (May 14 2019 at 17:43, on Zulip):

Right, so today if we drop a local and the drop panics, then along the cleanup path (where we continue dropping all other locals in scope) we never generate StorageDead for any local on that path

also doesn't the panic hook get called before we even get to the cleanup path?

RalfJ (May 14 2019 at 17:43, on Zulip):

not sure what the evaluation order is here

tmandry (May 14 2019 at 17:44, on Zulip):

hm, I would think so, yes

RalfJ (May 14 2019 at 17:44, on Zulip):

in that case a StorageDead on the cleanup path would be too late anyway

tmandry (May 14 2019 at 17:46, on Zulip):

Right. On second thought, for my use case, I only care about StorageDead for locations reachable within the function

tmandry (May 14 2019 at 17:48, on Zulip):

Right now, my analysis pass thinks that all these locals are still StorageLive along the cleanup path

tmandry (May 14 2019 at 17:50, on Zulip):

even for locals in disjoint scopes in the source code

tmandry (May 14 2019 at 17:50, on Zulip):

(this affects whether we can safely overlap them in the generator layout)

tmandry (May 14 2019 at 17:51, on Zulip):

So, for the purposes of my analysis, I think it should be valid to assume that once drop returns (along the return edge or the unwind edge), the local that was dropped is now StorageDead

tmandry (May 14 2019 at 17:52, on Zulip):

This actually doesn't preclude inspecting the local from within the panic hook

tmandry (May 14 2019 at 17:52, on Zulip):

(so, sorry for posing the wrong question :) )

tmandry (May 14 2019 at 17:54, on Zulip):

Equivalently (from my perspective), we can generate the StorageDead statements to reflect this

tmandry (May 14 2019 at 17:54, on Zulip):

It sounds like doing this would be better from miri's perspective

tmandry (May 14 2019 at 17:54, on Zulip):

But might impact performance :(

tmandry (May 14 2019 at 17:54, on Zulip):

Is this making sense @Taylor Cramer @RalfJ ?

Taylor Cramer (May 14 2019 at 17:56, on Zulip):

Yes, that sounds right to me.

RalfJ (May 14 2019 at 18:52, on Zulip):

So, for the purposes of my analysis, I think it should be valid to assume that once drop returns (along the return edge or the unwind edge), the local that was dropped is now StorageDead

do you mean "sufficient to assume"?

RalfJ (May 14 2019 at 18:55, on Zulip):

so, let me try to paraphrase: you think that it makes sense for our surface language spec to basically say that right after the drop comes back (on both the return and the unwind edge), there is a StorageDead. And that in turn you say is sufficient for your analysis to be as precise as you want it to be. You are fine with the storage hook inspecting stuff, which is actually allowed as that happens before the unwind edge is taken, i.e. before the StorageDead happens (ignoring Stacked Borrows concerns here). However, you are worried that actually materializing these StorageDead has a negative performance impact.
Is that accurate?

RalfJ (May 14 2019 at 18:57, on Zulip):

I agree that it seems reasonable to say that we have a StorageDead for locals that just got dropped. I don't know your analysis so I cannot say what this entails for you, but I view StorageDead as basically acting like free would for the heap -- after that, there's no way to ever access the memory that used to back this local again (and if StorageLive happens later, that's a new location, the old pointers remain dangling).
What is the performance impact you are worried about?

tmandry (May 14 2019 at 20:42, on Zulip):

@RalfJ yes, that sounds accurate

tmandry (May 14 2019 at 20:43, on Zulip):

@RalfJ materializing StorageDead after both edges of a drop sometimes means we need to keep around more basic blocks, in addition to more statements

tmandry (May 14 2019 at 20:46, on Zulip):

I haven't run the profiler with that change specifically, but when I made a change to add StorageDead for non-Drop locals along paths where they used to be elided, the performance hit was pretty bad

tmandry (May 14 2019 at 20:46, on Zulip):

(so we had to do it for generators only, to enable the optimization without sacrificing performance)

tmandry (May 14 2019 at 20:47, on Zulip):

anyway, I'm inferring that we'll see a similar hit if I materialize these extra StorageDeads, but I'm not 100% sure :)

tmandry (May 14 2019 at 20:56, on Zulip):

also, for those locals we had to emit StorageDrop because otherwise the information was just not in the MIR

tmandry (May 14 2019 at 20:56, on Zulip):

for locals which are drop we can infer it

tmandry (May 14 2019 at 20:56, on Zulip):

although it's quite messy conceptually to be handling these two differently

tmandry (May 14 2019 at 21:00, on Zulip):

there isn't anything too surprising about it, but it pushes the complexity down to the consumers of MIR

centril (May 14 2019 at 22:10, on Zulip):

If new guarantees are added, this should be summarized and raised with the lang team

tmandry (May 14 2019 at 22:23, on Zulip):

@centril it's a guarantee to the compiler, not people who are writing rust

tmandry (May 14 2019 at 22:23, on Zulip):

though I'm not sure if this distinction matters

centril (May 14 2019 at 22:23, on Zulip):

Probably not

centril (May 14 2019 at 22:24, on Zulip):

It would be good to phrase the guarantee succinctly in some issue and say why it is needed

centril (May 14 2019 at 22:24, on Zulip):

and what problems it may cause

tmandry (May 14 2019 at 22:24, on Zulip):

Sure, I'll open an issue

tmandry (May 14 2019 at 22:25, on Zulip):

I don't anticipate it being controversial, but it would be good to have it written down succinctly as you say

tmandry (May 14 2019 at 22:28, on Zulip):

This is basically documenting certain assumptions that the compiler already seems to make

tmandry (May 14 2019 at 22:31, on Zulip):

@centril The assumption is that once drop returns, whether it succeeds or panics, you can't access the local you tried to drop

tmandry (May 14 2019 at 22:32, on Zulip):

..and doing so would cause UB

tmandry (May 14 2019 at 22:32, on Zulip):

Is this something the lang team actually needs to discuss?

Zoxc (May 14 2019 at 22:34, on Zulip):

This only applies to the MIR drop terminator, not drops in general

tmandry (May 14 2019 at 22:37, on Zulip):

@Zoxc true, but don't most drops involve a MIR drop terminator?

tmandry (May 14 2019 at 23:00, on Zulip):

@centril I cc'd lang on #60840

Last update: Nov 18 2019 at 00:45UTC