Stream: project-ffi-unwind

Topic: PR #86155


view this post on Zulip nikomatsakis (Jul 30 2021 at 13:32):

So--- I finally got around to reading #86155.

view this post on Zulip nikomatsakis (Jul 30 2021 at 13:33):

It looks really great.

view this post on Zulip nikomatsakis (Jul 30 2021 at 13:33):

I'm trying to figure out, @RalfJ, if your concerns have been addressed

view this post on Zulip nikomatsakis (Jul 30 2021 at 13:35):

As far as I can tell, the new pass:

So the semantics of a call for miri are:

If you ask for the MIR immediately post build, we'll have plenty of terminators that indicate potential unwinding which will be optimized away later, but it's not really a "change in the semantics of MIR".

Does that sound right?

view this post on Zulip nikomatsakis (Jul 30 2021 at 13:36):

(cc @Alex Crichton)

view this post on Zulip Alex Crichton (Jul 30 2021 at 14:03):

I think so yeah, sounds right to me!

view this post on Zulip nikomatsakis (Jul 30 2021 at 15:21):

I'm going to r+

view this post on Zulip nikomatsakis (Jul 30 2021 at 15:21):

not a 1-way door :)

view this post on Zulip nikomatsakis (Jul 30 2021 at 15:21):

needs rebase, though

view this post on Zulip Alex Crichton (Jul 30 2021 at 15:24):

oh oops done now

view this post on Zulip RalfJ (Jul 30 2021 at 17:34):

If you ask for the MIR immediately post build, we'll have plenty of terminators that indicate potential unwinding which will be optimized away later, but it's not really a "change in the semantics of MIR".

I don't follow. I think there is a shift in the semantics of MIR: when a "function that shouldn't be able to unwind" contains a terminator (from a fn call) that can unwind, this is UB at codegen time. But right after MIR building, this actually occurs in safe code. Ergo there must be a dialect change somewhere.

view this post on Zulip RalfJ (Jul 30 2021 at 17:35):

This is not blocking the PR (in particular now that this shift happens very early in the pipeline), but needs to be addressed carefully if/when we ever precisely document MIR semantics. Cc https://github.com/rust-lang/rust/issues/86299

view this post on Zulip nikomatsakis (Jul 30 2021 at 18:23):

RalfJ said:

I don't follow. I think there is a shift in the semantics of MIR: when a "function that shouldn't be able to unwind" contains a terminator (from a fn call) that can unwind, this is UB at codegen time. But right after MIR building, this actually occurs in safe code. Ergo there must be a dialect change somewhere.

Hmm, I don't think it's UB at any point in time, unless that function actually unwinds. What am I missing?

view this post on Zulip nikomatsakis (Jul 30 2021 at 18:24):

I'm not sure I know what "UB at codegen time" even means, actually

view this post on Zulip nikomatsakis (Jul 30 2021 at 18:25):

regardless I agree we should note it in #86299 or some other tracking issue

view this post on Zulip Connor Horman (Jul 30 2021 at 20:41):

nikomatsakis said:

I'm not sure I know what "UB at codegen time" even means, actually

Sounds like it would be equivalent to C++'s IFNDR (ill-formed, no diagnostic required): the program is ill-formed, and the implementation could diagnose it and terminate compilation (or diagnose it for fun and keep compiling anyway), but it doesn't have to, and the resulting program would not be bound by the as-if rule at all.

view this post on Zulip RalfJ (Jul 31 2021 at 09:03):

nikomatsakis said:

I'm not sure I know what "UB at codegen time" even means, actually

sorry, I meant it is UB in the operational semantics that describe program behavior at codegen time (in the dialect that MIR is written in at codegen time)

view this post on Zulip RalfJ (Jul 31 2021 at 09:03):

as in, if it's not UB, codegen is just wrong

view this post on Zulip RalfJ (Jul 31 2021 at 09:07):

I am talking about code like this (assume panic=abort):

extern "C-unwind" { fn can_unwind(); }

fn can_not_unwind() {
  bb0:
    Call can_unwind() [ ret -> bb1 ] // implicit: unwinding is propagated

  bb1:
    Return
}

When we have this MIR at codegen time, we generate a nounwind attribute for can_not_unwind. So if can_unwind unwinds, we have UB.

However, we generate exactly that MIR from safe code. So in the MIR we initially generate, this cannot be UB.

Ergo the rules for UB changed some time in between.

view this post on Zulip BatmanAoD (Kyle Strand) (Aug 05 2021 at 00:01):

Apologies for asking a question with vast ignorance: in that code, where does the landing pad come into play? can_not_unwind really can't unwind, and if can_unwind does unwind, a landing pad in can_not_unwind needs to abort the process.

view this post on Zulip RalfJ (Aug 06 2021 at 12:53):

and if can_unwind does unwind, a landing pad in can_not_unwind needs to abort the process.

yes it is my understanding that @Alex Crichton's PR adds a pass that will add such a landing pad

view this post on Zulip RalfJ (Aug 06 2021 at 12:54):

so after that pass we know that an unwind from a callee can never propagate out of a nounwind function, so we can change the MIR dialect to one where doing so would be UB. this, in turn, justifies the code that codegen emits for LLVM (where such an unwind escaping a nounwind function is UB).
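
A minimal sketch of that argument as a toy Rust model. Nothing here is rustc's real MIR or the actual pass from the PR: `Body`, `Call`, `OnUnwind`, `abort_unwinding_calls`, and `valid_in_codegen_dialect` are hypothetical names made up for illustration. The idea is only that the freshly built body is not valid in the stricter "codegen" dialect, and becomes valid once every possibly-unwinding call in a nounwind function gets an aborting landing pad.

```rust
// Toy model only: these types are hypothetical and far simpler than rustc's MIR.

/// What happens if the callee of a call terminator unwinds.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum OnUnwind {
    /// No cleanup edge: the unwind propagates out of the calling function.
    Propagate,
    /// A landing pad that aborts the process.
    Abort,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct Call {
    callee: &'static str,
    /// True if the callee's ABI allows it to unwind.
    callee_may_unwind: bool,
    on_unwind: OnUnwind,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct Body {
    name: &'static str,
    /// True if codegen would mark this function `nounwind`.
    nounwind: bool,
    calls: Vec<Call>,
}

/// Sketch of the pass: in a nounwind body, give every call that may
/// unwind an aborting landing pad instead of letting the unwind propagate.
fn abort_unwinding_calls(body: &mut Body) {
    if !body.nounwind {
        return;
    }
    for call in &mut body.calls {
        if call.callee_may_unwind && call.on_unwind == OnUnwind::Propagate {
            call.on_unwind = OnUnwind::Abort;
        }
    }
}

/// Check for the stricter dialect assumed at codegen time: a nounwind body
/// must not contain a call whose unwind could propagate out of it.
fn valid_in_codegen_dialect(body: &Body) -> bool {
    let unwind_can_escape = body
        .calls
        .iter()
        .any(|c| c.callee_may_unwind && c.on_unwind == OnUnwind::Propagate);
    !(body.nounwind && unwind_can_escape)
}

/// The example from this thread, as it would look right after MIR building.
fn example_body() -> Body {
    Body {
        name: "can_not_unwind",
        nounwind: true,
        calls: vec![Call {
            callee: "can_unwind",
            callee_may_unwind: true,
            on_unwind: OnUnwind::Propagate,
        }],
    }
}
```

In this model, `valid_in_codegen_dialect(&example_body())` is false before the pass runs and true afterwards, which is the "dialect change" in miniature: the same body is acceptable in the initial dialect but only satisfies the codegen-time rules once the pass has run.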


Last updated: Jan 26 2022 at 09:02 UTC