Stream: project-ffi-unwind

Topic: implementing "abort on FFI call"


nikomatsakis (Nov 13 2019 at 15:21, on Zulip):

So I was thinking about how to measure the impact of a proposal where the C ABI permits unwinding but -- if we are -Cpanic=abort -- we abort if unwinding actually occurs (versus UB).

I think the way to do this would be to modify the no_landing_pads transformation. Right now, it just removes all landing pads. I think what we would actually want is to check if this is a function call and -- in the case where it is -- inspect the ABI and callee. If the ABI is not Rust (or Rust call), and the callee is not known, then we would instead rewrite the unwind to branch to a block with TerminatorKind::Abort.

We might or might not have such a block today, so we can (lazilly, and at most once) create such a block.

nikomatsakis (Nov 13 2019 at 15:21, on Zulip):

There might be some other assertions that need to be adjusted.

nikomatsakis (Nov 13 2019 at 15:43, on Zulip):

I put in some work towards implementing this in the abort-on-ffi-call branch -- not building yet or quite complete, but almost there (I think..?)

nikomatsakis (Nov 13 2019 at 16:19, on Zulip):

OK I think this is building now, but I really have to stop and do other stuff

nikomatsakis (Nov 13 2019 at 16:45, on Zulip):

(OK, it builds, but it doesn't work; I don't quite know what's going on, something to do with the remove_noop_landing_pads pass...)

nikomatsakis (Nov 13 2019 at 16:46, on Zulip):

oh, it's cause the landing pad has nothing to do I guess

nikomatsakis (Nov 14 2019 at 01:47, on Zulip):

ok, I think that is working now -- the MIR looks right, but it's sort of hard to test. I tried hacking up some C++ function that throws even though it's not supposed to, but C++ keeps calling std::terminate since it can't find a catch handler (true even when I put the Rust function sandwiched between two C++ functions, and one of them has a catch)

Kyle Strand (Nov 14 2019 at 03:39, on Zulip):

Wow, the Rust sandwich breaks catch? I did not expect that at all.

nikomatsakis (Nov 14 2019 at 10:34, on Zulip):

I can only presume it has to do with interrupting the ability to walk up the stack and find the catch handler somehow? This is true on linux, at least.

nikomatsakis (Nov 14 2019 at 10:35, on Zulip):

Well hey, it does abort... :)

Amanieu (Nov 14 2019 at 13:48, on Zulip):

@nikomatsakis What does the LLVM IR look like?

gnzlbg (Nov 14 2019 at 13:49, on Zulip):

nikomatsakis 2:47 AM
ok, I think that is working now -- the MIR looks right, but it's sort of hard to test. I tried hacking up some C++ function that throws even though it's not supposed to, but C++ keeps calling std::terminate since it can't find a catch handler (true even when I put the Rust function sandwiched between two C++ functions, and one of them has a catch)

noexcept C++ functions have an "abort-on-panic" shim that calls std::terminate

gnzlbg (Nov 14 2019 at 13:50, on Zulip):

@nikomatsakis one needs to create a normal C++ function that throws, and then just use extern "C" { ... }

nikomatsakis (Nov 14 2019 at 13:51, on Zulip):

@gnzlbg I did not declare anything noexcept

nikomatsakis (Nov 14 2019 at 13:51, on Zulip):

I did create a normal C++ function

nikomatsakis (Nov 14 2019 at 13:51, on Zulip):

I'll push the project

nikomatsakis (Nov 14 2019 at 13:51, on Zulip):

@Amanieu I'm not sure :) I have to look how to save the LLVM IR, I forget

gnzlbg (Nov 14 2019 at 13:51, on Zulip):

I'll take a look

nikomatsakis (Nov 14 2019 at 13:56, on Zulip):

@gnzlbg I pushed my example to this branch -- it was the rust-to-cpp and cpp-to-rust (sandwich)

nikomatsakis (Nov 14 2019 at 13:57, on Zulip):

in both cases, I get the same error

nikomatsakis (Nov 14 2019 at 13:57, on Zulip):

looking at the c++ referenece manual, it suggested that if there is no catch block then C++ will terminate (which fits what I'm seeing)

nikomatsakis (Nov 14 2019 at 13:57, on Zulip):

it works without Rust in between

nikomatsakis (Nov 14 2019 at 13:57, on Zulip):

(that is, the throw gets caught)

nikomatsakis (Nov 14 2019 at 13:59, on Zulip):

@Amanieu I think the llvm output is in this gist

nikomatsakis (Nov 14 2019 at 13:59, on Zulip):

I guess it has nounwind and other stuff

nikomatsakis (Nov 14 2019 at 13:59, on Zulip):

oh wait

nikomatsakis (Nov 14 2019 at 13:59, on Zulip):

that wasn't built w/ my branch

nikomatsakis (Nov 14 2019 at 14:00, on Zulip):

llvm output from my branch

nikomatsakis (Nov 14 2019 at 14:02, on Zulip):

(this is a debug build)

Amanieu (Nov 14 2019 at 14:04, on Zulip):

@nikomatsakis Your function needs the uwtable attribute, otherwise dwarf tables aren't generated and the unwinder doesn't know how to unwind past your function to find the actual catch block in the caller.

nikomatsakis (Nov 14 2019 at 14:05, on Zulip):

@Amanieu ok -- would it make sense to have that only for the "leaf functions" that actually invoke FFI?

nikomatsakis (Nov 14 2019 at 14:05, on Zulip):

(after all, I'm going to catch any unwinding and trap)

Amanieu (Nov 14 2019 at 14:06, on Zulip):

Also in optimized builds the landing pad is probably going to be optimized away by LLVM as unreachable since triple_input is nounwind.

nikomatsakis (Nov 14 2019 at 14:07, on Zulip):

oh yeah, I need to supress that I guess

Amanieu (Nov 14 2019 at 14:07, on Zulip):

I think that just having no uwtable is good enough in practice to abort. The shim should only be needed on targets that require uwtable to be present for other reasons

Amanieu (Nov 14 2019 at 14:08, on Zulip):

Hmm, actually this only works for targets that use dwarf unwinding. SjLj unwinding doesn't use unwinding tables and always works. And I have no idea wtf SEH will do here.

Amanieu (Nov 14 2019 at 14:10, on Zulip):

The good news is, the overhead for abort on extern "C" unwinding is going to be 0 in most cases since it already aborts :)

gnzlbg (Nov 14 2019 at 14:13, on Zulip):

looking at the c++ referenece manual, it suggested that if there is no catch block then C++ will terminate (which fits what I'm seeing)

This makes sense. When unwinding hits a thread boundary, C++ calls std::terminate.

nikomatsakis (Nov 14 2019 at 14:13, on Zulip):

Hmm, actually this only works for targets that use dwarf unwinding. SjLj unwinding doesn't use unwinding tables and always works. And I have no idea wtf SEH will do here.

yeah I have been meaning to try this on windows -- I'm actually running windows, so that should be easy enough, except I've never tried to build Rust outside of WSL :laughter_tears:

nikomatsakis (Nov 14 2019 at 14:14, on Zulip):

I think that just having no uwtable is good enough in practice to abort. The shim should only be needed on targets that require uwtable to be present for other reasons

I don't quite follow-- I think you're saying the current branch status itself is fine, since either (a) the c++ code needs uwtable and it aborts when it doesn't find it (as today) or (b) some other environments may not need uwtable, in which case they will get trapped by the unwinding?

nikomatsakis (Nov 14 2019 at 14:15, on Zulip):

anyway i'd like to see the tra in action so I'll try to add the various attributes (and remove "nounwind") later on

Amanieu (Nov 14 2019 at 14:18, on Zulip):

Windows and Android have the requires_uwtable target attribute set, which means that unwinding tables are always generated even with panic=abort. For those platforms we will need to perform extra work to abort on unwind.

nikomatsakis (Dec 13 2019 at 00:10, on Zulip):

Hey @Amanieu I made some more changes here, but i'm still not able to get a C++ exception to actually propagate through Rust code. Here is the LLVM IR. I still can't get a throw 22 to propagate to Rust, I just see:

terminate called after throwing an instance of 'int'

Any idea what's going wrong?

In any case, I'm not 100% sure what we should be doing here. For example, generating uwtable attributes just so that we can abort later seems a bit silly, but I figured I'd like to see it die in a controlled fashion first.

Amanieu (Dec 13 2019 at 00:13, on Zulip):

I think what's happening here is due to 2 phase unwinding

Amanieu (Dec 13 2019 at 00:13, on Zulip):

The 1st phases searches for a suitable catch, while the second phase unwinds until that catch and calls cleanups.

Amanieu (Dec 13 2019 at 00:14, on Zulip):

However if there is no catch anywhere on the call chain then phase 1 fails and terminates.

Amanieu (Dec 13 2019 at 00:14, on Zulip):

Your landingpad is defined as a cleanup, which means that it is only executed in phase 2.

Amanieu (Dec 13 2019 at 00:15, on Zulip):

@nikomatsakis Could you point me to the Rust code for that again?

Amanieu (Dec 13 2019 at 00:15, on Zulip):

Ah nevermind, found it.

Amanieu (Dec 13 2019 at 00:17, on Zulip):

Yea, try defining the landingpad as a catch instead of a cleanup and see if that helps.

Amanieu (Dec 13 2019 at 02:02, on Zulip):

Actually I've just thought of another potential issue: even with panic=abort, we need to make sure longjmp works correctly on Windows. This means that we must allow foreign exceptions to unwind through Rust code even with panic=abort.

Kyle Strand (Dec 13 2019 at 02:04, on Zulip):

Re: that last, @Alex Crichton says that already works; I can point you to the PR that fixed it

Kyle Strand (Dec 13 2019 at 02:06, on Zulip):

Here: https://github.com/rust-lang/rust/pull/48572

Amanieu (Dec 13 2019 at 02:07, on Zulip):

@Kyle Strand scroll up to the top of this thread for the context.

Amanieu (Dec 13 2019 at 02:08, on Zulip):

So I was thinking about how to measure the impact of a proposal where the C ABI permits unwinding but -- if we are -Cpanic=abort -- we abort if unwinding actually occurs (versus UB).

Amanieu (Dec 13 2019 at 02:11, on Zulip):

@Kyle Strand That PR on aborts if a Rust panic tries to unwind past #[unwind(aborts)], but it lets foreign exceptions pass through. However if a foreign exception does unwind into code compiled with panic=aborts then we've still got UB.

Kyle Strand (Dec 13 2019 at 02:13, on Zulip):

Are you saying that the landing pad generated by unwind(aborts) is different from the one generated by -Cpanic=abort?

Kyle Strand (Dec 13 2019 at 02:14, on Zulip):

Or, oh, there's no landing pads. My bad.

Kyle Strand (Dec 13 2019 at 02:14, on Zulip):

So we'd need to generate them but only if the function directly calls an extern function

Amanieu (Dec 13 2019 at 02:19, on Zulip):

Except that we want to allow longjmp over Rust code when there are no destructors on the stack, since the current consensus is that is not UB.

Amanieu (Dec 13 2019 at 02:20, on Zulip):

That should be allowed even with panic=abort

Kyle Strand (Dec 13 2019 at 03:05, on Zulip):

Right; why is that UB?

Kyle Strand (Dec 13 2019 at 03:05, on Zulip):

(currently)

Kyle Strand (Dec 13 2019 at 03:12, on Zulip):

What effect does LLVM nounwind have in MSVC, anyway?

gnzlbg (Dec 13 2019 at 09:21, on Zulip):

nounwind does not mean "does not unwind", but rather "does not unwind in certain ways"

gnzlbg (Dec 13 2019 at 09:21, on Zulip):

nounwind functions can unwind with asynchronous exceptions on windows just fine (see the LLVM LangRef)

gnzlbg (Dec 13 2019 at 09:22, on Zulip):

on Linux, nounwind functions can unwind with forced unwinds just fine as well

gnzlbg (Dec 13 2019 at 09:22, on Zulip):

(e.g. like the ones used by longjmp, or pthread_cancel)

gnzlbg (Dec 13 2019 at 09:24, on Zulip):

Right; why is that UB?

Because there is no RFC guaranteeing that it isn't. I wrote one with Kyren for the UCGs to review.

nikomatsakis (Dec 13 2019 at 13:17, on Zulip):

@Amanieu ok, well, I didn't try to make it a "catch" -- in part because in the C++ code I already have a catch in scope, so I'm surprised it's not being found. i.e., my C++ code is

extern "C"
int triple_input(int input) {
    throw 22;
}

int main() {
    try {
        int input = 10;
        int output = double_input(input);
        cout<<input<<" * 2 = "<<output<<"\n";
    } catch (int i) {
        cout << "caught " << i << "\n";
    }
  return 0;
}

where double_input is defined in Rust and invokes triple_input.

nikomatsakis (Dec 13 2019 at 13:17, on Zulip):

I'm also not sure how to do that :P and it sounds like more work

nikomatsakis (Dec 13 2019 at 13:18, on Zulip):

as far as Windows and longjmp, I agree, we'd have to figure out this interaction, I'm mostly just trying to get some kind of rough-n-dirty upper bound on the overall size impact of pursuing a plan where "C" unwinds by default, but -Cpanic=abort doesn't just create UB if you happen to have unwinding occur (but rather aborts in a structured way)

nikomatsakis (Dec 13 2019 at 13:18, on Zulip):

presumably it is possible to distinguish a longjmp unwind from other unwinds

nikomatsakis (Dec 13 2019 at 13:20, on Zulip):

anyway I don't quite get why it's not working, there must be some other flag somewhere that I'm overlooking I guess

Amanieu (Dec 13 2019 at 13:20, on Zulip):

triple_input is nounwind, maybe that's the issue?

nikomatsakis (Dec 13 2019 at 13:21, on Zulip):

ah, maybe it's the nounwind attributes

nikomatsakis (Dec 13 2019 at 13:23, on Zulip):

er, I see we reached similar conclusion together :) yes, I'll disable those

nikomatsakis (Dec 13 2019 at 13:24, on Zulip):

(for some weird reason your msg didn't show up until I refreshed...)

nikomatsakis (Dec 13 2019 at 13:43, on Zulip):

Hmm, that doesn't seem to have made any difference, but I'm also wondering if I was wrong about that being the LLVM-IR. When I run cargo rustc -- --emit llvm-ir I actually get an error because the rustc command itself includes --emit flags.

nikomatsakis (Dec 13 2019 at 13:45, on Zulip):

well, pretty sure that this is the LLVM IR, but I don't see much

nikomatsakis (Dec 13 2019 at 13:45, on Zulip):

/me shrugs

nikomatsakis (Dec 13 2019 at 13:47, on Zulip):

OK, I'm about ready to give up :)

nikomatsakis (Dec 13 2019 at 13:49, on Zulip):

I'd say @Kyle Strand that we ought to try to write-up a summary and maybe ask for help from somebody who understands this stuff better to suggest (a) how to measure the impact and (b) what the right setup would even look like to enable us to abort when unwinding occurs

gnzlbg (Dec 13 2019 at 13:53, on Zulip):

@nikomatsakis how is double_input defined ? it just calls triple_input ?

gnzlbg (Dec 13 2019 at 13:54, on Zulip):

there is a lot of debug info and formatting going on in that LLVM-IR (using -C debuginfo=0 should remove some of it)

gnzlbg (Dec 13 2019 at 13:55, on Zulip):

I suppose double_input prints something to the screen, and then calls triple_input ?

gnzlbg (Dec 13 2019 at 13:56, on Zulip):

the llvm-ir for that part looks ok to me, since triple_input can throw, it is called with invoke, and if that unwinds it goes to cleanup, which does some stuff i don't understand but then goes to bb5 which traps

gnzlbg (Dec 13 2019 at 13:57, on Zulip):

ah, I see the problem, that shouldn't trap but continue unwinding

gnzlbg (Dec 13 2019 at 13:57, on Zulip):

or should it? are you compiling the Rust with -C panic=abort ? if so, that would be what you want

gnzlbg (Dec 13 2019 at 13:57, on Zulip):

(to let triple_input unwind without UB, and then abort)

nikomatsakis (Dec 16 2019 at 14:42, on Zulip):

@gnzlbg exactly, the intention was to see what it would take to make -Cpanic=abort abort if unwinding occurred, rather than creating UB. So far, I've been able to get it to abort, but not via the mechanism I expected =) i.e., it aborts because it can't find any in-scope catch handler, versus unwinding and then aborting when it passes into Rust code.

nikomatsakis (Dec 16 2019 at 14:42, on Zulip):

I don't know exactly how C++ specific that is

nikomatsakis (Dec 16 2019 at 14:43, on Zulip):

(e.g., maybe there are libunwind methods I could trigger that would just unwind more indiscriminantely?)

gnzlbg (Dec 16 2019 at 14:43, on Zulip):

I'm not sure I understand

nikomatsakis (Dec 16 2019 at 14:43, on Zulip):

but I also don't know why it can't find the catch block in scope, since afaik we've setup the function's metadata correctly

nikomatsakis (Dec 16 2019 at 14:43, on Zulip):

what don't you understand?

gnzlbg (Dec 16 2019 at 14:43, on Zulip):

From looking at the LLVM IR, it appears that if it unwinds, it would run the cleanup, and the cleanup aborts

nikomatsakis (Dec 16 2019 at 14:43, on Zulip):

that is the goal, yes

nikomatsakis (Dec 16 2019 at 14:44, on Zulip):

except that it's not what happens

gnzlbg (Dec 16 2019 at 14:44, on Zulip):

ah ok :D, no idea why :/ from looking at the IR, it should work

gnzlbg (Dec 16 2019 at 14:44, on Zulip):

maybe the IR is missing something :/

gnzlbg (Dec 16 2019 at 14:47, on Zulip):

you might want to ask for help in the wg-llvm, somebody there might "just see" what's missing

nikomatsakis (Dec 16 2019 at 14:48, on Zulip):

Yeah, maybe. The specific problem is that C++ terminates because it can't find any catch block for int

gnzlbg (Dec 16 2019 at 14:50, on Zulip):

Do you have the LLVM-IR produced by C++ ?

gnzlbg (Dec 16 2019 at 14:50, on Zulip):

or is there a way to combine them into a single .ll file that reproduces the problem? That would help

nikomatsakis (Dec 16 2019 at 14:50, on Zulip):

I don't, but as we were typing I was thinking maybe I should look at that

gnzlbg (Dec 16 2019 at 14:51, on Zulip):

that might help, I recommend avoiding "print" functionality, and setting debug-info to "off" to get the simplest IR possible

Last update: May 27 2020 at 20:55UTC