Stream: project-ffi-unwind

Topic: implementing "abort on FFI call"


view this post on Zulip nikomatsakis (Nov 13 2019 at 15:21):

So I was thinking about how to measure the impact of a proposal where the C ABI permits unwinding but -- if we are -Cpanic=abort -- we abort if unwinding actually occurs (versus UB).

I think the way to do this would be to modify the no_landing_pads transformation. Right now, it just removes all landing pads. I think what we would actually want is to check if this is a function call and -- in the case where it is -- inspect the ABI and callee. If the ABI is not Rust (or Rust call), and the callee is not known, then we would instead rewrite the unwind to branch to a block with TerminatorKind::Abort.

We might or might not have such a block today, so we can (lazilly, and at most once) create such a block.

view this post on Zulip nikomatsakis (Nov 13 2019 at 15:21):

There might be some other assertions that need to be adjusted.

view this post on Zulip nikomatsakis (Nov 13 2019 at 15:43):

I put in some work towards implementing this in the abort-on-ffi-call branch -- not building yet or quite complete, but almost there (I think..?)

view this post on Zulip nikomatsakis (Nov 13 2019 at 16:19):

OK I think this is building now, but I really have to stop and do other stuff

view this post on Zulip nikomatsakis (Nov 13 2019 at 16:45):

(OK, it builds, but it doesn't work; I don't quite know what's going on, something to do with the remove_noop_landing_pads pass...)

view this post on Zulip nikomatsakis (Nov 13 2019 at 16:46):

oh, it's cause the landing pad has nothing to do I guess

view this post on Zulip nikomatsakis (Nov 14 2019 at 01:47):

ok, I think that is working now -- the MIR looks right, but it's sort of hard to test. I tried hacking up some C++ function that throws even though it's not supposed to, but C++ keeps calling std::terminate since it can't find a catch handler (true even when I put the Rust function sandwiched between two C++ functions, and one of them has a catch)

view this post on Zulip BatmanAoD (Kyle Strand) (Nov 14 2019 at 03:39):

Wow, the Rust sandwich breaks catch? I did not expect that at all.

view this post on Zulip nikomatsakis (Nov 14 2019 at 10:34):

I can only presume it has to do with interrupting the ability to walk up the stack and find the catch handler somehow? This is true on linux, at least.

view this post on Zulip nikomatsakis (Nov 14 2019 at 10:35):

Well hey, it does abort... :)

view this post on Zulip Amanieu (Nov 14 2019 at 13:48):

@nikomatsakis What does the LLVM IR look like?

view this post on Zulip gnzlbg (Nov 14 2019 at 13:49):

nikomatsakis 2:47 AM
ok, I think that is working now -- the MIR looks right, but it's sort of hard to test. I tried hacking up some C++ function that throws even though it's not supposed to, but C++ keeps calling std::terminate since it can't find a catch handler (true even when I put the Rust function sandwiched between two C++ functions, and one of them has a catch)

noexcept C++ functions have an "abort-on-panic" shim that calls std::terminate

view this post on Zulip gnzlbg (Nov 14 2019 at 13:50):

@nikomatsakis one needs to create a normal C++ function that throws, and then just use extern "C" { ... }

view this post on Zulip nikomatsakis (Nov 14 2019 at 13:51):

@gnzlbg I did not declare anything noexcept

view this post on Zulip nikomatsakis (Nov 14 2019 at 13:51):

I did create a normal C++ function

view this post on Zulip nikomatsakis (Nov 14 2019 at 13:51):

I'll push the project

view this post on Zulip nikomatsakis (Nov 14 2019 at 13:51):

@Amanieu I'm not sure :) I have to look how to save the LLVM IR, I forget

view this post on Zulip gnzlbg (Nov 14 2019 at 13:51):

I'll take a look

view this post on Zulip nikomatsakis (Nov 14 2019 at 13:56):

@gnzlbg I pushed my example to this branch -- it was the rust-to-cpp and cpp-to-rust (sandwich)

view this post on Zulip nikomatsakis (Nov 14 2019 at 13:57):

in both cases, I get the same error

view this post on Zulip nikomatsakis (Nov 14 2019 at 13:57):

looking at the c++ referenece manual, it suggested that if there is no catch block then C++ will terminate (which fits what I'm seeing)

view this post on Zulip nikomatsakis (Nov 14 2019 at 13:57):

it works without Rust in between

view this post on Zulip nikomatsakis (Nov 14 2019 at 13:57):

(that is, the throw gets caught)

view this post on Zulip nikomatsakis (Nov 14 2019 at 13:59):

@Amanieu I think the llvm output is in this gist

view this post on Zulip nikomatsakis (Nov 14 2019 at 13:59):

I guess it has nounwind and other stuff

view this post on Zulip nikomatsakis (Nov 14 2019 at 13:59):

oh wait

view this post on Zulip nikomatsakis (Nov 14 2019 at 13:59):

that wasn't built w/ my branch

view this post on Zulip nikomatsakis (Nov 14 2019 at 14:00):

llvm output from my branch

view this post on Zulip nikomatsakis (Nov 14 2019 at 14:02):

(this is a debug build)

view this post on Zulip Amanieu (Nov 14 2019 at 14:04):

@nikomatsakis Your function needs the uwtable attribute, otherwise dwarf tables aren't generated and the unwinder doesn't know how to unwind past your function to find the actual catch block in the caller.

view this post on Zulip nikomatsakis (Nov 14 2019 at 14:05):

@Amanieu ok -- would it make sense to have that only for the "leaf functions" that actually invoke FFI?

view this post on Zulip nikomatsakis (Nov 14 2019 at 14:05):

(after all, I'm going to catch any unwinding and trap)

view this post on Zulip Amanieu (Nov 14 2019 at 14:06):

Also in optimized builds the landing pad is probably going to be optimized away by LLVM as unreachable since triple_input is nounwind.

view this post on Zulip nikomatsakis (Nov 14 2019 at 14:07):

oh yeah, I need to supress that I guess

view this post on Zulip Amanieu (Nov 14 2019 at 14:07):

I think that just having no uwtable is good enough in practice to abort. The shim should only be needed on targets that require uwtable to be present for other reasons

view this post on Zulip Amanieu (Nov 14 2019 at 14:08):

Hmm, actually this only works for targets that use dwarf unwinding. SjLj unwinding doesn't use unwinding tables and always works. And I have no idea wtf SEH will do here.

view this post on Zulip Amanieu (Nov 14 2019 at 14:10):

The good news is, the overhead for abort on extern "C" unwinding is going to be 0 in most cases since it already aborts :)

view this post on Zulip gnzlbg (Nov 14 2019 at 14:13):

looking at the c++ referenece manual, it suggested that if there is no catch block then C++ will terminate (which fits what I'm seeing)

This makes sense. When unwinding hits a thread boundary, C++ calls std::terminate.

view this post on Zulip nikomatsakis (Nov 14 2019 at 14:13):

Hmm, actually this only works for targets that use dwarf unwinding. SjLj unwinding doesn't use unwinding tables and always works. And I have no idea wtf SEH will do here.

yeah I have been meaning to try this on windows -- I'm actually running windows, so that should be easy enough, except I've never tried to build Rust outside of WSL :laughter_tears:

view this post on Zulip nikomatsakis (Nov 14 2019 at 14:14):

I think that just having no uwtable is good enough in practice to abort. The shim should only be needed on targets that require uwtable to be present for other reasons

I don't quite follow-- I think you're saying the current branch status itself is fine, since either (a) the c++ code needs uwtable and it aborts when it doesn't find it (as today) or (b) some other environments may not need uwtable, in which case they will get trapped by the unwinding?

view this post on Zulip nikomatsakis (Nov 14 2019 at 14:15):

anyway i'd like to see the tra in action so I'll try to add the various attributes (and remove "nounwind") later on

view this post on Zulip Amanieu (Nov 14 2019 at 14:18):

Windows and Android have the requires_uwtable target attribute set, which means that unwinding tables are always generated even with panic=abort. For those platforms we will need to perform extra work to abort on unwind.

view this post on Zulip nikomatsakis (Dec 13 2019 at 00:10):

Hey @Amanieu I made some more changes here, but i'm still not able to get a C++ exception to actually propagate through Rust code. Here is the LLVM IR. I still can't get a throw 22 to propagate to Rust, I just see:

terminate called after throwing an instance of 'int'

Any idea what's going wrong?

In any case, I'm not 100% sure what we should be doing here. For example, generating uwtable attributes just so that we can abort later seems a bit silly, but I figured I'd like to see it die in a controlled fashion first.

view this post on Zulip Amanieu (Dec 13 2019 at 00:13):

I think what's happening here is due to 2 phase unwinding

view this post on Zulip Amanieu (Dec 13 2019 at 00:13):

The 1st phases searches for a suitable catch, while the second phase unwinds until that catch and calls cleanups.

view this post on Zulip Amanieu (Dec 13 2019 at 00:14):

However if there is no catch anywhere on the call chain then phase 1 fails and terminates.

view this post on Zulip Amanieu (Dec 13 2019 at 00:14):

Your landingpad is defined as a cleanup, which means that it is only executed in phase 2.

view this post on Zulip Amanieu (Dec 13 2019 at 00:15):

@nikomatsakis Could you point me to the Rust code for that again?

view this post on Zulip Amanieu (Dec 13 2019 at 00:15):

Ah nevermind, found it.

view this post on Zulip Amanieu (Dec 13 2019 at 00:17):

Yea, try defining the landingpad as a catch instead of a cleanup and see if that helps.

view this post on Zulip Amanieu (Dec 13 2019 at 02:02):

Actually I've just thought of another potential issue: even with panic=abort, we need to make sure longjmp works correctly on Windows. This means that we must allow foreign exceptions to unwind through Rust code even with panic=abort.

view this post on Zulip BatmanAoD (Kyle Strand) (Dec 13 2019 at 02:04):

Re: that last, @Alex Crichton says that already works; I can point you to the PR that fixed it

view this post on Zulip BatmanAoD (Kyle Strand) (Dec 13 2019 at 02:06):

Here: https://github.com/rust-lang/rust/pull/48572

view this post on Zulip Amanieu (Dec 13 2019 at 02:07):

@Kyle Strand scroll up to the top of this thread for the context.

view this post on Zulip Amanieu (Dec 13 2019 at 02:08):

So I was thinking about how to measure the impact of a proposal where the C ABI permits unwinding but -- if we are -Cpanic=abort -- we abort if unwinding actually occurs (versus UB).

view this post on Zulip Amanieu (Dec 13 2019 at 02:11):

@Kyle Strand That PR on aborts if a Rust panic tries to unwind past #[unwind(aborts)], but it lets foreign exceptions pass through. However if a foreign exception does unwind into code compiled with panic=aborts then we've still got UB.

view this post on Zulip BatmanAoD (Kyle Strand) (Dec 13 2019 at 02:13):

Are you saying that the landing pad generated by unwind(aborts) is different from the one generated by -Cpanic=abort?

view this post on Zulip BatmanAoD (Kyle Strand) (Dec 13 2019 at 02:14):

Or, oh, there's no landing pads. My bad.

view this post on Zulip BatmanAoD (Kyle Strand) (Dec 13 2019 at 02:14):

So we'd need to generate them but only if the function directly calls an extern function

view this post on Zulip Amanieu (Dec 13 2019 at 02:19):

Except that we want to allow longjmp over Rust code when there are no destructors on the stack, since the current consensus is that is not UB.

view this post on Zulip Amanieu (Dec 13 2019 at 02:20):

That should be allowed even with panic=abort

view this post on Zulip BatmanAoD (Kyle Strand) (Dec 13 2019 at 03:05):

Right; why is that UB?

view this post on Zulip BatmanAoD (Kyle Strand) (Dec 13 2019 at 03:05):

(currently)

view this post on Zulip BatmanAoD (Kyle Strand) (Dec 13 2019 at 03:12):

What effect does LLVM nounwind have in MSVC, anyway?

view this post on Zulip gnzlbg (Dec 13 2019 at 09:21):

nounwind does not mean "does not unwind", but rather "does not unwind in certain ways"

view this post on Zulip gnzlbg (Dec 13 2019 at 09:21):

nounwind functions can unwind with asynchronous exceptions on windows just fine (see the LLVM LangRef)

view this post on Zulip gnzlbg (Dec 13 2019 at 09:22):

on Linux, nounwind functions can unwind with forced unwinds just fine as well

view this post on Zulip gnzlbg (Dec 13 2019 at 09:22):

(e.g. like the ones used by longjmp, or pthread_cancel)

view this post on Zulip gnzlbg (Dec 13 2019 at 09:24):

Right; why is that UB?

Because there is no RFC guaranteeing that it isn't. I wrote one with Kyren for the UCGs to review.

view this post on Zulip nikomatsakis (Dec 13 2019 at 13:17):

@Amanieu ok, well, I didn't try to make it a "catch" -- in part because in the C++ code I already have a catch in scope, so I'm surprised it's not being found. i.e., my C++ code is

extern "C"
int triple_input(int input) {
    throw 22;
}

int main() {
    try {
        int input = 10;
        int output = double_input(input);
        cout<<input<<" * 2 = "<<output<<"\n";
    } catch (int i) {
        cout << "caught " << i << "\n";
    }
  return 0;
}

where double_input is defined in Rust and invokes triple_input.

view this post on Zulip nikomatsakis (Dec 13 2019 at 13:17):

I'm also not sure how to do that :P and it sounds like more work

view this post on Zulip nikomatsakis (Dec 13 2019 at 13:18):

as far as Windows and longjmp, I agree, we'd have to figure out this interaction, I'm mostly just trying to get some kind of rough-n-dirty upper bound on the overall size impact of pursuing a plan where "C" unwinds by default, but -Cpanic=abort doesn't just create UB if you happen to have unwinding occur (but rather aborts in a structured way)

view this post on Zulip nikomatsakis (Dec 13 2019 at 13:18):

presumably it is possible to distinguish a longjmp unwind from other unwinds

view this post on Zulip nikomatsakis (Dec 13 2019 at 13:20):

anyway I don't quite get why it's not working, there must be some other flag somewhere that I'm overlooking I guess

view this post on Zulip Amanieu (Dec 13 2019 at 13:20):

triple_input is nounwind, maybe that's the issue?

view this post on Zulip nikomatsakis (Dec 13 2019 at 13:21):

ah, maybe it's the nounwind attributes

view this post on Zulip nikomatsakis (Dec 13 2019 at 13:23):

er, I see we reached similar conclusion together :) yes, I'll disable those

view this post on Zulip nikomatsakis (Dec 13 2019 at 13:24):

(for some weird reason your msg didn't show up until I refreshed...)

view this post on Zulip nikomatsakis (Dec 13 2019 at 13:43):

Hmm, that doesn't seem to have made any difference, but I'm also wondering if I was wrong about that being the LLVM-IR. When I run cargo rustc -- --emit llvm-ir I actually get an error because the rustc command itself includes --emit flags.

view this post on Zulip nikomatsakis (Dec 13 2019 at 13:45):

well, pretty sure that this is the LLVM IR, but I don't see much

view this post on Zulip nikomatsakis (Dec 13 2019 at 13:45):

/me shrugs

view this post on Zulip nikomatsakis (Dec 13 2019 at 13:47):

OK, I'm about ready to give up :)

view this post on Zulip nikomatsakis (Dec 13 2019 at 13:49):

I'd say @Kyle Strand that we ought to try to write-up a summary and maybe ask for help from somebody who understands this stuff better to suggest (a) how to measure the impact and (b) what the right setup would even look like to enable us to abort when unwinding occurs

view this post on Zulip gnzlbg (Dec 13 2019 at 13:53):

@nikomatsakis how is double_input defined ? it just calls triple_input ?

view this post on Zulip gnzlbg (Dec 13 2019 at 13:54):

there is a lot of debug info and formatting going on in that LLVM-IR (using -C debuginfo=0 should remove some of it)

view this post on Zulip gnzlbg (Dec 13 2019 at 13:55):

I suppose double_input prints something to the screen, and then calls triple_input ?

view this post on Zulip gnzlbg (Dec 13 2019 at 13:56):

the llvm-ir for that part looks ok to me, since triple_input can throw, it is called with invoke, and if that unwinds it goes to cleanup, which does some stuff i don't understand but then goes to bb5 which traps

view this post on Zulip gnzlbg (Dec 13 2019 at 13:57):

ah, I see the problem, that shouldn't trap but continue unwinding

view this post on Zulip gnzlbg (Dec 13 2019 at 13:57):

or should it? are you compiling the Rust with -C panic=abort ? if so, that would be what you want

view this post on Zulip gnzlbg (Dec 13 2019 at 13:57):

(to let triple_input unwind without UB, and then abort)

view this post on Zulip nikomatsakis (Dec 16 2019 at 14:42):

@gnzlbg exactly, the intention was to see what it would take to make -Cpanic=abort abort if unwinding occurred, rather than creating UB. So far, I've been able to get it to abort, but not via the mechanism I expected =) i.e., it aborts because it can't find any in-scope catch handler, versus unwinding and then aborting when it passes into Rust code.

view this post on Zulip nikomatsakis (Dec 16 2019 at 14:42):

I don't know exactly how C++ specific that is

view this post on Zulip nikomatsakis (Dec 16 2019 at 14:43):

(e.g., maybe there are libunwind methods I could trigger that would just unwind more indiscriminantely?)

view this post on Zulip gnzlbg (Dec 16 2019 at 14:43):

I'm not sure I understand

view this post on Zulip nikomatsakis (Dec 16 2019 at 14:43):

but I also don't know why it can't find the catch block in scope, since afaik we've setup the function's metadata correctly

view this post on Zulip nikomatsakis (Dec 16 2019 at 14:43):

what don't you understand?

view this post on Zulip gnzlbg (Dec 16 2019 at 14:43):

From looking at the LLVM IR, it appears that if it unwinds, it would run the cleanup, and the cleanup aborts

view this post on Zulip nikomatsakis (Dec 16 2019 at 14:43):

that is the goal, yes

view this post on Zulip nikomatsakis (Dec 16 2019 at 14:44):

except that it's not what happens

view this post on Zulip gnzlbg (Dec 16 2019 at 14:44):

ah ok :D, no idea why :/ from looking at the IR, it should work

view this post on Zulip gnzlbg (Dec 16 2019 at 14:44):

maybe the IR is missing something :/

view this post on Zulip gnzlbg (Dec 16 2019 at 14:47):

you might want to ask for help in the wg-llvm, somebody there might "just see" what's missing

view this post on Zulip nikomatsakis (Dec 16 2019 at 14:48):

Yeah, maybe. The specific problem is that C++ terminates because it can't find any catch block for int

view this post on Zulip gnzlbg (Dec 16 2019 at 14:50):

Do you have the LLVM-IR produced by C++ ?

view this post on Zulip gnzlbg (Dec 16 2019 at 14:50):

or is there a way to combine them into a single .ll file that reproduces the problem? That would help

view this post on Zulip nikomatsakis (Dec 16 2019 at 14:50):

I don't, but as we were typing I was thinking maybe I should look at that

view this post on Zulip gnzlbg (Dec 16 2019 at 14:51):

that might help, I recommend avoiding "print" functionality, and setting debug-info to "off" to get the simplest IR possible


Last updated: Jan 26 2022 at 07:02 UTC