Stream: project-ffi-unwind

Topic: Catching and resuming foreign exceptions


BatmanAoD (Kyle Strand) (Feb 27 2020 at 17:31, on Zulip):

RalfJ said:

bjorn3 said:

Indeed, but there will be in the future.

this seems pretty relevant for the unwinding discussion then -- adding kinds of uwninding that catch_unwind does not catch could make take_mut, rayon and other libraries unsound

Amanieu said:

But yes, catch_unwind can only catch Rust panics because it has to return a Box<Any>. This is not possible for foreign exceptions.

I guess we could return a Box<ForeignException>, but then if you try to rethrow that with resume_unwind you get a Rust panic, not the original foreign exception.

It sounds like we have existing libraries using catch_unwind to do clean-up work that's required for soundness.

BatmanAoD (Kyle Strand) (Feb 27 2020 at 17:34, on Zulip):

I would guess (but need to verify somehow) that such clean-up doesn't actually need to interact with the exception objects themselves. In that case, maybe the exception object exposed can (for now) just be some kind of opaque object that doesn't provide any information beyond "this is a foreign exception", and in those cases resume_unwind could somehow just resume unwinding with the original exception. Would that be possible?

BatmanAoD (Kyle Strand) (Feb 27 2020 at 17:36, on Zulip):

It seems like the fact that resume_unwind creates a panic is an implementation detail, not a requirement of the API... is that correct?

Amanieu (Feb 27 2020 at 17:50, on Zulip):

It is not possible to "capture" a foreign exception.

Amanieu (Feb 27 2020 at 17:51, on Zulip):

Let me double-check this

Amanieu (Feb 27 2020 at 17:57, on Zulip):

I think it might be possible to capture an exception with libunwind.

Amanieu (Feb 27 2020 at 17:57, on Zulip):

Let me check SEH

Amanieu (Feb 27 2020 at 17:59, on Zulip):

It's probably possible, but it would be very difficult (SEH is poorly documented)

BatmanAoD (Kyle Strand) (Feb 27 2020 at 18:01, on Zulip):

Well, if the type is opaque, it just needs to behave like C++'s catch (...) { /* clean-up code */ throw; }, right?

BatmanAoD (Kyle Strand) (Feb 27 2020 at 18:02, on Zulip):

The complication, of course, being that the clean-up takes place between the call to catch_unwind and the call to resume_unwind, rather than in a block :confused:

Amanieu (Feb 27 2020 at 18:02, on Zulip):

Ah but that doesn't work for us. We need to capture the exception into Box in such a way that it can be sent to another thread and rethrown there.

Amanieu (Feb 27 2020 at 18:02, on Zulip):

It's closer to C++'s exception_ptr

BatmanAoD (Kyle Strand) (Feb 27 2020 at 18:02, on Zulip):

Why sent to another thread? Can't we just require that (at least for now) resume_unwind be called in the same function in which catch_unwind is called?

Amanieu (Feb 27 2020 at 18:03, on Zulip):

What do you think rayon is doing with its caught panics? :P

BatmanAoD (Kyle Strand) (Feb 27 2020 at 18:03, on Zulip):

And if it's an opaque object denoting "the current foreign exception", does it even need a handle to the actual exception object?

BatmanAoD (Kyle Strand) (Feb 27 2020 at 18:03, on Zulip):

Uh.... I thought it was just "clean up"

BatmanAoD (Kyle Strand) (Feb 27 2020 at 18:03, on Zulip):

I did not look!

Amanieu (Feb 27 2020 at 18:03, on Zulip):

It propagate the panic up to the caller of a join.

BatmanAoD (Kyle Strand) (Feb 27 2020 at 18:03, on Zulip):

That does complicate things. I guess I need to look into what specific guarantees are required for Rayon to be sound when foreign exceptions occur

Amanieu (Feb 27 2020 at 18:04, on Zulip):

Well rayon would just crash since there's nothing to catch the foreign exception at the root of the thread.

Amanieu (Feb 27 2020 at 18:06, on Zulip):

We can definitely return an opaque Box<ForeignException>, that should be relatively easy to do. It might cost us a bit of code size on msvc target since we would be effectively lowering to

try {
    fn();
} catch (rust_panic&) {
    // Capture Rust panic
} catch (...) {
    // Capture foreign exception
}
BatmanAoD (Kyle Strand) (Feb 27 2020 at 18:10, on Zulip):

I suspect we'll need to look into modifying the catch/resume API at some point. It seems like we probably want users to be able to catch _just_ Rust panics, if there's no soundness-cleanup required.

Amanieu (Feb 27 2020 at 18:14, on Zulip):

If it's soundness that we are worried about then catching foreign exceptions and returning Box<ForeignException> should fix the soundness issue. Rethrowing the ForeignException as a Rust panic instead of a C++ exception shouldn't cause correctness issues.

Amanieu (Feb 27 2020 at 18:15, on Zulip):

Of course, this catch would need to ignore forced exceptions. Or it could attempt to catch them and abort. Either way is fine since catch_unwind counts as a destructor for the purposes of FFI unwind rules so you're already in UB land.

BatmanAoD (Kyle Strand) (Feb 27 2020 at 18:27, on Zulip):

It seems like there ought to be a way to catch a forced exception, do some clean-up, and then resume from the same thread, though, right?

Amanieu (Feb 27 2020 at 18:29, on Zulip):

.. but why would you ever want to catch a longjmp or a pthread_exit

Amanieu (Feb 27 2020 at 18:29, on Zulip):

That's just a recipe for disaster in both cases

Amanieu (Feb 27 2020 at 18:29, on Zulip):

We're better off just letting them unwind through.

bjorn3 (Feb 27 2020 at 18:31, on Zulip):

longjmp across rust code is UB because it may not unwind in some cases.

BatmanAoD (Kyle Strand) (Feb 27 2020 at 18:49, on Zulip):

I don't know... I realize asking "why does C++ allow it" is a recipe for banging one's head against one's desk, but I do wonder about whether catch (...) has any actual value or if it was entirely a mistake.

RalfJ (Feb 28 2020 at 09:41, on Zulip):

maybe another option would be to make catch_panic abort on non-panic unwind, and then if/when we have clear usecases where that is a problem, design another API for them?

Amanieu (Feb 28 2020 at 09:42, on Zulip):

Either way it requires catching the foreign unwind, which may be non-trivial.

Amanieu (Feb 28 2020 at 09:43, on Zulip):

And tbh, that code really should have been using scope guard rather then catch_unwind, which is much slower.

Amanieu (Feb 28 2020 at 09:43, on Zulip):

*for take_mut

Amanieu (Feb 28 2020 at 09:44, on Zulip):

In the case of rayon catch_unwind is the correct thing to do, but foreign exception currently will just go straight to the root of the thread and abort the whole program, so it's not really unsound.

bjorn3 (Feb 28 2020 at 10:01, on Zulip):

A scope guard only works for rust panics, as std::thread::panicking only returns true when a rust panic has started. It doesn't check if it is currently unwinding.

Amanieu (Feb 28 2020 at 10:10, on Zulip):

Scope guards work for all types of unwinding.

Amanieu (Feb 28 2020 at 10:22, on Zulip):

Oh wait I see what you mean.

Amanieu (Feb 28 2020 at 10:22, on Zulip):

Yea scopeguard only works in the default mode.

nagisa (Feb 28 2020 at 16:21, on Zulip):

RalfJ said:

it is legitimate (or at least, not unreasonable) for code to assume that every (Rust) function either returns or panics

This is already wrong – it can also abort/exit though I guess those cases are not too interesting. On Windows abort is implemented as a special kind of unwind though.

nagisa (Feb 28 2020 at 16:23, on Zulip):

I’m not sure why you’re saying scope guards work for all types of unwinding. I don’t think they do for arbitrary choices of unwinding implementation and personality function.

nagisa (Feb 28 2020 at 16:24, on Zulip):

A perfect example here is, again, windows abort unwind. It uses the same unwinding mechanism it just carries special payload and thus just happens to ignore the typical handlers that may exist on stack of a typical program.

Amanieu (Feb 28 2020 at 17:25, on Zulip):

On windows we only run destructors for C++ exceptions (-gnu and -msvc) and longjmp (-msvc only). Not for arbitrary SEH exceptions.

Amanieu (Feb 28 2020 at 17:27, on Zulip):

@nagisa https://docs.rs/scopeguard/1.1.0/scopeguard/fn.guard_on_unwind.html

Amanieu (Feb 28 2020 at 17:28, on Zulip):

The is the relevant function. The intent is to only run some cleanup code when unwinding. However this won't work correctly if a foreign exception comes, since it checks std::thread::panicking which will return false for foreign exceptions.

RalfJ (Feb 28 2020 at 18:02, on Zulip):

nagisa said:

This is already wrong – it can also abort/exit though I guess those cases are not too interesting. On Windows abort is implemented as a special kind of unwind though.

in that case nothing matters as no code (of this program) runs any more. but, sure, if you want to be explicit about it:
every Rust function either returns or panics or halts execution of the abstract machine / the entire program.

RalfJ (Feb 28 2020 at 18:03, on Zulip):

nagisa said:

A perfect example here is, again, windows abort unwind. It uses the same unwinding mechanism it just carries special payload and thus just happens to ignore the typical handlers that may exist on stack of a typical program.

I think different people are using "unwinding" in different ways here... what you describe here sounds to me like an unobservable implementation detail of how abort is implemented on windows

RalfJ (Feb 28 2020 at 18:05, on Zulip):

and sure, one can do all sorts of nastiness by messing with SEHs on Windows, or messing with /proc/self/mem on Linux. all of that is "platform-UB", as in the specific implementation of the abstract machine for that platform just assumes you will not do that.
the parts we care about are cases of unwinding that we do support and specify. so far that's just panicking so "scope guards work for all [supported] types of unwinding" is trivially true. once more types of unwinding are possible, well, things become interesting and "scope guards work for all [supported] types of unwinding" is likely something we want to guarantee.

nagisa (Feb 29 2020 at 01:23, on Zulip):

RalfJ said:

I think different people are using "unwinding" in different ways here... what you describe here sounds to me like an unobservable implementation detail of how abort is implemented on windows

It is not unobservable, you can do it if you _really_ want to. As thus we need to be extremely careful about what terminology we use. And exactly the point I was trying to make: saying that "arbitrary" unwind can be handled… or whatever, is not tenable unless we try _really_ hard to actually enumerate all possible ways of unwinding, including the cases where they are mostly stable implementation details.

nagisa (Feb 29 2020 at 01:29, on Zulip):

Saying that C++ exceptions, longjmp and Rust panics are supported and everything else is still UB is way more correct than "all unwinding"

nagisa (Feb 29 2020 at 01:30, on Zulip):

I rest my case.

BatmanAoD (Kyle Strand) (Feb 29 2020 at 01:47, on Zulip):

I understand you're making a point, but "I rest my case" is somewhat combative.

nagisa (Feb 29 2020 at 01:54, on Zulip):

First dictionary in the search engine says:

I have completed the presentation of my argument.

I don’t see how that’s combative, and it wasn't meant to come across as such.

RalfJ (Feb 29 2020 at 08:47, on Zulip):

I think it is unobservable without UB -- how would I observe it?

I think you basically confirmed what I said -- we have to carefully distinguish "unwinnding in Rust, part of the spec, observable in the Abstract Machine" from "unwindining on the underlying implementation".

Amanieu (Mar 20 2020 at 19:35, on Zulip):

Update: actually it turns out that we can't safely capture a foreign exception and then resume it on a different thread.

Amanieu (Mar 20 2020 at 19:36, on Zulip):

C++ exceptions have pointers into thread-local state, which means they can't be Send. Therefore we can't capture it in a Box<Any + Send>.

BatmanAoD (Kyle Strand) (Mar 20 2020 at 19:37, on Zulip):

That makes sense to me - does the standard library provide a copy operator?

Amanieu (Mar 20 2020 at 19:38, on Zulip):

The thing is, we're only dealing with exception objects at the libunwind level. We don't see the C++-specific state.

Amanieu (Mar 20 2020 at 19:43, on Zulip):

The "proper" way of sending a C++ exception to another thread is to catch it with __cxa_begin_catch, which removes it from the thread's linked list of active exceptions. The exception itself is reference-counted and can be rethrown in another thread.

Amanieu (Mar 20 2020 at 19:43, on Zulip):

But we can't do that since we would need to depend on the C++ runtime for unwinding.

BatmanAoD (Kyle Strand) (Mar 20 2020 at 19:45, on Zulip):

Wouldn't we only need to do that in cases where the C++ runtime is already linked in the process, though?

Amanieu (Mar 20 2020 at 19:51, on Zulip):

Hmm. I guess we could do some trickery with weak symbols.

Amanieu (Mar 20 2020 at 19:56, on Zulip):

Won't work on OSX though, I'm pretty sure.

Amanieu (Mar 20 2020 at 20:08, on Zulip):

Then again, on macOS libc++ is always available as part of the system, so it doesn't matter too much if we link to it.

Amanieu (Mar 20 2020 at 20:10, on Zulip):

Definitely won't fly on mingw though

BatmanAoD (Kyle Strand) (Mar 20 2020 at 20:31, on Zulip):

If an exception is thrown on another thread, the expectation that it would have an upstream handler somewhere?

Amanieu (Mar 20 2020 at 21:01, on Zulip):

Consider something like this: a C++ program uses a Rust lib that uses rayon.

Amanieu (Mar 20 2020 at 21:02, on Zulip):
BatmanAoD (Kyle Strand) (Mar 20 2020 at 21:08, on Zulip):

Ah, okay, so it definitely needs to be the original exception

nikomatsakis (Mar 23 2020 at 20:17, on Zulip):

It seems like more and more we are saying that that C++ exceptions propagating into Rust is something done "with caution" at best

nikomatsakis (Mar 23 2020 at 20:18, on Zulip):

In some sense, the only case that we're really enabling is the case of a Rust panic propagating across foreign frames

nikomatsakis (Mar 26 2020 at 21:21, on Zulip):

@Amanieu I have some questions -- you mentioned that rethrowing a Box<ForeignException> will be translated to a Rust panic. Is that because of the ability to send the Box<dyn Any> across threads?

nikomatsakis (Mar 26 2020 at 21:21, on Zulip):

Or are there other reasons?

nikomatsakis (Mar 26 2020 at 21:21, on Zulip):

I'm wondering basically whether a catch_unwind2 API could permit catching and rethrowing foreign exceptions

nikomatsakis (Mar 26 2020 at 21:22, on Zulip):

Ah, right, it had also to do with the C++ runtime

nikomatsakis (Mar 26 2020 at 21:22, on Zulip):

(Is there any way to make that a bit more opt-in?)

Amanieu (Mar 26 2020 at 21:23, on Zulip):

@nikomatsakis See this comment: https://github.com/rust-lang/rust/pull/70212#issuecomment-602230588

Amanieu (Mar 26 2020 at 21:24, on Zulip):

I don't think any API change is necessary to support rethrowing foreign exceptions if we decide to support it in the future. We could just make Box<ForeignException> magically turn back into the original exception.

nikomatsakis (Mar 26 2020 at 21:26, on Zulip):

I see

nikomatsakis (Mar 26 2020 at 21:26, on Zulip):

I guess my question is --

nikomatsakis (Mar 26 2020 at 21:26, on Zulip):

could we set things up in such a way that

nikomatsakis (Mar 26 2020 at 21:27, on Zulip):

if you used the catch_unwind_cpp,

nikomatsakis (Mar 26 2020 at 21:27, on Zulip):

that did these extra calls,

nikomatsakis (Mar 26 2020 at 21:27, on Zulip):

but invoking that also triggers us linking in C++ library :)

nikomatsakis (Mar 26 2020 at 21:27, on Zulip):

e.g., in my ideal "mental model", it'd be in some crate where the c++ lib is a dependency

BatmanAoD (Kyle Strand) (Mar 26 2020 at 21:28, on Zulip):

Perhaps it would be better to have a compiler/linker flag saying "catch_unwind should be able to catch C++ exceptions"?

Amanieu (Mar 26 2020 at 21:29, on Zulip):

We could try doing magic with dlsymto check if __cxa_begin_catch exists at runtime.

nikomatsakis (Mar 26 2020 at 21:29, on Zulip):

BatmanAoD (Kyle Strand) said:

Perhaps it would be better to have a compiler/linker flag saying "catch_unwind should be able to catch C++ exceptions"?

yeah that's another option, I was pondering that

nikomatsakis (Mar 26 2020 at 21:29, on Zulip):

I'm just imagining that if you were trying to intermix C++ and Rust unwinding, you'd probably want your exception to have "true fidelity"

Amanieu (Mar 26 2020 at 21:30, on Zulip):

I can have a go at implementing runtime symbol lookup.

Amanieu (Mar 26 2020 at 21:31, on Zulip):

It honestly feels easier than messing with compiler options

BatmanAoD (Kyle Strand) (Mar 26 2020 at 21:33, on Zulip):

And of course much nicer from a user's perspective, I think!

simulacrum (Mar 26 2020 at 22:28, on Zulip):

I know that in a past project, I managed to get C++ exceptions working by capturing them into C++'s exception_ptr and then passing that as a pointer across the FFI boundary, panicking, catching that at the next ffi boundary, and so forth

simulacrum (Mar 26 2020 at 22:29, on Zulip):

Obviously in the ideal world that would just happen on its own, though.

simulacrum (Mar 26 2020 at 22:29, on Zulip):

(Writing those shims wasn't fun.)

nikomatsakis (Mar 28 2020 at 01:18, on Zulip):

simulacrum said:

I know that in a past project, I managed to get C++ exceptions working by capturing them into C++'s exception_ptr and then passing that as a pointer across the FFI boundary, panicking, catching that at the next ffi boundary, and so forth

interesting

Last update: Jun 05 2020 at 23:15UTC