Stream: project-ffi-unwind

Topic: Is `unwind` any kind of abnormal exit?


view this post on Zulip nagisa (Sep 25 2021 at 11:32):

For the purposes of this project and all of its outputs, are all kinds of abnormal exits (unwind, longjmp, "signal handler manually fiddles with program counter and stack pointer", "kernel manually fiddles with program counter and stack pointer" ...) considered an "unwind"? Most interesting to me is MSVC, which implements longjmp as an unwind, whereas GNU does not. Do we consider longjmp an unwind everywhere? nowhere? only on msvc?

view this post on Zulip nagisa (Sep 25 2021 at 11:33):

cc @Nikita Popov

view this post on Zulip bjorn3 (Sep 25 2021 at 11:39):

setjmp/longjmp is declared UB in rust when jumping over frames with destructors according to the RFC. It doesn't matter if it is internally implemented as unwinding (windows) or register restore (unix).

view this post on Zulip nagisa (Sep 25 2021 at 11:46):

In our case of interest are the abort-on-unwind drops. Do we think it is always UB to longjmp out of drop glue?

view this post on Zulip nagisa (Sep 25 2021 at 11:47):

Relatedly, what about longjmping _into_ a frame with destructors, not over it?

view this post on Zulip bjorn3 (Sep 25 2021 at 11:59):

You can't longjmp into a function call. You can only go up to callers, not down to previously returned functions as the stack has been clobbered.

view this post on Zulip nagisa (Sep 25 2021 at 12:40):

I don't think jumping into a function call is happening in what I'm asking about? More specifically I'm talking about a case like

struct LongJmp; impl Drop for LongJmp { fn drop(&mut self) { longjmp() } }

fn has_destructors() {
     setjmp();
     drop(LongJmp); // jumps out of drop into a frame with destructors (but not over)
}

if that makes sense?

view this post on Zulip nagisa (Sep 25 2021 at 12:40):

or whatever equivalents.

view this post on Zulip bjorn3 (Sep 25 2021 at 12:47):

I see. Not sure.

view this post on Zulip Connor Horman (Sep 25 2021 at 12:54):

In C++ that's not undefined behaviour, only jumping over. I'd assume the same for Rust, tbh.

view this post on Zulip Connor Horman (Sep 25 2021 at 12:56):

Although it would require rust exposing setjmp, since it's not a function and you can't grab it with FFI.

view this post on Zulip Connor Horman (Sep 25 2021 at 12:59):

(And if you tried to do a wrapper function, it would have to be of the form of a with_setjmp that takes a closure, since once you've returned from a function, it's UB to longjmp back into it)
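
For illustration (not from the thread), the closure-based shape described above can be sketched roughly as follows. This is only a sketch of the API shape: `JumpPoint`, `Jump`, and this `with_setjmp` are hypothetical names, and the non-local exit is emulated with an ordinary Rust unwind (`resume_unwind` / `catch_unwind`) as a stand-in. It does not call the real setjmp/longjmp, and unlike a real longjmp it does run destructors on the way out.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical handle standing in for a `jmp_buf`. It is only ever lent to
/// the closure, so it cannot outlive the frame that created it.
pub struct JumpPoint {
    _private: (),
}

/// Hypothetical payload carried by the stand-in "jump".
struct Jump(i32);

impl JumpPoint {
    /// Stand-in for `longjmp(buf, val)`: never returns to the caller.
    /// ASSUMPTION: emulated with a Rust unwind, not a real longjmp.
    pub fn jump(&self, val: i32) -> ! {
        panic::resume_unwind(Box::new(Jump(val)))
    }
}

/// Runs `body` with a jump point that is valid only while `body` executes.
/// Returns `Ok` if `body` returned normally, or `Err(val)` if it "jumped".
pub fn with_setjmp<R>(body: impl FnOnce(&JumpPoint) -> R) -> Result<R, i32> {
    let point = JumpPoint { _private: () };
    match panic::catch_unwind(AssertUnwindSafe(|| body(&point))) {
        Ok(value) => Ok(value),
        Err(payload) => match payload.downcast::<Jump>() {
            Ok(jump) => Err(jump.0),
            // Not our stand-in jump: let the unrelated panic keep propagating.
            Err(other) => panic::resume_unwind(other),
        },
    }
}
```

Because the closure receives only a borrowed `&JumpPoint`, the handle cannot be stored and used after `with_setjmp` has returned, which is the property the wrapper needs to rule out jumping back into a finished frame.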

view this post on Zulip nagisa (Sep 25 2021 at 13:30):

Does C++ still allow unwinding from their destructors? Or are they nounwind?

view this post on Zulip nagisa (Sep 25 2021 at 13:31):

And if it doesn't why does that work on msvc then?

view this post on Zulip Connor Horman (Sep 25 2021 at 13:36):

nagisa said:

Does C++ still allow unwinding from their destructors? Or are they nounwind?

C++ doesn't treat unwinds from longjmp the same as exceptions, but yes to both. However, most destructors are noexcept (which terminates if they exit via an exception), and exiting a destructor called as a result of stack unwinding via an exception also terminates.

view this post on Zulip Connor Horman (Sep 25 2021 at 13:39):

http://eel.is/c++draft/csetjmp.syn#2

The function signature longjmp(jmp_buf jbuf, int val) has more restricted behavior in this document. A setjmp/longjmp call pair has undefined behavior if replacing the setjmp and longjmp by catch and throw would invoke any non-trivial destructors for any objects with automatic storage duration. A call to setjmp or longjmp has undefined behavior if invoked in a suspension context of a coroutine ([expr.await]).

view this post on Zulip Connor Horman (Sep 25 2021 at 13:40):

Nothing about longjmping out of destructors, so it falls back to the definition in C.

view this post on Zulip nagisa (Sep 25 2021 at 13:45):

Hm, this still doesn't directly say if in C++ setjmp/longjmp are considered to be a form of unwinding (though if I had to hazard a guess, unwinding is an implementation detail in the eyes of the C++ standard and not something that needs to be specified; there are only two references to unwinding and neither is a specification of what it is).

view this post on Zulip nagisa (Sep 25 2021 at 13:48):

Here's another example

struct LongJmp; impl Drop for LongJmp { fn drop(&mut self) { longjmp() } }
struct Boom; impl Drop for Boom { fn drop(&mut self) { abort() } }

fn has_destructors() {
     setjmp();
     let boom = Boom;
     let guard = LongJmp;
     // cleanup here jumps out of drop into a frame with destructors (but not over it)
}

As per the C++ specification this is UB because replacing the jumps with throw/catch would invoke a destructor for Boom, I believe.

view this post on Zulip nagisa (Sep 25 2021 at 13:51):

This is something our POF-based definition doesn't capture.

view this post on Zulip nagisa (Sep 25 2021 at 13:53):

and for -Zpanic-in-drop=abort even the first example would be UB as per the Rust definition AFAICT, for implementing abort-on-panic involves adding an equivalent of catch_unwind, making drop glue a non-POF as per https://github.com/rust-lang/rfcs/blob/master/text/2945-c-unwind-abi.md#plain-old-frames
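
For illustration (not from the thread), the "equivalent of catch_unwind" can be pictured as a wrapper along these lines. This is a conceptual sketch only: `drop_or_abort` is a hypothetical helper, and it is an assumption that this resembles what the compiler would emit for the flag. It only shows why such a frame contains a catch point and so stops being a POF.

```rust
use std::panic::{self, AssertUnwindSafe};
use std::process;

/// Hypothetical helper: runs the destructor of `value` and converts any
/// unwind that escapes it into a process abort.
/// ASSUMPTION: a library-level approximation, not the real drop glue.
pub fn drop_or_abort<T>(value: T) {
    // The catch point below is what makes this frame a non-POF.
    let outcome = panic::catch_unwind(AssertUnwindSafe(move || drop(value)));
    if outcome.is_err() {
        // An unwind tried to leave the destructor: abort instead of propagating.
        process::abort();
    }
}
```

A destructor that returns normally is unaffected by the wrapper; only an unwind leaving the destructor reaches the abort.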

view this post on Zulip nagisa (Sep 25 2021 at 13:55):

Connor Horman said:

a destructor called as a result of stack unwinding via an exception also terminates.

Isn't that like double-panic in Rust? This is reasonable and understandable behaviour.

view this post on Zulip nagisa (Sep 25 2021 at 13:56):

-Zpanic-in-drop=abort restricts that kind of behaviour to any panic in drop/destructor being an abort (or std::terminate() in C++ parlance), even if not currently unwinding

view this post on Zulip nagisa (Sep 25 2021 at 14:02):

nagisa said:

This is something our POF-based definition doesn't capture.

And we cannot really use the C++-like definition either because of -Cpanic=abort

view this post on Zulip BatmanAoD (Kyle Strand) (Sep 25 2021 at 18:34):

So that everyone has a bit of context, here's the blog post announcing that this project group would be exploring longjmp: https://blog.rust-lang.org/inside-rust/2021/01/26/ffi-unwind-longjmp.html

nagisa said:

For the purposes of this project and all of its outputs, are all kinds of abnormal exits... considered an "unwind"? ... Do we consider longjmp an unwind everywhere? nowhere? only on msvc?

No, and nowhere. We did, however, broaden our charter to bring longjmp into our purview. Currently, all interactions between longjmp and Rust frames should be considered undefined, because they have not been specified in any way, but as @bjorn3 noted, RFC-2945 only makes longjmp over non-POF frames (which generally means frames with destructors) explicitly (and permanently) undefined.

nagisa said:

In our case of interest are the abort-on-unwind drops. Do we think it is always UB to longjmp out of drop glue?

I'm not sure what you mean here; abort will never call drop, abort-on-unwind should never trigger a longjmp even if you call longjmp in a drop function. But yes, I do expect that longjmp out of a drop function would be UB in the non-abort case.

@Connor Horman I'm not sure that longjmp inside a destructor would be well-defined in C++ either, actually. The fact that throwing from a destructor causes the runtime to terminate doesn't seem to me to imply that the behavior of longjmp in the same context would "fall back to" that of the ISO C; I don't believe there's any such rule for interpreting the standard, since C++ has different semantics from C in quite a few places.

I've heard of several people in fact using longjmp in Rust; that's a large part of the motivation for exploring how to make that well-defined. I would assume that some kind of closure is indeed how longjmp is being invoked, but I haven't looked at actual code examples.

nagisa said:

This is something our POF-based definition doesn't capture.

Anything not captured by the POF definition should be treated with extreme caution. As I said above, even a simple longjmp over POFs is still technically UB in Rust simply because there's no specification for it yet. Additionally, per the above blogpost, we are leaning toward restricting longjmp further by introducing a new annotation required to make the behavior well defined.

view this post on Zulip Connor Horman (Sep 25 2021 at 18:49):

The fact that throwing from a destructor causes the runtime to terminate doesn't seem to me to imply that the behavior of longjmp in the same context would "fall back to" that of the ISO C; I don't believe there's any such rule for interpreting the standard, since C++ has different semantics from C in quite a few places.

Functions defined by the C standard have the behaviour specified by cross-reference to ISO 9899:2018 (for ISO 14882:2020 and latest draft of the C++ standard), so except where the behaviour is altered by the C++ standard, it has the behaviour specified in the normative cross-reference.

view this post on Zulip Connor Horman (Sep 25 2021 at 18:52):

Side note: it's not throwing from a destructor that causes termination - it's specifically throwing during stack unwinding.
This code will throw an exception, for example, rather than terminate:

#include <stdexcept>

void foo(){
    struct S{
        // noexcept(false) is required: C++ specifies that destructors without an
        // exception specification use the same one as the implicitly-declared version
        ~S() noexcept(false)
        {
            // std::runtime_error is used because std::exception has no standard string constructor
            throw std::runtime_error{"Destroyed S"};
        }
    };
    S s;
}   // Exception is thrown here and unwinds the stack if caught. No std::terminate call, no undefined behaviour

Same as how Rust lets you panic!() from a drop impl, but only panicking during another panic aborts (although that's earlier than in C++, since it happens immediately at the second panic, rather than when unwinding out of the destructor).
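
For illustration (not from the thread), a minimal Rust sketch of the first half of that comparison: a panic raised in a drop impl while no other panic is in flight is an ordinary unwind and can be caught. `PanicsOnDrop` and `drop_panic_is_catchable` are hypothetical names, and the sketch assumes the default `-Cpanic=unwind` without `-Zpanic-in-drop=abort`.

```rust
use std::panic;

/// Hypothetical type whose destructor always panics.
struct PanicsOnDrop;

impl Drop for PanicsOnDrop {
    fn drop(&mut self) {
        panic!("panic from drop");
    }
}

/// Returns true if the panic raised in `drop` was caught as an ordinary unwind.
/// ASSUMPTION: built with the default unwinding panic strategy.
pub fn drop_panic_is_catchable() -> bool {
    let result = panic::catch_unwind(|| {
        let _guard = PanicsOnDrop;
        // `_guard` is dropped at the end of this closure, with no other
        // panic in flight, so its panic unwinds normally.
    });
    result.is_err()
}
```

The aborting case (a second panic while one is already unwinding) is deliberately left out, since it would terminate the process rather than return a value.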

view this post on Zulip BatmanAoD (Kyle Strand) (Sep 25 2021 at 18:55):

But C has no rules about what happens "during unwinding", so this behavior isn't defined there either.

view this post on Zulip BatmanAoD (Kyle Strand) (Sep 25 2021 at 18:56):

It's not even clear to me what the expected behavior would be.

view this post on Zulip BatmanAoD (Kyle Strand) (Sep 25 2021 at 18:57):

E.g. would unwinding stop at the setjmp or continue up the stack?

view this post on Zulip nagisa (Sep 25 2021 at 20:12):

BatmanAoD (Kyle Strand) said:

...
I'm not sure what you mean here; abort will never call drop, abort-on-unwind should never trigger a longjmp even if you call longjmp in a drop function. But yes, I do expect that longjmp out of a drop function would be UB in the non-abort case.
...

I mean -Zpanic-in-drop=abort here, sorry if I wasn't clear. Basically the expected behaviour with that flag enabled is that unwinding from a drop gets converted to an abort, regardless of what the program-wide -Cpanic setting is. Was wondering what is the behaviour we'd like to prescribe here for longjmp, but since we want to never consider longjmp to be an unwind, that question is moot.

view this post on Zulip BatmanAoD (Kyle Strand) (Sep 25 2021 at 20:40):

Well, unfortunately, we can't really prescribe a behavior other than "undefined" in a lot of longjmp cases, since a longjmp can occur pretty much anywhere. That's one major reason for wanting an annotation to limit where well-defined longjmp can happen, and leaving it UB everywhere else.

view this post on Zulip nagisa (Sep 25 2021 at 20:59):

Yeah, that's fine. I was more worried about us having some definition here ^^ No definition is actually easier to work with today ^^

view this post on Zulip Jubilee (Sep 25 2021 at 23:36):

So as I understand this, even if longjmp was permitted to touch Rust, since it can't go anywhere too exciting and has to use a normal entry point, it would basically be semantically equivalent to emitting a call opcode, just with an arbitrary selection of actual opcodes?

view this post on Zulip BatmanAoD (Kyle Strand) (Sep 27 2021 at 00:29):

I'm not sure I understand the question, possibly because I know hardly anything about the LLVM language and opcodes. What do you mean by "a normal entry point"? If the question is whether longjmp can be thought of as unwinding without landing pads, that's probably entirely true on Windows (where forced_unwind actually does just skip landing pads), and I don't see any immediate problems with that mental model for other OSes.

view this post on Zulip Jubilee (Oct 06 2021 at 21:08):

The sense of "entry point" I meant is that it has to go to a block that would have already been "semantically addressable" within Rust. Like the start of a function is the canonical place you can "address", but yes, also initiating an unwinding procedure via panic!, or other join-points in control flow that you can reach via loops and breaks... in other words it can't do "jump into a random point in Rust control flow and thus potentially skip over meaningful steps", it would have to go to one of the existing join-points.

view this post on Zulip BatmanAoD (Kyle Strand) (Oct 06 2021 at 21:41):

I feel like I'm treading into dangerous waters attempting to answer this, but yes, I would expect that to be a standard limitation on longjmp regardless of the caller's language. setjmp must be called first, and longjmp can only go back to where setjmp was called.

view this post on Zulip Jubilee (Oct 06 2021 at 22:44):

Cool cool. I will not take your words as gospel but rather am just trying to begin to form an intuition.


Last updated: Jan 26 2022 at 08:46 UTC