Stream: project-error-handling

Topic: stack backtrace formatting


view this post on Zulip Nick Cameron (Aug 12 2021 at 22:28):

Some spitballing kind of ideas here. Starting with an example which is just fn main() { Err(...).unwrap(); }

view this post on Zulip Nick Cameron (Aug 12 2021 at 22:29):

Running this with RUST_BACKTRACE gives:

view this post on Zulip Nick Cameron (Aug 12 2021 at 22:29):

stack backtrace:
   0: rust_begin_unwind
             at /rustc/ccffcafd55e58f769d4b0efc0064bf65e76998e4/library/std/src/panicking.rs:517:5
   1: core::panicking::panic_fmt
             at /rustc/ccffcafd55e58f769d4b0efc0064bf65e76998e4/library/core/src/panicking.rs:93:14
   2: core::result::unwrap_failed
             at /rustc/ccffcafd55e58f769d4b0efc0064bf65e76998e4/library/core/src/result.rs:1617:5
   3: core::result::Result<T,E>::unwrap
             at /rustc/ccffcafd55e58f769d4b0efc0064bf65e76998e4/library/core/src/result.rs:1299:23
   4: hello::main
             at ./src/main.rs:11:5
   5: core::ops::function::FnOnce::call_once
             at /rustc/ccffcafd55e58f769d4b0efc0064bf65e76998e4/library/core/src/ops/function.rs:227:5

I would like to see

stack backtrace:
   0: core::result::Result<T,E>::unwrap
             at <core>/result.rs:1299:23
   1: hello::main
             at ./src/main.rs:11:5

view this post on Zulip Nick Cameron (Aug 12 2021 at 22:34):

We already do some trimming (c.f., RUST_BACKTRACE=full), but I think we should do more!

view this post on Zulip Nick Cameron (Aug 12 2021 at 22:36):

Could we do more? (Specifically here, I want to remove more stack frames which I don't think are generally useful, and remove parts of paths to some libs). Anyone have thoughts?

view this post on Zulip Jane Lusby (Aug 12 2021 at 22:37):

I like the idea

view this post on Zulip Jane Lusby (Aug 12 2021 at 22:38):

though I wonder if we should keep the frame numbers the same and add a message indicating that some frames were filtered out

view this post on Zulip Jane Lusby (Aug 12 2021 at 22:38):

so intead it would be something like

view this post on Zulip Jane Lusby (Aug 12 2021 at 22:38):

stack backtrace:
   3: core::result::Result<T,E>::unwrap
             at <core>/result.rs:1299:23
   4: hello::main
             at ./src/main.rs:11:5
   4 Frames omitted, use `RUST_BACKTRACE=full` to see a complete backtrace

view this post on Zulip Jane Lusby (Aug 12 2021 at 22:39):

alternatively

view this post on Zulip Jane Lusby (Aug 12 2021 at 22:39):

we might want to add support for this, but not make it the default

view this post on Zulip Jane Lusby (Aug 12 2021 at 22:40):

i imagine some people might be upset if they expect to some frames and they capture a backtrace on a hard to reproduce error but then they're missing some frames that were important

view this post on Zulip Jane Lusby (Aug 12 2021 at 22:40):

though I don't see how any of the frames we'd filter here would be important

view this post on Zulip Jane Lusby (Aug 12 2021 at 22:40):

so probably not a risk, but worth being careful about

view this post on Zulip Nick Cameron (Aug 12 2021 at 22:43):

"though I wonder if we should keep the frame numbers the same and add a message indicating that some frames were filtered out" - that would be inconsistent with the filtering we currently do

view this post on Zulip Nick Cameron (Aug 12 2021 at 22:43):

"we might want to add support for this, but not make it the default" - we already filter quite aggressively by default, so I don't think that is a risk

view this post on Zulip Nick Cameron (Aug 12 2021 at 22:44):

oh, maybe you are referring to the path shortening?

view this post on Zulip Jane Lusby (Aug 12 2021 at 22:44):

Nick Cameron said:

"though I wonder if we should keep the frame numbers the same and add a message indicating that some frames were filtered out" - that would be inconsistent with the filtering we currently do

ah, fair

view this post on Zulip Jane Lusby (Aug 12 2021 at 22:45):

no, no, I just forgot that we already filter the frames quite a bit

view this post on Zulip Jane Lusby (Aug 12 2021 at 22:45):

I'm thinking about the filtering I have setup in color-eyre

view this post on Zulip Nick Cameron (Aug 12 2021 at 22:45):

Maybe even:

stack backtrace:
   0: core::result::Result<T,E>::unwrap at <core>/result.rs:1299:23
   1: hello::main at ./src/main.rs:11:5

I think we only used two lines because of historical reasons - i.e., with the 'full' backtrace there is more data more entry

view this post on Zulip Nick Cameron (Aug 12 2021 at 22:46):

although I guess the alignment makes things easier to read

view this post on Zulip Jane Lusby (Aug 12 2021 at 22:46):

yea I prefer the first version you showed

view this post on Zulip Jane Lusby (Aug 12 2021 at 22:46):

Screenshot-from-2021-08-12-15-46-13.png

view this post on Zulip Jane Lusby (Aug 12 2021 at 22:47):

was where my suggestions were coming from

view this post on Zulip Jane Lusby (Aug 12 2021 at 22:48):

I feel like it's pretty safe to filter out the panicking frames

view this post on Zulip Jane Lusby (Aug 12 2021 at 22:48):

not confident that core::ops::function::FnOnce::call_once is something we should filter out

view this post on Zulip Jane Lusby (Aug 12 2021 at 22:49):

tho that pretty much just indicates you're calling a closure, and the closure itself has its own stack frame I think

view this post on Zulip Jane Lusby (Aug 12 2021 at 22:49):

let me test rq

view this post on Zulip Nick Cameron (Aug 12 2021 at 22:51):

Yeah, the closure is an implementation detail of the __rust_begin_short_backtrace machinery I think, IMO, the stacktrace should start at main, and anything outside main is stuff the user doesn't care about

view this post on Zulip Nick Cameron (Aug 12 2021 at 22:51):

At least, doesn't usually care about

view this post on Zulip Jane Lusby (Aug 12 2021 at 22:51):

hmm

view this post on Zulip Jane Lusby (Aug 12 2021 at 22:52):

fn main() {
    call_indirect(||  Err::<(), _>("").unwrap());
}

fn call_indirect<F: FnOnce()>(f: F) {
    f();
}

view this post on Zulip Jane Lusby (Aug 12 2021 at 22:52):

would you expect this to have the call_once frame twice?

view this post on Zulip Jane Lusby (Aug 12 2021 at 22:52):

once for the unwrap machinery and once for the call inside of call_indirect?

view this post on Zulip Jane Lusby (Aug 12 2021 at 22:52):

I would have, but it doesn't

view this post on Zulip Nick Cameron (Aug 12 2021 at 22:56):

I'm not sure of the rules for when call_once is actually invoked

view this post on Zulip Nick Cameron (Aug 12 2021 at 22:56):

It doesn't seem to be for all closures

view this post on Zulip Nick Cameron (Aug 12 2021 at 22:56):

Or rather it doesn't seem to be for closures at all, as in your example

view this post on Zulip Nick Cameron (Aug 12 2021 at 22:57):

fn main() {
    call_indirect(foo);
}

fn foo() {
    Err::<(), _>("").unwrap()
}

fn call_indirect<F: FnOnce()>(f: F) {
    f();
}

Has two frames for call_once

view this post on Zulip Jane Lusby (Aug 12 2021 at 22:57):

20:     0x560e8220a97b - core::ops::function::FnOnce::call_once::hd68444e2cc8c724f
                               at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/core/src/ops/function.rs:227:5
  21:     0x560e8220a6ee - std::sys_common::backtrace::__rust_begin_short_backtrace::h4853b5a3266caa3c
                               at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/sys_common/backtrace.rs:125:18
  22:     0x560e8220a771 - std::rt::lang_start::{{closure}}::h497f58961ebf7e05
                               at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/rt.rs:49:18
  23:     0x560e8221b0f9 - core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once::h2aabc384aab89b7b
                               at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/core/src/ops/function.rs:259:13
  24:     0x560e8221b0f9 - std::panicking::try::do_call::hc5fcacb7a85fc7b1
                               at /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/panicking.rs:401:40

view this post on Zulip Jane Lusby (Aug 12 2021 at 22:58):

oooooo

view this post on Zulip Nick Cameron (Aug 12 2021 at 22:58):

So I think it must only be invoked for named functions (like foo in this example, or main in the case of the stack trace stuff)

view this post on Zulip Jane Lusby (Aug 12 2021 at 22:58):

yea

view this post on Zulip Jane Lusby (Aug 12 2021 at 22:58):

that makes sense

view this post on Zulip Jane Lusby (Aug 12 2021 at 22:58):

that's interesting!

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:00):

and the foo frame is intact, so the call_once frame seems redundant

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:00):

yea, i say do it

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:00):

filter the shit out of those frames!

view this post on Zulip Nick Cameron (Aug 12 2021 at 23:00):

:-)

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:03):

Do we apply this same filtering to std::backtrace::Backtrace?

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:05):

looks like we do

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:09):

acutally maybe not

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:11):

ah yea, capturing it manually will only see the __rust_begin_short_backtrace frame

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:12):

but not the __rust_end_short_backtrace frame

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:12):

and it completely ignores them / displays them normally regardless

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:14):

pub fn __rust_begin_short_backtrace<F, T>(f: F) -> T
where
    F: FnOnce() -> T,
{
    let result = f();

    // prevent this frame from being tail-call optimised away
    crate::hint::black_box(());

    result
}

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:14):

#[cfg(not(test))]
#[lang = "start"]
fn lang_start<T: crate::process::Termination + 'static>(
    main: fn() -> T,
    argc: isize,
    argv: *const *const u8,
) -> isize {
    lang_start_internal(
        &move || crate::sys_common::backtrace::__rust_begin_short_backtrace(main).report(),
        argc,
        argv,
    )
    .into_ok()
}

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:14):

so it's passing main in to a function expecting an FnOnce

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:15):

maybe we can get rid of the call_once by just changing __rust_begin_short_backtrace to take a function pointer

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:18):

I expect this will have to work differently from the current backtrace filtering

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:19):

right now its just filtering out the start and the stop of the backtrace, but never frames in the middle of the output

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:19):

and we cant filter out stuff related to Result using these named frames because then it would show up a bunch from user code

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:20):

what's unwrap_failed...

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:21):

oh, that's just to deal with monomorphization code bloat

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:21):

can that show up more than once in a stack frame?

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:21):

no right?

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:21):

like, you can only panic once

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:22):

otherwise, boom

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:22):

either way, that's gonna be somewhat error prone I expect, if we don't have the begin_short_backtrace function called in the centralized points where all panics flow through

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:23):

and instead have to call it inside of the various functions that can panick

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:23):

but it might not actually be an issue if __rust_begin_short_backtrace shows up more than once?

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:24):

I think it boils down to do we want to minimize the number of comparisons we have to do to filter out frames?

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:24):

alternatively, is there some magical compiler way we could mark a frame as unimportant

view this post on Zulip Nick Cameron (Aug 12 2021 at 23:24):

"maybe we can get rid of the call_once by just changing __rust_begin_short_backtrace to take a function pointer" I think that would just use Fn::call rather than FnOnce::call_once

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:25):

Nick Cameron said:

"maybe we can get rid of the call_once by just changing __rust_begin_short_backtrace to take a function pointer" I think that would just use Fn::call rather than FnOnce::call_once

just tested, seems to work

view this post on Zulip Nick Cameron (Aug 12 2021 at 23:25):

I think we would need a multi-pronged solution: I think we have an off by one error at the two short backtrace marker functions, which we can solve by effectively add ing +1/-1

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:25):

#![feature(backtrace)]

fn main() {
    // std::env::set_var("RUST_BACKTRACE", "full");
    let backtrace = std::backtrace::Backtrace::capture();
    println!("Backtrace:\n{}", backtrace);
    call_indirect(foo);
}

fn foo() {
    Err::<(), _>("").unwrap()
}

fn call_indirect(f: fn()) {
    f();
}

view this post on Zulip Nick Cameron (Aug 12 2021 at 23:25):

what kind of type is fn()?!

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:26):

that's a function pointer

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:26):

not a trait object

view this post on Zulip Nick Cameron (Aug 12 2021 at 23:26):

I tried f: &impl Fn

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:26):

yea, that's just a diff trait object

view this post on Zulip Nick Cameron (Aug 12 2021 at 23:26):

lol

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:26):

the issue with function pointers is they can't be closures that capture anything

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:26):

because you can't have associated state

view this post on Zulip Nick Cameron (Aug 12 2021 at 23:26):

I wonder if that would have back compat issues?

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:26):

i don't think so, because main has never been a closure

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:27):

and this is only ever called with main

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:27):

and it's outside of user code

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:27):

let me double check that second statement tho

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:27):

i wouldn't be surprised if test code uses this

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:27):

but that shouldn't be an issue

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:28):

ah shit

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:28):

this is used by std::thread

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:28):

so we'd have to make a second start marker

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:28):

one for function pointers that gets used at main and the old one for function trait objects that gets used by thread

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:29):

also used by the benchmark runner

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:29):

and test runner

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:29):

though those can probably switch to the function pointer version

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:29):

Nick Cameron said:

I think we would need a multi-pronged solution: I think we have an off by one error at the two short backtrace marker functions, which we can solve by effectively add ing +1/-1

thats an interesting approahc

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:30):

I'm vaguely worried about having different call graphs that can happen off of either of these two marker functions

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:30):

where in one situation it would have +1 extra frame and in the other it would have +2

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:30):

would want to be exceedingly careful to make sure we don't end up filtering out their main function or something

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:31):

or god forbid, the actual frame where the panic was created

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:31):

so we would probably want to add a set of tests that check the backtrace contains the expected start and end frames when a panic is invoked inside of a main function, a test, a benchmark, or a thread

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:32):

and I worry about a future path being added where the logic is wrong

view this post on Zulip Nick Cameron (Aug 12 2021 at 23:36):

Yeah, I think you would have to change the internals of __rust_begin_short_backtrace to make the logic wrong, so that doesn't seem too risky (just needs a comment). Not sure about __rust_end_short_backtrace though. And yeah, tests would be necessary

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:36):

oh, i thought you meant something like change the backtrace print logic so it only starts printing the frame or two after it sees the __rust_begin frame

view this post on Zulip Nick Cameron (Aug 12 2021 at 23:36):

I thinkthe other prong we could use would be to add an attribute to skip a function in backtraces. I think that would be useful for futures stuff too

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:37):

Nick Cameron said:

I thinkthe other prong we could use would be to add an attribute to skip a function in backtraces. I think that would be useful for futures stuff too

that does sound like the ideal solution

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:37):

especially if it lets users mark frames in their code as unimportant

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:37):

though I have no idea how hard it would be to implement something like that

view this post on Zulip Nick Cameron (Aug 12 2021 at 23:38):

"oh, i thought you meant something like change the backtrace print logic so it only starts printing the frame or two after it sees the __rust_begin frame" - Yeah, that's what I meant, but I don't think it is easier to break the logic because the call_once comes from inside the _rust_begin_short_backtrace implementation

view this post on Zulip Nick Cameron (Aug 12 2021 at 23:38):

"though I have no idea how hard it would be to implement something like that" - yeah, I have no idea about the implementation :-)

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:44):

Nick Cameron said:

"oh, i thought you meant something like change the backtrace print logic so it only starts printing the frame or two after it sees the __rust_begin frame" - Yeah, that's what I meant, but I don't think it is easier to break the logic because the call_once comes from inside the _rust_begin_short_backtrace implementation

ah yea, the begin side seems easy to reason about

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:44):

i guess we just need to double check the end side

view this post on Zulip Jane Lusby (Aug 12 2021 at 23:45):

i would love to see the begin and end filtering also apply to std::backtrace::Backtrace


Last updated: Jan 26 2022 at 13:32 UTC