Stream: project-error-handling

Topic: Structured Logging and One-Off Errors


view this post on Zulip Leonora Tindall (Oct 03 2020 at 18:22):

Hi all!

My employer CancerIQ currently uses Rust for all of our logic related to cancer risk assessment and clinical decision support. @Jane Lusby, your talk really inspired me to rethink the error handling story there, which I hugely appreciate. We're slowly moving from panic! everywhere and a global unwind catcher to anyhow and thiserror-based error types.

This is really great, obviously a huge improvement over the hacky system we were using before, but we still don't have a very good story around structured logging, which is really important for our other services. Right now, our Rust code is the only code that doesn't give out nicely formatted JSON logs. This is partly because our largest and most important service is using Iron 0.6, which has a rather poor logging and monitoring story, and partly because there's just not great support for structured logging in the logging ecosystem in general.

These are all problems we can solve, especially in that once Rocket 0.5 lands we will be able to move from Iron to Rocket (management doesn't like the risks involved with using a nightly Rust compiler for a production CDS service), and Rocket will have much more extensible logging. This just leaves us with the issue of how to get structured error data up to the level where it can be logged and returned with sufficient context.

With anyhow and thiserror, we generally define commonly-used error types with thiserror, such as errors that occur in our family graph logic and are common to many different pieces of logic. We then use the anyhow::Result type as the return type for the vast majority of functions and use .with_context to add context at each level. This leads to really well contextualized errors, and I can see my way towards implementing a semi-structured logging method using anyhow::Error::chain, but what I really want is a way to put in arbitrary key-value pairs along with a message and reconstitute them in a hierarchy at the top level. Obviously this is not a real API, but I'm thinking of something like:

foobar.frobnicate(baz)
    .with_context(|| error_vals!("the foobar failed to frobnicate the baz", foobar_version: foobar.version, baz_size: baz.capacity())?;

Which I could then reconstitute into a JSON object like this:

{
    "message": "the foobar failed to frobnicate the baz",
    "foobar_version": "2.3.2",
    "baz_size": 32,
    "cause": {
        "message": "foobar cache was inconsistent",
        "cache_size": 93120,
        // ... more stuff
    }
}

As I'm working in a non-embedded, memory-wide use case, I'm happy to allocate to do this; even just having a vector of tuples would be fine. My basic question is, is the direction things are moving likely to result in something that would support this as a possibility? Does this sound like a use case others are describing? If I were to build something like this for CancerIQ, would the error handling group be interested in that work?

Thanks a ton!

view this post on Zulip Jane Lusby (Oct 03 2020 at 18:34):

I think the generic member access RFC could in theory support this

view this post on Zulip Jane Lusby (Oct 03 2020 at 18:34):

you'd have a trait that does your structured logging, maybe serde Serialize

view this post on Zulip Jane Lusby (Oct 03 2020 at 18:35):

and you'd use the generic member access to upcast your errors from &dyn Error to &dyn Serialize

view this post on Zulip Leonora Tindall (Oct 03 2020 at 18:35):

Is that 2895?

view this post on Zulip Jane Lusby (Oct 03 2020 at 18:35):

you'd likely need to write your own contextualizing trait methods to replace with_context so you can create these dynamically rather than on a per error type basis

view this post on Zulip Jane Lusby (Oct 03 2020 at 18:35):

that's the one

view this post on Zulip Leonora Tindall (Oct 03 2020 at 18:39):

Okay, that makes sense! I definitely think that RFC would make this a great deal easier. I'm going to try to build a proof of concept at some point this weekend or next week and I'll share that here if that would be useful. Thanks a ton.

view this post on Zulip Jane Lusby (Oct 03 2020 at 18:45):

:heart:

view this post on Zulip Dan Johnson (Oct 03 2020 at 18:50):

I was wondering if you could emulate that by inserting fake entries in the source chain and using downcast to find them again. But it looks like there's no way to get anyhow to skip printing the fake entries when it's printing the source chain

view this post on Zulip Jane Lusby (Oct 03 2020 at 19:00):

you can

view this post on Zulip Jane Lusby (Oct 03 2020 at 19:00):

I wrote a library to do that actually

view this post on Zulip Jane Lusby (Oct 03 2020 at 19:01):

https://crates.io/crates/extracterr

view this post on Zulip Jane Lusby (Oct 03 2020 at 19:02):

@Dan Johnson thats because anyhow doesn't let you customize he report format

view this post on Zulip Jane Lusby (Oct 03 2020 at 19:02):

which is why I forked it and wrote eyre

view this post on Zulip Jane Lusby (Oct 03 2020 at 19:02):

which is anyhow but with custom report formats and context

view this post on Zulip Jane Lusby (Oct 03 2020 at 19:02):

so for example color-eyre will look for entries that have a backtrace-rs backtrace and skip printing those

view this post on Zulip Jane Lusby (Oct 03 2020 at 19:03):

and print that backtrace instead of capturing it's own

view this post on Zulip Jane Lusby (Oct 03 2020 at 19:03):

and it works on stable

view this post on Zulip Jane Lusby (Oct 03 2020 at 19:03):

this is also how tracing-error works

view this post on Zulip Jane Lusby (Oct 03 2020 at 19:10):

but the pr to add that feature to anyhow is already open

view this post on Zulip Jane Lusby (Oct 03 2020 at 19:10):

were just making sure we're sure the approach in eyre is fully baked before we put it in anyhow

view this post on Zulip nagisa (Oct 03 2020 at 20:32):

If you stick with thiserror, simply implementing serde::Serialize could give you the means to make your errors into jsons easily. You'd store the actual kvs as values in the error enum variants or structs.

view this post on Zulip nagisa (Oct 03 2020 at 20:38):

FWIW adding context with thiserror types is usually not that much more difficult than with anyhow, instead of with_context(strings live here) you use map_err(Error::Variant).

view this post on Zulip Leonora Tindall (Oct 03 2020 at 20:58):

nagisa said:

FWIW adding context with thiserror types is usually not that much more difficult than with anyhow, instead of with_context(strings live here) you use map_err(Error::Variant).

Yes, this is what we do for those cases where we have two thiserror types, using the #[source] attribute. But we don't usually want to define error types unless we see a similar message or error condition being reported in three or more places, as we generally don't know enough about the error condition to abstract it into something useful, in general.

view this post on Zulip Leonora Tindall (Oct 03 2020 at 21:00):

To me, this feels like a kind of symmetry; thiserror gives us both simple enums without fields on the variants, which are mirrored by anyhow, and enum variants with fields, which currently have no equivalent in anyhow.

view this post on Zulip nagisa (Oct 03 2020 at 22:21):

I actively strive to make sure that my error variants are reused as few times as possible. Among other things it also gives me an ability to tell from the error message exactly what part of the codebase the error comes from. Trying to reuse variants goes directly to prevent that from happening. I did describe this somewhat here.

view this post on Zulip nagisa (Oct 03 2020 at 22:23):

but yeah I don't believe anyhow gives you a way to express what you want. I could see how it could be able to, but it seems sufficiently specific to a usecase like yours that it would make sense to make it a different crate entirely.

view this post on Zulip teor (Oct 04 2020 at 09:24):

It looks like the error API you want has a similar shape to the tracing logging macros. They let you attach arbitrary context and spans to log messages.

(This isn't a recommendation to use them in your project, tracing's logging doesn't really interact with error handling.)

view this post on Zulip Charles Ellis O'Riley Jr. (Oct 05 2020 at 22:47):

Will someone assign me to nicer assert messages? Thanks

view this post on Zulip Joshua Nelson (Oct 05 2020 at 22:59):

@Charles Ellis O'Riley Jr. you might be able to use @rustbot claim

view this post on Zulip DPC (Oct 05 2020 at 23:00):

well we still want i tto bemarked on who's working on what :stuck_out_tongue:

view this post on Zulip Jane Lusby (Oct 05 2020 at 23:00):

I'll mark it down Charles

view this post on Zulip Charles Ellis O'Riley Jr. (Oct 05 2020 at 23:01):

@Jane Lusby Thanks

view this post on Zulip Jane Lusby (Oct 05 2020 at 23:01):

np, and done :D


Last updated: Jan 26 2022 at 14:20 UTC