Stream: t-compiler/major changes

Topic: Change Allocation error default behaviour compiler-team#366


triagebot (Sep 20 2020 at 18:23, on Zulip):

A new proposal has been announced: Change Allocation error default behaviour #366. It will be announced at the next meeting to try and draw attention to it, but usually MCPs are not discussed during triage meetings. If you think this would benefit from discussion amongst the team, consider proposing a design meeting.

XAMPPRocky (Sep 20 2020 at 18:26, on Zulip):

See also the discussion around hyper’s C API where this recently came up. https://github.com/hyperium/hyper/issues/2265

Josh Triplett (Sep 20 2020 at 18:27, on Zulip):

One notable constraint: if panicking because an allocation failed, the panic and unwind cannot allocate any memory.

Josh Triplett (Sep 20 2020 at 18:28, on Zulip):

Also, see https://github.com/rust-lang/rust/issues/66740 and https://github.com/rust-lang/rust/issues/66741 .

Jonas Schievink (Sep 20 2020 at 18:31, on Zulip):

This seems like a duplicate of https://github.com/rust-lang/rust/issues/66741

XAMPPRocky (Sep 20 2020 at 18:39, on Zulip):

Not quite, at least in the PR that’s purposed in that issue. That adds a new error handler for no_std environments that is panic. I’m proposing changing this line in std’s implementation from abort to panic. https://github.com/rust-lang/rust/blob/97eb606e4b2becd17d46a67d87169f52b210e67c/library/std/src/alloc.rs#L331

Jonas Schievink (Sep 20 2020 at 19:04, on Zulip):

Ah, okay. I was wondering why this was in t-compiler, but that makes sense as part of libstd implementation details.

simulacrum (Sep 20 2020 at 19:33, on Zulip):

@XAMPPRocky I think it would be helpful to include a description of why you think this is something T-compiler can change (i.e., why this is not a T-libs thread, basically)

simulacrum (Sep 20 2020 at 19:34, on Zulip):

I guess the argument is that it's not stabilized either way, but I imagine there needs to be some thought about this -- in particular, programs may be written such that they assume allocation failure would mean abort and would otherwise not be safe. (i.e., unsafe code could rely on this)

simulacrum (Sep 20 2020 at 19:35, on Zulip):

and really I would feel much better about changing this with more background etc on the topic :)

cc @Amanieu who I think has been sort of driving allocref and allocators in general recently

Amanieu (Sep 20 2020 at 19:38, on Zulip):

There are concerns that unsafe code may not handle "new" sources of unwinding properly.

Amanieu (Sep 20 2020 at 19:38, on Zulip):

Oh @simulacrum already mentioned that. I think that's really the main concern here.

XAMPPRocky (Sep 20 2020 at 19:39, on Zulip):

@simulacrum It’s an implementation detail. Someone could fork and distribute an API compatible version of rustc that panics by default rather than aborts.

simulacrum (Sep 20 2020 at 19:40, on Zulip):

Just saying "implementation detail" isn't really sufficient IMO, you need to justify why people cannot depend on this -- and it seems to me that we know of at least one type of code where people can. So you need to argue why breaking that code, potentially introducing UB, is okay / trade-off is worth it.

Amanieu (Sep 20 2020 at 19:40, on Zulip):

It's possible to work around the issue of unwinding needing to allocate memory by allocating an exception object on startup just in case of OOM. The C++ standard library already does this.

Amanieu (Sep 20 2020 at 19:41, on Zulip):

I would have to triple-check my unsafe code to see if it handles unwinding on alloc failure correctly. I think parking_lot isn't unwind-safe if allocation can unwind, but I'll have to double-check.

simulacrum (Sep 20 2020 at 19:42, on Zulip):

frankly it makes me feel very concerned about change here without some kind of opt-in or something on a relatively granular level if even "foundational" crates written by people likely aware of the possibility of this changing are potentially unsound with this

simulacrum (Sep 20 2020 at 19:43, on Zulip):

at the very least we'd need an audit of std, and that seems quite hard.

Amanieu (Sep 20 2020 at 19:44, on Zulip):

I do agree on principle that unwinding is a great way to handle OOM situations: the destructors will free any memory used by the "task" that caused the OOM, which usually allows the rest of the process to continue. Most OOM situations happen due to a single excessively large allocation (e.g. due to bad user input) rather than from actual address space exhaustion.

simulacrum (Sep 20 2020 at 19:45, on Zulip):

It'd be interesting for me to see the code size / performance impact of those additional unwinding edges

Amanieu (Sep 20 2020 at 19:47, on Zulip):

It's probably insignificant in a panic=unwind build since there are many functions that the compiler has to assume can unwind.

Amanieu (Sep 20 2020 at 19:50, on Zulip):

I had a quick look at parking_lot: I don't think it's unsound, but it will leave some locks permanently locked because the internal code uses manual lock/unlock instead of lock guards.

XAMPPRocky (Sep 21 2020 at 05:51, on Zulip):

@simulacrum Well I'll leave how to change the default up to the compiler team, but I could see this as an edition change to have that kind of "opt-in" you mentioned.

you need to justify why people cannot depend on this

This is not my motivation. If you wanted to depend on aborting for OOM errors you should be able to, this is more about it being the default. My motivation is that this default is very unintuitive for using Rust from other languages, and is counter to Rust's goals of having parity with C in terms of interop. With the current default you cannot ship stable libraries that don't abort on OOM, which makes Rust a non-starter for use in a lot of large mission critical systems, and even if you could requiring every Rust C FFI API to expose a unset_oom_abort or equivalent functionality to override the hook before use wouldn't be a good situation for library authors or users either.

simulacrum (Sep 21 2020 at 11:44, on Zulip):

My point is that the proposal as it stands doesn't talk about the - seemingly quite real, based on what @Amanieu says and my recollection of std - possibility of a change in defaults here introducing unsoundness or at least incorrect behavior into existing code.

Given that I'd personally say this needs a full RFC to discuss various tradeoffs it's making and lay out the rationale in more detail than the current short proposal.

nikomatsakis (Sep 21 2020 at 14:55, on Zulip):

In my opinion this kind of choice is not an implementation detail

nikomatsakis (Sep 21 2020 at 14:55, on Zulip):

I think that it should be a T-libs choice, with T-lang "kept informed"

nikomatsakis (Sep 21 2020 at 14:56, on Zulip):

At least if I'm understanding, this feels like a pretty user visible change

XAMPPRocky (Sep 21 2020 at 16:03, on Zulip):

Okay, I’m going to close this MCP for now then. Thanks for the feedback everyone!

Josh Triplett (Sep 21 2020 at 21:16, on Zulip):

FWIW, there was an interesting conversation about allocation in the Linux Plumbers Conference discussions around Rust in the Linux kernel. Linux has some different ways to allocate memory, and it may be important to handle allocation failure in better ways, as well as providing ways to affect how memory is allocated (e.g. "is it safe to go try to free up some memory before returning from the allocator").

Lokathor (Sep 21 2020 at 21:43, on Zulip):

Yeah, Rust certainly needs to improve in this area.

The fact that code might be unsound in the presence of unwinding allocation failure instead of aborting allocation failure really just means that such code needs to be found and fixed. It's essentially just "Not Acceptable" for abort-on-OOM to be the eternal situation. It's fine if this is a long term project that takes N years to transition instead of a 1 month PR, but we should slowly drive in this direction.

Particularly because you can change the oom hook (only on Nightly, yes yes) and then the unsound code would just plain become unsound. Having that outcome from two distant pieces of code both thinking they have enough knowledge of how a global thing works is just a bad time.

simulacrum (Sep 22 2020 at 00:09, on Zulip):

@Lokathor I definitely agree, especially given the oom hook on nightly, but I think my perspective is that yes we might want this but it needs more thought than an MCP can give, and likely RFC style feedback.

Lokathor (Sep 22 2020 at 00:10, on Zulip):

yeah yeah, probably the most an MCP could do here directly is found a group to begin the long term work and position us for the change later.

RalfJ (Oct 04 2020 at 11:12, on Zulip):

Also note that the answer is not as easy as "change it on the next edition" or "application-level opt-in"; neither of these do anything to protect against crates somewhere in the dependency tree getting this wrong.

Lokathor (Oct 04 2020 at 17:59, on Zulip):

Yeah, this absolutely can't be edition related.

Last update: May 07 2021 at 07:45UTC