Stream: t-libs/wg-allocators

Topic: failable allocators and collections


Ryan Levick (Jun 11 2019 at 08:42, on Zulip):

Hey all. I've been searching online and trying to find the current state of "failable allocators" (specifically around collections) but I'm coming up empty. I need to be able to gracefully handle OOM errors. What is the current state of being able to do so? Is there a crate that exports the proposed "try" methods for collections (e.g., "try_push" for Vec)? Is it possible to write such a crate without having to reinvent the std lib collection implementations from scratch?

Mike Hommey (Jun 11 2019 at 08:50, on Zulip):

There's the mp4parse_fallible crate, which should be updated to use the now stable global allocator functions to reallocate...

Simon Sapin (Jun 11 2019 at 08:51, on Zulip):

https://github.com/rust-lang/rust/issues/46865#issuecomment-500241534 summarizes the status for fallible allocation and links relevant tracking issues

Simon Sapin (Jun 11 2019 at 08:52, on Zulip):

try_reserve is in Nightly. Presumably try_push can be implemented as self.try_reserve(1); self.push(x), and similarly for other methods

Simon Sapin (Jun 11 2019 at 08:53, on Zulip):

But on stable yeah, it’s gonna be copy-paste more than reinvent, but still

Simon Sapin (Jun 11 2019 at 08:55, on Zulip):

There’s https://github.com/servo/servo/tree/master/components/hashglobe, with a weirdly condescending readme

Ryan Levick (Jun 11 2019 at 08:57, on Zulip):

Hmmm... surprised no one has tried to come up with a crate that addresses this. Sounds like something the embedded working group would need a solution in the short term for (while waiting for the std implementation to stablize).

Ryan Levick (Jun 11 2019 at 08:59, on Zulip):

And RawVec is also unstable so yea... there would be quite a lot of copy/pasting...

Simon Sapin (Jun 11 2019 at 09:00, on Zulip):

@Ryan Levick “No one” has tried? We just just pointed out two crates…

Ryan Levick (Jun 11 2019 at 09:01, on Zulip):

Well I mean crates without the ominous "don't use this" warnings

Simon Sapin (Jun 11 2019 at 09:01, on Zulip):

https://github.com/servo/servo/blob/master/components/fallible/lib.rs is another. It adds Vec::try_push on stable

Simon Sapin (Jun 11 2019 at 09:01, on Zulip):

hashbrown has try_reserve on stable: https://docs.rs/hashbrown/0.4.0/hashbrown/struct.HashMap.html#method.try_reserve

Simon Sapin (Jun 11 2019 at 09:01, on Zulip):

we should probably move firefox/servo to that instead of hashglobe

Ryan Levick (Jun 11 2019 at 09:03, on Zulip):

Awesome - so it seems like maybe it wouldn't be to terribly hard to collect these implementations into a crate that provides extension traits for all std lib collections

Simon Sapin (Jun 11 2019 at 09:07, on Zulip):

In the hashbrown case the idea would be to use it instead of std HashMap

Mike Hommey (Jun 11 2019 at 09:07, on Zulip):

we should probably move firefox/servo to that instead of hashglobe

that, or move hashglobe to std::alloc::alloc

Mike Hommey (Jun 11 2019 at 09:08, on Zulip):

which servo's fallible module would need anyways

Simon Sapin (Jun 11 2019 at 09:08, on Zulip):

@Mike Hommey You mean stabilizing https://github.com/rust-lang/rust/issues/48043 ? IIRC that’s blocked on deciding the error type, but that’s probably solvable

Mike Hommey (Jun 11 2019 at 09:09, on Zulip):

No, I mean using std::alloc::alloc and std::alloc::realloc instead of the custom implementations in hashglobe

Simon Sapin (Jun 11 2019 at 09:12, on Zulip):

ah ok

Ryan Levick (Jun 11 2019 at 09:13, on Zulip):

Am I wrong in thinking that because std::alloc::{alloc,realloc} are stable and all collections expose their capacities and reallocate in known ways, it should be possible to create extensions for all collection types in a stable way?

Ryan Levick (Jun 11 2019 at 09:17, on Zulip):

In other words, there are well defined ways to allocate on behalf of a collection and because of this it makes it relatively trivial to check if that allocation failed (i.e., alloc or realloc returned a null pointer).

Mike Hommey (Jun 11 2019 at 09:18, on Zulip):

for collections that allow to take them apart, yes. That's what the mp4parse_fallible crate does, albeit not using std::alloc::{alloc, realloc} (filed https://github.com/mozilla/mp4parse_fallible/issues/5)

Mike Hommey (Jun 11 2019 at 09:20, on Zulip):

Vec has as_mut_ptr and from_raw_parts, which would allow that. HashMap doesn't.

Ryan Levick (Jun 11 2019 at 09:36, on Zulip):

Hmm... I wonder if there's a way to easily translate between a hashbrown HashMap and the std HashMap without reallocating since they're technically the same thing.... That way you could take advantage of hashbrown's ability to reallocate in a failable way

Simon Sapin (Jun 11 2019 at 09:38, on Zulip):

There’s an easy way: transmute. There is no good way IMO, std’s HashMap deliberately keeps its internal layout private.

Simon Sapin (Jun 11 2019 at 09:39, on Zulip):

But if you’re taking on a dependency on crates.io’s hashbrown anyway, why not use it instead of std’s HashMap and avoid conversions completly?

Ryan Levick (Jun 11 2019 at 09:39, on Zulip):

Yea, I don't know enough about the gurantees that hashbrown and std HasMap have to know if they can easily share buffers (so you don't have to rely on transmute)

Ryan Levick (Jun 11 2019 at 09:40, on Zulip):

simply for type capbility - if user's of the crate can extend std HashMap and use it instead of having to force users of the crate to accept hasbrown as an explicit dependency

Ryan Levick (Jun 11 2019 at 09:41, on Zulip):

presumably one could try to crate a crate that will eventually just be a thin veil over the newly stablized try_* methods

Ryan Levick (Jun 11 2019 at 09:42, on Zulip):

you can't do that if you explictly expose hasbrown as part of the public API

Ryan Levick (Jun 11 2019 at 11:17, on Zulip):

@Simon Sapin do you know specifically what is preventing stabilization of failable allocation APIs in collections?

Simon Sapin (Jun 11 2019 at 11:37, on Zulip):

@Ryan Levick I think it’s mostly the error type

Simon Sapin (Jun 11 2019 at 11:39, on Zulip):

The Layout of the allocation that failed should probably be in there

Simon Sapin (Jun 11 2019 at 11:43, on Zulip):

And, for when containers containers become generic over the allocator type, do we want the Alloc trait to have an associated error type? If so, CollectionAllocErr should probably be generic to contain that error type (together with the Layout). Or at least have a path so we can add that type parameter without breakage

Ryan Levick (Jun 11 2019 at 11:51, on Zulip):

Do you see any reason that an rfc shouldn't be able to decide this? ie are there any external issues that would prevent the discussion for reaching a conclusion? If not I might take this on

Simon Sapin (Jun 11 2019 at 12:07, on Zulip):

It depends. I think the three options are :

1. Propose to block try_reserve on the resolution of https://github.com/rust-lang/wg-allocators/issues/23 (should allocators be able to customize the error type?)
2. Propose not to block, and propose a design for CollectionAllocErr that allows both outcomes. (Where it it’s not generic at first, but can be generalized to a generic type that contains a custom allocator-type-specific error.)
3. Propose not to block, and that CollectionAllocErr is not and won’t be generic. try_reserve will simply ignore the custom error value return by the allocator.

Ryan Levick (Jun 11 2019 at 12:18, on Zulip):

OK those choices seem largely independent of other concerns. It seems like something that could be worth pursuing. Does anyone in the wg want to sponsor the effort (ie make themselves available for questions, reviews, etc)?

Simon Sapin (Jun 11 2019 at 12:31, on Zulip):

https://github.com/rust-lang/rust/issues/48043 is the place to discuss try_reserve

Simon Sapin (Jun 11 2019 at 12:57, on Zulip):

@Ryan Levick I wrote up a proposal https://github.com/rust-lang/rust/issues/48043#issuecomment-500828346

Simon Sapin (Jun 11 2019 at 12:57, on Zulip):

I basically picked # 2 above

Ryan Levick (Jun 11 2019 at 13:02, on Zulip):

Awesome thanks for pushing it forward! Having little context, #2 does sound ideal given its the most flexible and doesn't have any immediatly obvious large tradeoffs

Ryan Levick (Jun 11 2019 at 13:04, on Zulip):

Not sure how I feel about the name bikesheding. It feels right to open the possibility of having this serve box and potentially other smart pointers in the future but it does open the problem space up more. I wonder how bad it would feel to eventually have a similar error type outside of collections for smart pointers

Simon Sapin (Jun 11 2019 at 13:04, on Zulip):

(Re-reading through the tracking issue reminded me that it doesn’t have to be an enum, so that makes things easier)

Simon Sapin (Jun 11 2019 at 13:06, on Zulip):

It would likely be the exact same type, would would feel pretty bad to me. Renaming now sounds easier than introducing a new name later and deprecating the old one after it has been stabilized

Ryan Levick (Jun 11 2019 at 13:07, on Zulip):

OK can it live in the alloc crate? That seems like an obvious choice

Ryan Levick (Jun 11 2019 at 13:07, on Zulip):

And why ContainerAllocError and not AllocError?

Simon Sapin (Jun 11 2019 at 13:09, on Zulip):

We already have AllocErr which is used in the return types of methods of the Alloc trait. This one does not contain a Layout, because the immediate caller already knows what layout it just passed. Making it zero-size allows Result<NonNull<_>, AllocErr> to be pointer-sized.

Simon Sapin (Jun 11 2019 at 13:10, on Zulip):

We don’t so far export anything other than modules (and macros) at the root of standard library crates. So even in the alloc crate, we need to pick (or make) a module for this to live in

Simon Sapin (Jun 11 2019 at 13:11, on Zulip):

Maybe the alloc module

Ryan Levick (Jun 11 2019 at 13:11, on Zulip):

Yea that's what I had in mind

Ryan Levick (Jun 11 2019 at 13:12, on Zulip):

And the existence of an AllocErr that essentially is the same thing but does not contain layout information is a bit unfortunate. I think putting Container in front doesn't properly specify what is different about this type

Simon Sapin (Jun 11 2019 at 13:52, on Zulip):

There’s also the capacity-overflow case, where the container didn’t even go as far as calling alloc()

Simon Sapin (Jun 11 2019 at 13:52, on Zulip):

(or realloc())

Simon Sapin (Jun 11 2019 at 14:00, on Zulip):

So the two use cases are: an API for an allocator itself. Layout is a parameter of that API, so it doesn’t need to be in the return value. And a fallible method for some other library that happens to do allocation internally. It’s the library that does layout computation. This computation can overflow, but if it doesn’t and the allocation fails, then users want to know the Layout.

Simon Sapin (Jun 11 2019 at 14:02, on Zulip):

So yes, maybe Container is not the right word. But we do need two different error types. @Ryan Levick, do you have a naming suggestion?

Tom Phinney (Jun 11 2019 at 14:23, on Zulip):

AllocErrWithLayout would be self-descriptive, or AllocAndLayoutErr if Err must be the suffix?

Simon Sapin (Jun 11 2019 at 14:25, on Zulip):

But it sometimes contains a Layout

Simon Sapin (Jun 11 2019 at 14:26, on Zulip):

(Either in one of its own variants by being an enum, or by having an Option<Layout> field)

Ryan Levick (Jun 11 2019 at 17:18, on Zulip):

That's tough. I don't have any good suggestions. Maybe: AllocationRequestErr?

Last update: Nov 15 2019 at 10:15UTC