Stream: general

Topic: re-visiting placement new


Jake Goulding (Jan 23 2020 at 00:32, on Zulip):

Do any recent changes (MaybeUninit, raw references, maybe even Pin?) make implementing placement new more feasible nowadays?

nagisa (Jan 23 2020 at 00:37, on Zulip):

Not really. What we need is an optimisation that guarantees construction of data into a location, and once we have that you wouldn't need a separate function for this.

Jake Goulding (Jan 23 2020 at 00:40, on Zulip):

Would such an optimization work in all cases, e.g. even for collections?

nagisa (Jan 23 2020 at 00:47, on Zulip):

It would have to if we wanted placement-in.

nagisa (Jan 23 2020 at 00:48, on Zulip):

The original placement in just relied on having a *mut T to put a T into

nagisa (Jan 23 2020 at 00:48, on Zulip):

the problem is that forcing an arbitrary expression to put its data directly into the pointer rather than onto stack is… non-trivial

nagisa (Jan 23 2020 at 00:49, on Zulip):

i.e. the problem is the right hand side, not the left hand one.

centril (Jan 23 2020 at 00:50, on Zulip):

(I think "guaranteed optimization" is sort of a contradiction in terms. Optimizations are definitionally about improving code in some ways while preserving as-if. If as-if is not preserved it is not an optimization.)

nagisa (Jan 23 2020 at 00:50, on Zulip):

right.

centril (Jan 23 2020 at 00:51, on Zulip):

Whatever guarantees we make have to, in my view, be backed up by operational semantics, but should also be simple enough to specify (for the spec) and understand (for users).

nagisa (Jan 23 2020 at 00:51, on Zulip):

rather than it being an optimisation, we need a way to guarantee that *t = EXPR puts its result into the location pointed to by the pointer, for arbitrary EXPRs.

nagisa (Jan 23 2020 at 00:52, on Zulip):

and that’s… not feasible AFAICT anyway

nagisa (Jan 23 2020 at 00:52, on Zulip):

without changing how EXPRs work, which is what C++ did.

centril (Jan 23 2020 at 00:53, on Zulip):

@nagisa my understanding is that C++'s guarantees were not operational either

nagisa (Jan 23 2020 at 00:55, on Zulip):

I can’t think of an example that would disprove it, tbh

centril (Jan 23 2020 at 00:55, on Zulip):

disprove what?

nagisa (Jan 23 2020 at 00:56, on Zulip):

Disprove that their guarantees are operational.

nagisa (Jan 23 2020 at 00:56, on Zulip):

unless we’re talking about different things.

nagisa (Jan 23 2020 at 00:57, on Zulip):

I mean, yes it is true that C++ requires the programmer to write code in a very particular way if they want the guarantee, but once they do it, they have the guarantee.

nagisa (Jan 23 2020 at 00:57, on Zulip):

(whether people write their code that way is a different question altogether)

centril (Jan 23 2020 at 01:08, on Zulip):

At least for Rust, we would need a way to observe the difference with and without the guarantee

simulacrum (Jan 23 2020 at 01:45, on Zulip):

Well, box [0; 1_000_000] is going to blow your stack, but with placement new it won't, right?

simulacrum (Jan 23 2020 at 01:45, on Zulip):

but maybe not the best example :)

centril (Jan 23 2020 at 02:00, on Zulip):

Does Miri have a concept of a stack in this sense?
Using "stack overflow" as the observable distinction would also make this guarantee subject to the things already in the stack (in memory) and the size of the new allocation

simulacrum (Jan 23 2020 at 02:05, on Zulip):

I'm not sure why being able to observe something (in the sense of being able to nail down that observation) would need to be in the spec. Loosely, I would expect that placement construction boils down to "space is not allocated by the compiler" or something along those lines -- i.e., that the compiler is not permitted to place values outside the region provided during the construction of the object.

simulacrum (Jan 23 2020 at 02:05, on Zulip):

That might be too strict, I'm not sure.

centril (Jan 23 2020 at 02:58, on Zulip):

I don't think the spec should be specified in terms of a compilation model. Even the word "compiler" is suspect in the spec imo.

Lokathor (Jan 23 2020 at 08:04, on Zulip):

It seems like a lot of it can be half-solved with MaybeUninit if you're willing to write code that sticks to a convention

rkruppe (Jan 23 2020 at 10:46, on Zulip):

The convention needed for this is a very invasive and painful one: you can't use ordinary expressions or APIs to produce a significant amount of data, and instead have to rewrite all the code involved in producing those values so that it writes them to the destination piecewise. This is hardly a solution for the placement problem, similar (though to a lesser extent) to how "write assembly" is hardly a solution for missed optimizations.

Amanieu (Jan 23 2020 at 14:41, on Zulip):

C++'s guarantees are specifically about copy/move constructor elision: you can observe the difference because your constructor is not getting called. The C++ standard makes no guarantees on stack usage.

simulacrum (Jan 23 2020 at 14:47, on Zulip):

I wonder if there's some tie-in to Pin's guarantees about movement that we could utilize. Nothing springs to mind though.

pnkfelix (Jan 24 2020 at 02:40, on Zulip):

I wonder if there is a way to specify it operationally via an asymptotic space efficiency argument, a la Clinger 1998. Unfortunately I don’t know if he still has a public copy on a website

simulacrum (Jan 24 2020 at 02:41, on Zulip):

@pnkfelix https://citeseer.ist.psu.edu/viewdoc/download;jsessionid=0FA23CCE7E9A17539134E7CFCD0CC6A0?doi=10.1.1.50.4500&rep=rep1&type=pdf?

pnkfelix (Jan 24 2020 at 02:42, on Zulip):

Good work!

Lokathor (Jan 24 2020 at 07:18, on Zulip):

@rkruppe I'm not clear on how you think it is so invasive. Maybe you're not imagining the same thing as me?

You can get "pseudo placement new" with just additional methods added to exiting types. If you give all your types a function that takes &mut MaybeUninit<Self> and initializes it, then you can construct in place. Of course, the biggest drawback is that you can't access the fields of MaybeUninit at all easily. The second biggest drawback is that it's unsafe as all get out, so there's that. The third problem is of course that adding one method for each type you want to support "pseudo placement new" with doesn't scale very well. I'm not saying that a language level change isn't necessary, but I think that you can get a lot of the desired effect with MaybeUninit and/or zeroed().

However, I also haven't had to do much with the "don't let it touch the stack" problem. In the cases that I've faced personally, you can just alloc_zeroed and turn the pointer into a Boxed array and call it a day. So maybe I don't even understand the problem properly.
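A minimal sketch of the convention Lokathor describes, assuming a hypothetical Big type and init_in_place method (neither is an existing API), might look roughly like this:

use std::mem::MaybeUninit;
use std::ptr;

struct Big {
    header: u32,
    payload: [u8; 1024],
}

impl Big {
    // Hypothetical convention: initialize `Self` directly in
    // caller-provided storage rather than returning it by value.
    fn init_in_place(slot: &mut MaybeUninit<Self>) {
        let p = slot.as_mut_ptr();
        // Write each field through a raw reference; note the compiler
        // still gives no hard guarantee about intermediate temporaries.
        unsafe {
            ptr::addr_of_mut!((*p).header).write(42);
            ptr::addr_of_mut!((*p).payload).write([0; 1024]);
        }
    }
}

fn main() {
    let mut slot = MaybeUninit::<Big>::uninit();
    Big::init_in_place(&mut slot);
    // Safety: every field of `Big` was written by `init_in_place`.
    let big = unsafe { slot.assume_init_ref() };
    assert_eq!(big.header, 42);
}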

rkruppe (Jan 24 2020 at 11:22, on Zulip):

No, that's pretty much what I mean too. Except it isn't just one method per type, it's extra code per "way to move a value". For example, if you want to have two Vec<HugeStruct> and want to take the last element of one Vec and push it onto the other Vec with minimal moves (and stack usage), you don't just need a placement-aware variant of Vec::push, you also need to reimplement Vec::pop to place the popped value into the destination directly. Open-coded that might look something like this:

vec1.push_emplace(|dest| unsafe {
    // `copy_from` takes a source pointer and an element count.
    dest.as_mut_ptr().copy_from(vec2.last_mut().unwrap(), 1);
    vec2.set_len(vec2.len() - 1);
});

In this example the blowup is not so terrible, but in general if you have a big(complicated(expression())) resulting in a large type T, you may end up having to rewrite all code transitively used by that expression to have an out-argument instead. Sometimes that is unavoidable anyway, but in many cases the compiler ought to be able to generate the right code by just writing the results of expressions directly into their eventual destination without any temporaries.
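For concreteness, here is one way the hypothetical push_emplace used above could look; it is not a real Vec method, just a sketch of such an API as an extension trait:

use std::mem::MaybeUninit;

// Hypothetical extension trait; `push_emplace` is not part of std.
trait PushEmplace<T> {
    // Reserve one slot and let `init` write the new element directly
    // into it; `init` must fully initialize the slot before returning.
    unsafe fn push_emplace<F>(&mut self, init: F)
    where
        F: FnOnce(&mut MaybeUninit<T>);
}

impl<T> PushEmplace<T> for Vec<T> {
    unsafe fn push_emplace<F>(&mut self, init: F)
    where
        F: FnOnce(&mut MaybeUninit<T>),
    {
        self.reserve(1);
        let len = self.len();
        // The spare capacity begins right after the initialized elements.
        init(&mut self.spare_capacity_mut()[0]);
        // Safety: the caller's `init` promised to initialize the slot.
        unsafe { self.set_len(len + 1) };
    }
}

The open-coded example above roughly corresponds to a call through such a trait (modulo the unsafe blocks the caller would need).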

RalfJ (Feb 01 2020 at 16:19, on Zulip):

I am not sure if "operational guarantees" is the right benchmark here... sounds to me more like designing an opsem that is simple to compile efficiently. that's like how we don't guarantee that i + 1 is compiled to an addition instruction (plus potential overflow check) instead of something ridicolously inefficient.

centril (Feb 02 2020 at 16:14, on Zulip):

Depends on whether what is sought is best-effort optimizations like any others, or stability guarantees. The latter would, in my view, require something operational, whereas the former is "just" a quality of implementation which the compiler team can tweak (and regress, improve again, ...) as they see fit.

Lokathor (Feb 02 2020 at 18:40, on Zulip):

The people who have spoken to me about this being a requirement at all usually speak as if it's a hard requirement. If you use the Placement New then the generated code must have the transformation applied.

Of course they could be exaggerating, but I think people want an absolute assurance.

nagisa (Feb 02 2020 at 19:43, on Zulip):

Yes, otherwise the code in question will just hit stack overflow 100% of the time.

nagisa (Feb 02 2020 at 19:44, on Zulip):

ain’t good enough if applying this pattern would just reduce the chance you get an overflow

Elichai Turkel (Feb 03 2020 at 10:06, on Zulip):

Well, box [0; 1_000_000] is going to blow your stack, but with placement new it won't, right?

well, this already works now (even when adding a zero).

I agree with @centril: I don't know of any language that promises that placement new will construct directly on the heap, or even that it will be on the heap at all.
I think we should do our best to have the compiler actually optimize it to construct directly on the heap (ideally even without the box syntax), but not promise that.
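For reference, the stable workaround Lokathor and Elichai allude to for the zeroed case can be sketched like this; the "no stack temporary" part is a behaviour of today's standard library, not a language guarantee:

fn main() {
    // `vec![0; n]` with an all-zero element goes through a zeroed heap
    // allocation in current std implementations, so no million-element
    // stack temporary is created; this is a quality-of-implementation
    // detail, not a language promise.
    let big: Box<[u32]> = vec![0u32; 1_000_000].into_boxed_slice();
    assert_eq!(big.len(), 1_000_000);
}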

Wodann (Feb 03 2020 at 13:37, on Zulip):

Placement new (in C++) merely allows you to instantiate an object directly in the provided memory location. As such this is a hard guarantee that no additional memory allocations will be used (no matter whether the provided memory location is on the heap or on the stack). [search for placement new in https://en.cppreference.com/w/cpp/language/new ]

kennytm (Feb 03 2020 at 13:56, on Zulip):

IMO if you can guarantee that box Struct { fields }, box Struct(fields) (including Enum::Variant(fields)) and box [x; n] will not blow up the stack, it already covers 99% of the use cases of placement new

nagisa (Feb 03 2020 at 14:04, on Zulip):

box is effectively what the old placement new was

nagisa (Feb 03 2020 at 14:05, on Zulip):

As such it shares all of the caveats.

kennytm (Feb 03 2020 at 14:07, on Zulip):

you know how it should generalize to <- :upside_down: I don't think it's worth the effort to guarantee it for arbitrary exprs

kennytm (Feb 03 2020 at 14:08, on Zulip):

if someone writes box f(x) (HEAP <- f(x)), I don't think we can avoid occupying the stack

nagisa (Feb 03 2020 at 14:10, on Zulip):

We could guarantee it at least for extern "Rust" stuff if we adapt the ABI (e.g. to always make the caller allocate the location for return values)

nagisa (Feb 03 2020 at 14:11, on Zulip):

but if we do that, we no longer need a separate syntax for this

kennytm (Feb 03 2020 at 14:22, on Zulip):

but if we do that, we no longer need a separate syntax for this

you mean vec.push(f(x)) would be made equivalent to vec <- f(x) with such ABI?

nagisa (Feb 03 2020 at 14:57, on Zulip):

well, not that.

nagisa (Feb 03 2020 at 14:57, on Zulip):

because that introduces a semantic difference in operation

nagisa (Feb 03 2020 at 14:57, on Zulip):

/me takes everything back and throws it into the pit

nagisa (Feb 03 2020 at 14:58, on Zulip):

but that ABI particularity is a precondition for vec <- f(x) to work reliably if at all

rkruppe (Feb 03 2020 at 18:32, on Zulip):

The ABI changes necessary to make place <- f(x) work as intended (modulo whatever happens inside of f) are actually another good reason to not pursue a strict guarantee of "no moves" (however this is formalized) but treat it as a QoI matter. A function returning e.g. an int should obviously be able to return it by value. Direct emplacement of return values is only useful for large types, where "large" is somewhat vague but certainly past the point where a "write the result through this out-pointer" ABI is desired anyway => we don't really need ABI changes, just RVO exploiting the existing ABI.
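A hand-written illustration of the "write the result through this out-pointer" shape; the Big type and make_big_into function are made up for the example:

use std::mem::MaybeUninit;

struct Big {
    data: [u8; 4096],
}

// Manual stand-in for a caller-allocated return slot: instead of
// `fn make_big() -> Big`, the callee writes into storage provided by
// the caller. This pins down where the result ends up, but the
// `Big { .. }` expression itself may still go through a temporary,
// which is exactly the right-hand-side problem discussed earlier.
fn make_big_into(out: &mut MaybeUninit<Big>) {
    out.write(Big { data: [0; 4096] });
}

fn main() {
    let mut slot = MaybeUninit::<Big>::uninit();
    make_big_into(&mut slot);
    // Safety: `make_big_into` fully initialized `slot`.
    let big = unsafe { slot.assume_init() };
    assert_eq!(big.data[0], 0);
}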

rkruppe (Feb 03 2020 at 18:37, on Zulip):

More generally, making this a QoI matter and being good enough at it (much more consistent than we are today) shouldn't really make a difference for people worried about blowing up their stack: many other things that affect stack usage (e.g. amount of inlining, effectiveness of stack coloring, spills during register allocation) are also not guaranteed, but in practice these are rarely pathological enough to cause stack overflows and when they do it's often due to bugs we'd want to fix anyway. I'd like us to get to the same point with carefully-written Rust: if you adhere to a few conventions while constructing large objects in otherwise natural ways (i.e., not rewriting everything to have explicit out-pointers), the stack should not blow up unless a whole sequence of very unlikely things happened.

RalfJ (Feb 03 2020 at 20:39, on Zulip):

The people who have spoken to me about this being a requirement at all usually speak as if it's a hard requirement. If you use the Placement New then the generated code must have the transformation applied.

Of course they could be exaggerating, but I think people want an absolute assurance.

I bet the same people say it is a hard requirement that i+j doesn't use up 1MB of stack space

RalfJ (Feb 03 2020 at 20:40, on Zulip):

and yet our spec says nothing like that, and it would be very hard to make it do that (let's get a spec first before making it awesome like that)

RalfJ (Feb 03 2020 at 20:40, on Zulip):

so, I don't see how focusing on a "hard operational requirement" is helping in the slightest here

RalfJ (Feb 03 2020 at 20:40, on Zulip):

what matters is that in practice, the compiler will never make the stack blow

RalfJ (Feb 03 2020 at 20:41, on Zulip):

just like in practice, the compiler won't do ridiculous things with additions

RalfJ (Feb 03 2020 at 20:41, on Zulip):

I agree though that "the compiler can probably optimize this" is not good enough -- it needs to be structural, as easy as it is to compile addition. that's why I spoke of designing an opsem that can be efficiently compiled (in all cases).

RalfJ (Feb 03 2020 at 20:45, on Zulip):

I'd like us to get to the same point with carefully-written Rust: if you adhere to a few conventions while constructing large objects in otherwise natural ways (i.e., not rewriting everything to have explicit out-pointers), the stack should not blow up unless a whole sequence of very unlikely things happened.

that still sounds somewhat unsatisfying though... something where no optimizations are needed but the structure of the code is such that in-place init is the natural way to compile things would be better, IMO. but this is an outsider opinion, I haven't delved into all the hard questions here (nor do I have time to do that)

rkruppe (Feb 03 2020 at 20:50, on Zulip):

I think we're actually in agreement, just approaching it from different directions. I'm also hedging my wording because no matter what we do, there will always be expressions that require huge temporaries to evaluate (but this should be predictable from examining the expression and the functions it calls), and there will also be expressions that run out of stack space despite being compiled in a way that does not involve any huge temporary (but e.g. a lot of small stack allocations that add up).

RalfJ (Feb 04 2020 at 14:52, on Zulip):

@rkruppe

I think we're actually in agreement, just approaching it from different directions

I can believe that :)

Charles Lew (Feb 06 2020 at 15:38, on Zulip):

I want to imagine there's a macro called init!, let me show it by example:
assume we have

struct S { a: u32 }
struct T { s: S, t: [u32; 5] }
fn foo(x: bool) -> u32 { ... }

when there's a let mut v = MaybeUninit::uninit();, we can call
init!(v, T { s: S { a: f1 }, t: [f2; 5] }, f1 = foo(true), f2 = foo(false)); to fully initialize it.

the first parameter is a MaybeUninit<T>, the second parameter describes the "structure" of the init value, and the remaining parameters are named parameters that get evaluated on the stack.

After this we can wrap it into a macro box!(S {a: f1}, f1 = foo(true)) that provides the functionality of placement new and deprecate the box operator :)
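The init! macro is hypothetical; by hand, the kind of expansion it might produce for the example above could look roughly like this (reusing the S, T, and foo names from the message, with a stand-in body for foo, and writing fields through raw references so no complete T temporary is named):

use std::mem::MaybeUninit;
use std::ptr;

struct S { a: u32 }
struct T { s: S, t: [u32; 5] }

// Stand-in body; the original message left `foo` unspecified.
fn foo(x: bool) -> u32 { if x { 1 } else { 2 } }

fn main() {
    let mut v = MaybeUninit::<T>::uninit();
    // Evaluate the named parameters first, as the proposed macro would...
    let f1 = foo(true);
    let f2 = foo(false);
    // ...then write each field of `T` in place through raw references.
    let p = v.as_mut_ptr();
    unsafe {
        ptr::addr_of_mut!((*p).s.a).write(f1);
        ptr::addr_of_mut!((*p).t).write([f2; 5]);
    }
    // Safety: every field of `T` has been initialized above.
    let t = unsafe { v.assume_init() };
    assert_eq!(t.s.a, 1);
}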
