Stream: t-compiler/wg-mir-opt

Topic: turn on const prop by default


Wesley Wiser (Nov 06 2019 at 11:22, on Zulip):

Hi @oli, I hope you had a good vacation!

Now that ConstProp is in a bit better state, I think we should consider turning it on by default (it's currently gated under -Z mir-opt-level=2). I did a perf run recently and the numbers looked pretty decent https://github.com/rust-lang/rust/pull/66074

What do you think? There's a few changes left (some I have locally and some yet to do) before I think ConstProp is "done" but perhaps it's time to start talking about running it by default.

oli (Nov 06 2019 at 11:22, on Zulip):

:wave:

oli (Nov 06 2019 at 11:23, on Zulip):

I just did a double take

oli (Nov 06 2019 at 11:24, on Zulip):

"pretty decent" is a very humble way to describe this

oli (Nov 06 2019 at 11:24, on Zulip):

wow

Wesley Wiser (Nov 06 2019 at 11:24, on Zulip):

The cte stress tests are major outliers but yeah the rest looks good IMO

Wesley Wiser (Nov 06 2019 at 11:24, on Zulip):

:slight_smile:

oli (Nov 06 2019 at 11:25, on Zulip):

yea we should totally turn them on if optimizations are enabled. Though not in debug mode. I'm not sure how much mir optimizations and llvm optimizations are entangled

Wesley Wiser (Nov 06 2019 at 11:26, on Zulip):

Is your concern mangling debuginfo?

Wesley Wiser (Nov 06 2019 at 11:27, on Zulip):

(Debug perf.rlo results also show improvement so I'd love to get those compile time improvements as well)

oli (Nov 06 2019 at 11:27, on Zulip):

partially, but I also don't think we're running any opportunistic optimizations in debug mode right now

oli (Nov 06 2019 at 11:27, on Zulip):

hmm that's true though

Wesley Wiser (Nov 06 2019 at 11:27, on Zulip):

Yeah

Wesley Wiser (Nov 06 2019 at 11:28, on Zulip):

I think ConstProp is relatively safe in terms of the debugging experience because we don't (currently anyway) propagate away function calls

oli (Nov 06 2019 at 11:28, on Zulip):

I could create a compiler team FCP for it

oli (Nov 06 2019 at 11:29, on Zulip):

also we just prop temporaries

oli (Nov 06 2019 at 11:29, on Zulip):

so user code is not affected very much

Wesley Wiser (Nov 06 2019 at 11:29, on Zulip):

So beyond looking at the disassembly, the user shouldn't really be able to tell if, for example, let x = 2 + 2 is happening at runtime or compile time.

oli (Nov 06 2019 at 11:29, on Zulip):

jup

Wesley Wiser (Nov 06 2019 at 11:29, on Zulip):

We could do either of those yeah

oli (Nov 06 2019 at 11:30, on Zulip):

why did you choose to not prop pointers?

Wesley Wiser (Nov 06 2019 at 11:30, on Zulip):

heh

Wesley Wiser (Nov 06 2019 at 11:32, on Zulip):

So when I introduced ConstPropMachine, we lost the code that handles interning Memory. It happens to work ok for the mir-opt tests, probably because the tests are small and compilation is deterministic enough to end up with AllocIds from ConstPropMachine matching up exactly with the same ones from CompileTimeInterpreter. However, if I just switch on const prop without that check in should_const_prop, the bootstrap will ICE when compiling core.

Wesley Wiser (Nov 06 2019 at 11:32, on Zulip):

So I added that code to skip pointers to avoid the ICE

oli (Nov 06 2019 at 11:35, on Zulip):

oh, interesting. This could probably be resolved by interning all such constants before throwing away the InterpCx

oli (Nov 06 2019 at 11:35, on Zulip):

orthogonal though

oli (Nov 06 2019 at 11:36, on Zulip):

ok, let's go with integer-only const prop enabled even in debug mode for now and see how the rest of T-compiler likes it

oli (Nov 06 2019 at 11:36, on Zulip):

can you leave a comment explaining the no-pointer-strategy in the source code?

Wesley Wiser (Nov 06 2019 at 11:36, on Zulip):

Sure, yeah. That was just a throw-away commit to get the perf results.

Wesley Wiser (Nov 06 2019 at 11:37, on Zulip):

How does compiler team FCP work? Is that something we trigger on the PR?

oli (Nov 06 2019 at 11:40, on Zulip):

yea

Wesley Wiser (Nov 06 2019 at 11:40, on Zulip):

Ok, I'm adding some comments to that commit now and I'll push in a minute.

Wesley Wiser (Nov 06 2019 at 11:40, on Zulip):

Are you concerned about the slight regression in the codegen tests?

Wesley Wiser (Nov 06 2019 at 11:41, on Zulip):

I have some local changes that actually improve those to the point NO-OPT is equivalent to the other optimized cases.

oli (Nov 06 2019 at 11:51, on Zulip):

huh, why did no-opt optimize to 8 before this PR anyway?

Wesley Wiser (Nov 06 2019 at 11:53, on Zulip):

It only did in the test for inline(always)

Wesley Wiser (Nov 06 2019 at 11:53, on Zulip):

I think it's an artifact related to inlining an optimized function

oli (Nov 06 2019 at 12:00, on Zulip):

oh lol, llvm already const props in debug mode, you're right

oli (Nov 06 2019 at 12:00, on Zulip):

the inlining does the trick

oli (Nov 06 2019 at 12:00, on Zulip):

ok, so there's no reason not do do const prop in debug mode

oli (Nov 06 2019 at 12:01, on Zulip):

llvm does it, so we should, too

Wesley Wiser (Nov 06 2019 at 12:05, on Zulip):

Yeah, I think this change is also ok https://github.com/rust-lang/rust/pull/66074/files#diff-d3d378e46f2b79fa680365c64848f456R34

Wesley Wiser (Nov 06 2019 at 12:05, on Zulip):

lmk if you think otherwise

Wesley Wiser (Nov 06 2019 at 12:06, on Zulip):

I have a few commits locally that improve const prop and the simplify locals pass a bit more and actually get us to

// NO-OPT: ret i32 8

on these tests

oli (Nov 06 2019 at 12:25, on Zulip):

cool, we could merge that independently. About the %0.1 -> %3 changes, I can't figure out how the %3 actually happens, can you dump the entire llvm IR?

oli (Nov 06 2019 at 12:25, on Zulip):

I don't think the change is problematic, I just don't get what happened

Wesley Wiser (Nov 06 2019 at 13:43, on Zulip):

After:

define i32 @nothing() unnamed_addr #0 {
start:
  %_1 = alloca { i32, i8 }, align 4
  %0 = bitcast { i32, i8 }* %_1 to i32*
  store i32 4, i32* %0, align 4
  %1 = getelementptr inbounds { i32, i8 }, { i32, i8 }* %_1, i32 0, i32 1
  store i8 0, i8* %1, align 4
  %2 = bitcast { i32, i8 }* %_1 to i32*
  %3 = load i32, i32* %2, align 4
  ret i32 %3
}
oli (Nov 06 2019 at 13:48, on Zulip):

that seems like a pessimization :D

oli (Nov 06 2019 at 13:50, on Zulip):

oooh, I think I know what's going on. The ConstValue::ByRef of the tuple is not converted to an llvm tuple

oli (Nov 06 2019 at 13:51, on Zulip):

this seems like an easy fix in codegen that should improve all uses of such constants

Wesley Wiser (Nov 06 2019 at 14:10, on Zulip):

I don't really understand the cost model for LLVM instructions so I can't say. My thought was that before we actually did the add with overflow and branched on the result where as now we just put the correct values into variables and return the result from a variable load so it was probably faster?

oli (Nov 06 2019 at 14:26, on Zulip):

well. it's probably still cheaper than it was before, but the initial code we generate is a bit odd

oli (Nov 06 2019 at 14:27, on Zulip):

it computes pointers into the constant allocation and reads the tuple values out of it

oli (Nov 06 2019 at 14:27, on Zulip):

but llvm can work without an allocation by just creating a const tuple

oli (Nov 06 2019 at 14:27, on Zulip):

not sure what is better

oli (Nov 06 2019 at 14:27, on Zulip):

seems fine either way for the PR, but we should probably look into it

Wesley Wiser (Nov 06 2019 at 14:29, on Zulip):

oli: oooh, I think I know what's going on. The ConstValue::ByRef of the tuple is not converted to an llvm tuple

oli: this seems like an easy fix in codegen that should improve all uses of such constants

I've been wanting to learn more about LLVM so this sounds like an interesting project to me. Hopefully not too large? Do you think this would be something a newbie to LLVM could tackle?

Wesley Wiser (Nov 06 2019 at 14:31, on Zulip):

I need to check to be sure but I think some changes I have laying around my machine will change this to

define i32 @nothing() unnamed_addr #0 {
start:
  ret i32 8
}

but I will need to retest to be certain.

The main change is to allocate space for the MIR return value and allow propagating into that.

oli (Nov 06 2019 at 15:10, on Zulip):

I've been wanting to learn more about LLVM so this sounds like an interesting project to me. Hopefully not too large? Do you think this would be something a newbie to LLVM could tackle?

Yea this should be very self-contained. But I'm not sure that doing it is useless complexity considering that llvm optimizations will clean it up nicely

oli (Nov 06 2019 at 15:11, on Zulip):

we may be able to get rid of it by doing the field reading in const prop, too, so that may be simpler

Wesley Wiser (Nov 06 2019 at 15:18, on Zulip):

Ok, I'll work on upstreaming my other const prop changes first and we can see if it's still necessary after that.

RalfJ (Nov 06 2019 at 15:41, on Zulip):

Is your concern mangling debuginfo?

well we know debuginfo is broken for opt-level=2... https://github.com/rust-lang/rust/issues/66077

RalfJ (Nov 06 2019 at 15:42, on Zulip):

So I added that code to skip pointers to avoid the ICE

(about pointers)
so, uh, how reliable is that? contsprop results should obviously not be interned so I am not sure how that's related...
I guess I am asking, where is the ICE originating?

oli (Nov 06 2019 at 15:43, on Zulip):

contsprop results should obviously not be interned

that's not obvious to me

oli (Nov 06 2019 at 15:44, on Zulip):

if we have Foo { a: 42, b: 99, c: 52 } and we turn it into a single constant, why not intern its allocation?

RalfJ (Nov 06 2019 at 15:45, on Zulip):

I was thinking of temporary intermediate computations... hm

Wesley Wiser (Nov 06 2019 at 15:55, on Zulip):

Is your concern mangling debuginfo?

well we know debuginfo is broken for opt-level=2... https://github.com/rust-lang/rust/issues/66077

That's related to the MIR Inline pass which is still gated under -Z mir-opt-level=2 so I don't think it's really relevant to const prop.

so, uh, how reliable is that?

Should be totally reliable I believe unless I'm missing something :slight_smile:

https://github.com/rust-lang/rust/pull/66074/files#diff-9e103702275cbef342c2decd3395bf3bR627-R634

Wesley Wiser (Nov 08 2019 at 11:22, on Zulip):

Ok, I'll work on upstreaming my other const prop changes first and we can see if it's still necessary after that.

Looks like LLVM changes won't be necessary. With #66216, here's the LLVM ir that's generated:

define i32 @nothing() unnamed_addr #0 {
start:
  ret i32 4
}

:smiley:

Wesley Wiser (Nov 08 2019 at 11:22, on Zulip):

In debug, no optimization mode :tada:

oli (Nov 08 2019 at 11:23, on Zulip):

sweet

Wesley Wiser (Nov 11 2019 at 15:21, on Zulip):

Looks like most of the compiler team has signed off on turning on const prop by default. I don't know what the standard procedure for polls is. Should we wait until everyone signs off before merging?

oli (Nov 11 2019 at 17:43, on Zulip):

I think everyone but two is needed

Wesley Wiser (Nov 11 2019 at 17:45, on Zulip):

I suppose we're technically there but niko forgot to check his box off.

Wesley Wiser (Nov 11 2019 at 17:46, on Zulip):

(I'm assuming that's what @rfcbot reviewed means)

oli (Nov 11 2019 at 18:07, on Zulip):

jop

oli (Nov 11 2019 at 18:07, on Zulip):

I checked niko's box

oli (Nov 11 2019 at 18:07, on Zulip):

so

Wesley Wiser (Nov 11 2019 at 18:09, on Zulip):

I will rebase this tonight then and update the tests to reflect the final state so you can review.

Last update: Nov 17 2019 at 06:55UTC