Stream: t-compiler/wg-mir-opt

Topic: eddyb - niko sync


nikomatsakis (Sep 04 2019 at 13:08, on Zulip):

So I've been chatting a bit with @eddyb about MIR optimizations and I realized we should be holding that conversation here.

nikomatsakis (Sep 04 2019 at 13:09, on Zulip):

The context is that there have been a lot of complaints about intermediate memcpy's being a particular kind of poor codegen, particularly coming from the stylo/servo projects, and I'm trying to drill down to what the specific steps are for us to improve that situation.

eddyb (Sep 04 2019 at 13:10, on Zulip):

I've also talked to a few people who've been waiting for these optimizations for years

nikomatsakis (Sep 04 2019 at 13:10, on Zulip):

OK, so so far we identified that we need to improve MIR debuginfo -- and @eddyb has a draft PR that tries to introduce a new representation based on (name, scope, place) tuples -- but that work needs to be completed. This would enable us to track the name of user-given things through optimizations.

eddyb (Sep 04 2019 at 13:10, on Zulip):

apologies notwithstanding, it would be good to actually get them in

nikomatsakis (Sep 04 2019 at 13:10, on Zulip):

You were then talking about SROA?

nikomatsakis (Sep 04 2019 at 13:10, on Zulip):

Say a bit more about that

nikomatsakis (Sep 04 2019 at 13:11, on Zulip):

I guess by SROA you mean taking a MIR literal like Foo { ... } and converting it into individual wri.... oh, no, you mean promoting the fields of a stack-allocated literal to distinct variables, of course

eddyb (Sep 04 2019 at 13:11, on Zulip):

first off, the tuple is not as clear as it could be. the printed syntax is:

scope {
    name => place
}
nikomatsakis (Sep 04 2019 at 13:12, on Zulip):

ok

nikomatsakis (Sep 04 2019 at 13:12, on Zulip):

I'm not sure which is clearer, but that makes sense

nikomatsakis (Sep 04 2019 at 13:13, on Zulip):

I've also talked to a few people who've been waiting for these optimizations for years

it would be really good to get specific examples -- I think maybe you mentioned you have some floating around?

eddyb (Sep 04 2019 at 13:13, on Zulip):

the difference with SROA is that you are replacing local.x.y.z with local_x_y_z so if you want debuggers to still show a composite image of local, from all of the locals it has been split in

nikomatsakis (Sep 04 2019 at 13:13, on Zulip):

one of my fears -- I consider it basically a certainty -- is that we'll do this work and then find it's not helping the actual example programs

eddyb (Sep 04 2019 at 13:13, on Zulip):

you also need projections (although more limited) on the LHS

nikomatsakis (Sep 04 2019 at 13:13, on Zulip):

(which just means we'll have to figure out what the next blocker is)

eddyb (Sep 04 2019 at 13:13, on Zulip):

I have examples I was testing on, and which were working :D

eddyb (Sep 04 2019 at 13:13, on Zulip):
pub struct Newtype<T>(T);
pub struct Huge(u64);
pub fn foo() -> Newtype<Huge> {
    let mut x = Huge(0);
    drop(&mut x);
    Newtype(Newtype(x)).0
}
eddyb (Sep 04 2019 at 13:14, on Zulip):

I think this is the most simplified version

nikomatsakis (Sep 04 2019 at 13:14, on Zulip):

(maybe we can collect those examples at some point in the mir-opt working group directory)

nikomatsakis (Sep 04 2019 at 13:14, on Zulip):

yeah, I think what would prob be good is (a) a reduced example and (b) a link to the original source and benchmark if possible

nikomatsakis (Sep 04 2019 at 13:14, on Zulip):

but that's great

eddyb (Sep 04 2019 at 13:15, on Zulip):

I have presented some of these before, so I guess that means they never got collected :(

nikomatsakis (Sep 04 2019 at 13:15, on Zulip):

not afaik

nikomatsakis (Sep 04 2019 at 13:15, on Zulip):

I'd love to be wrong

simulacrum (Sep 04 2019 at 13:15, on Zulip):

there's the docs from the all hands that had at least some of these with annotations?

nikomatsakis (Sep 04 2019 at 13:15, on Zulip):

the difference with SROA is that you are replacing local.x.y.z with local_x_y_z so if you want debuggers to still show a composite image of local, from all of the locals it has been split in

how is this represented in your proposed debuginfo?

nikomatsakis (Sep 04 2019 at 13:15, on Zulip):

yeah! I forgot about those

eddyb (Sep 04 2019 at 13:16, on Zulip):

so that's what I was getting to, it needs extending with:

scope {
    name (.field)* => place
}
nikomatsakis (Sep 04 2019 at 13:16, on Zulip):

there's the docs from the all hands that had at least some of these with annotations?

I guess you mean this MIR 2.0 dropbox paper

eddyb (Sep 04 2019 at 13:16, on Zulip):

the codegen for this is easy once MIR variable debuginfo is implemented, since you just need to use the "slicing" feature of DWARF (which IIRC LLVM exposes)

nikomatsakis (Sep 04 2019 at 13:17, on Zulip):

so that's what I was getting to, it needs extending with:

ok, that makes sense

eddyb (Sep 04 2019 at 13:17, on Zulip):

in fact, I think LLVM itself does this in its own SROA (lol)

eddyb (Sep 04 2019 at 13:17, on Zulip):

technically NRVO only depends on MIR var debuginfo

eddyb (Sep 04 2019 at 13:17, on Zulip):

while SROA requires that extension

nikomatsakis (Sep 04 2019 at 13:17, on Zulip):

OK, so, maybe we can consider that a distinct step? i.e., land the debuginfo you described before, then extract with more complex names?

eddyb (Sep 04 2019 at 13:18, on Zulip):

but we need both optimizations for most of the interesting cases (SROA runs first, to strength-reduce locals for NRVO)

eddyb (Sep 04 2019 at 13:18, on Zulip):

yupp!

eddyb (Sep 04 2019 at 13:18, on Zulip):

that has been the plan for like a year

eddyb (Sep 04 2019 at 13:18, on Zulip):

it just hasn't happened for reasons

eddyb (Sep 04 2019 at 13:18, on Zulip):

the PR is https://github.com/rust-lang/rust/pull/56231

nikomatsakis (Sep 04 2019 at 13:19, on Zulip):

OK

eddyb (Sep 04 2019 at 13:19, on Zulip):

I decided today that the best way forward is to actually introduce the representation in that PR in rustc_codegen_ssa alone

eddyb (Sep 04 2019 at 13:19, on Zulip):

and refactor the mess that is MIR codegen debuginfo interactions

eddyb (Sep 04 2019 at 13:19, on Zulip):

and then do that PR

nikomatsakis (Sep 04 2019 at 13:19, on Zulip):

so that would mean starting over (for now, at least)

nikomatsakis (Sep 04 2019 at 13:19, on Zulip):

yeah

nikomatsakis (Sep 04 2019 at 13:20, on Zulip):

ok, so once debug info work lands, what blocks us from doing the SROA optimization?

eddyb (Sep 04 2019 at 13:20, on Zulip):

well, I have never implemented the codegen bits

nikomatsakis (Sep 04 2019 at 13:20, on Zulip):

right, I just mean that it'd be a fresh PR

nikomatsakis (Sep 04 2019 at 13:20, on Zulip):

starting from master

nikomatsakis (Sep 04 2019 at 13:20, on Zulip):

maybe stealing bits from the other, I don't know

eddyb (Sep 04 2019 at 13:21, on Zulip):

nothing blocks us, the SROA PR has been around for even longer

nikomatsakis (Sep 04 2019 at 13:21, on Zulip):

one thing I can imagine

nikomatsakis (Sep 04 2019 at 13:21, on Zulip):

well, we can land the PR

nikomatsakis (Sep 04 2019 at 13:21, on Zulip):

but what might block us from enabling it on stable is the lack of any performance monitoring story for our generated code

eddyb (Sep 04 2019 at 13:21, on Zulip):

NRVO has some philosophical (in the absence of a proof assistant) and performance concerns

nikomatsakis (Sep 04 2019 at 13:21, on Zulip):

nothing blocks us, the SROA PR has been around for even longer

link?

eddyb (Sep 04 2019 at 13:22, on Zulip):

also the work @tmandry has been doing with generators has highlighted a few things

eddyb (Sep 04 2019 at 13:22, on Zulip):

about the analysis I was doing for NRVO

nikomatsakis (Sep 04 2019 at 13:22, on Zulip):

ok, so NRVO would be kind of a "next step" after SROA?

eddyb (Sep 04 2019 at 13:22, on Zulip):

and nowadays I think I'd favor implementing several analyses and cross-checking them via crater (if only that ran faster...)

eddyb (Sep 04 2019 at 13:23, on Zulip):

yeah, SROA is an easy decision IMO, and we could right away run constprop after it

eddyb (Sep 04 2019 at 13:23, on Zulip):

if certain fields are constant but others are not etc.

eddyb (Sep 04 2019 at 13:23, on Zulip):

https://github.com/rust-lang/rust/pull/48300

nikomatsakis (Sep 04 2019 at 13:25, on Zulip):

OK

eddyb (Sep 04 2019 at 13:26, on Zulip):

I guess not getting a contract early last summer threw a wrench into my plans, ugh

eddyb (Sep 04 2019 at 13:26, on Zulip):

(looking at the date on that PR)

nikomatsakis (Sep 04 2019 at 13:26, on Zulip):

do we think that SROA alone would have an impact on any real use cases?

eddyb (Sep 04 2019 at 13:26, on Zulip):

doubtful but possible if LLVM gets confused on them

eddyb (Sep 04 2019 at 13:27, on Zulip):

oh, you know where it would help? debug mode

eddyb (Sep 04 2019 at 13:27, on Zulip):

since LLVM's wouldn't run

nikomatsakis (Sep 04 2019 at 13:27, on Zulip):

I'm skimming over the notes from the all hands

eddyb (Sep 04 2019 at 13:27, on Zulip):

(and since we preserve debuginfo, we can have it on by default)

nikomatsakis (Sep 04 2019 at 13:28, on Zulip):

seems like "inlining" is another key optimization

nikomatsakis (Sep 04 2019 at 13:28, on Zulip):

that might help with debug mode compilation

eddyb (Sep 04 2019 at 13:29, on Zulip):

I suggest we enable inlining by default with a strict no-branch/no-loop heuristic, and small body otherwise

nikomatsakis (Sep 04 2019 at 13:29, on Zulip):

I guess there are kind of 3 possible goals around MIR optimizations:

eddyb (Sep 04 2019 at 13:30, on Zulip):

one exception for branches would be them being based off arguments in a way that would get simplified

nikomatsakis (Sep 04 2019 at 13:30, on Zulip):

I suggest we enable inlining by default with a strict no-branch/no-loop heuristic, and small body otherwise

iirc from when @Wesley Wiser was looking into this, we encountered some bugs in the existing inliner, not sure if they've been fixed

eddyb (Sep 04 2019 at 13:30, on Zulip):

heh

eddyb (Sep 04 2019 at 13:30, on Zulip):

also the existing inliner can't preserve debuginfo wrt. closures I think?

nikomatsakis (Sep 04 2019 at 13:30, on Zulip):

obviously, that can be done. just trying to enumerate work items and possible goals.

Wesley Wiser (Sep 04 2019 at 13:30, on Zulip):

The Inliner needs some (more) work. I did some work on it last year and found a bunch of basic issues with it but it's still doing something bad and causes ICEs if you try to bootstrap with it turned on. (https://github.com/rust-lang/rust/issues/63802)

eddyb (Sep 04 2019 at 13:31, on Zulip):

my PR already can, I think it's reflected in one of the testcase

nikomatsakis (Sep 04 2019 at 13:31, on Zulip):

yeah, so that too is kind of wanting to be integrated with your representation

nikomatsakis (Sep 04 2019 at 13:31, on Zulip):

actually that's another question

nikomatsakis (Sep 04 2019 at 13:31, on Zulip):

can we generate the proper debuginfo for inlined things?

eddyb (Sep 04 2019 at 13:31, on Zulip):

do you think I should focus on MIR? seemes easier than trying to do a million things

eddyb (Sep 04 2019 at 13:32, on Zulip):

nikomatsakis: can we generate the proper debuginfo for inlined things?

with my PR, variable debuginfo. but, hmpf

eddyb (Sep 04 2019 at 13:32, on Zulip):

inlining would break location information, unless we can encode it somehow

nikomatsakis (Sep 04 2019 at 13:32, on Zulip):

do you think I should focus on MIR? seemes easier than trying to do a million things

what does "focus on MIR" mean

eddyb (Sep 04 2019 at 13:32, on Zulip):

I wonder if @mw has thought about this enough

nikomatsakis (Sep 04 2019 at 13:33, on Zulip):

inlining would break location information, unless we can encode it somehow

I mean LLVM does it somehow

eddyb (Sep 04 2019 at 13:33, on Zulip):

focus on MIR vs, say, incremental, or other stuff

nikomatsakis (Sep 04 2019 at 13:33, on Zulip):

at least, when I build with debuginfo on linux, I get stacktraces that are aware of inlining

nikomatsakis (Sep 04 2019 at 13:33, on Zulip):

I see. Yes, I think we should focus on MIR optimizations

eddyb (Sep 04 2019 at 13:33, on Zulip):

(so far my priorities have been pretty out of whack)

nikomatsakis (Sep 04 2019 at 13:33, on Zulip):

Certainly I think we should focus on one thing

nikomatsakis (Sep 04 2019 at 13:33, on Zulip):

And I think they're a good choice, in part because I think we have a lot of "almost there" parts, and I think (as you said) there's a lot of pent-up demand

eddyb (Sep 04 2019 at 13:34, on Zulip):

DWARF can represent inlining but I'm not sure how much LLVM exposes. some people have looked into it for macros and it's tricky

eddyb (Sep 04 2019 at 13:34, on Zulip):

one thing I forgot to bring up is I'm thinking more and more about VSDG, since it models so many things so well

eddyb (Sep 04 2019 at 13:34, on Zulip):

and, while I don't want to outright propose it, there is something (R)VSDG does that is kind of hilarious in the historical context of MIR

nikomatsakis (Sep 04 2019 at 13:35, on Zulip):

I'm torn in that I think we should think about using the best representations, but I also don't want to let perfect be the enemy of the good.

eddyb (Sep 04 2019 at 13:35, on Zulip):

(R)VSDG is structured control-flow while MIR is a desugaring of structured control-flow into CFGs

eddyb (Sep 04 2019 at 13:35, on Zulip):

so a discussion we should maybe have at some point is: what would the MIR borrowck look like if it had to work with a structural replacement for CFG?

nikomatsakis (Sep 04 2019 at 13:35, on Zulip):

(Especially since we have LLVM to do the "heavy lifting")

nikomatsakis (Sep 04 2019 at 13:36, on Zulip):

Yeah, that also intersects polonius somewhat

nikomatsakis (Sep 04 2019 at 13:36, on Zulip):

I don't think it would be a big issue

eddyb (Sep 04 2019 at 13:36, on Zulip):

(let's say it's still blocky with mutable variables like today, but the overall graph is not an arbitrary CFG)

nikomatsakis (Sep 04 2019 at 13:36, on Zulip):

For the most part, the borrow check is a dataflow computation

nikomatsakis (Sep 04 2019 at 13:36, on Zulip):

The main problem MIR was intended to solve was that we had "two sources of truth"

nikomatsakis (Sep 04 2019 at 13:36, on Zulip):

(in the context of the borrow check, I mean)

nikomatsakis (Sep 04 2019 at 13:37, on Zulip):

that is, we kind of "desugared" once for borrow check, and then again for codegen, and they were different

eddyb (Sep 04 2019 at 13:37, on Zulip):

the reason is that you can sort of treat (R)VSDG like a pure term DAG, intern it, etc.

eddyb (Sep 04 2019 at 13:38, on Zulip):

but OTOH we have a bunch of information like SourceInfo that would probably defeat any interning-like MIR storage optimization

nikomatsakis (Sep 04 2019 at 13:38, on Zulip):

I have to admit that this sounds like kind of a big distraction :)

eddyb (Sep 04 2019 at 13:38, on Zulip):

yes, but OTOH one of the reasons I keep doing things other than MIR is because MIR is a huge headache of an IR

nikomatsakis (Sep 04 2019 at 13:38, on Zulip):

i.e., it sounds like you're proposing replacing MIR with something wholly different

eddyb (Sep 04 2019 at 13:39, on Zulip):

well, not necessarily. maybe there is a small change we can make to make it easier to reason about

nikomatsakis (Sep 04 2019 at 13:39, on Zulip):

Hmm. The structures weren't designed for editing, and places are too complex, but other than that?

eddyb (Sep 04 2019 at 13:39, on Zulip):

reasoning about state on a CFG is hard

eddyb (Sep 04 2019 at 13:40, on Zulip):

you have to constantly keep track of what you can trust and how well / under what conditions

nikomatsakis (Sep 04 2019 at 13:40, on Zulip):

What do you mean by "state" here

eddyb (Sep 04 2019 at 13:40, on Zulip):

local variables (plus the implicit global state that e.g. raw pointers and function calls can do arbitrary things to)

eddyb (Sep 04 2019 at 13:41, on Zulip):

but most things on the MIR are arguably pure-ish because Rust lends itself to a style of code which C doesn't

eddyb (Sep 04 2019 at 13:41, on Zulip):

I guess kind of how many locals are in SSA form but that has to be detected every single time

eddyb (Sep 04 2019 at 13:42, on Zulip):

anyway it is a distraction given the work already done

eddyb (Sep 04 2019 at 13:42, on Zulip):

(hence me proposing we find time to discuss it without delving into it. guess I failed :P)

nikomatsakis (Sep 04 2019 at 13:43, on Zulip):

yeah, ok, that's what I thought you probably meant.

nikomatsakis (Sep 04 2019 at 13:43, on Zulip):

I could certainly imagine moving towards a more SSA-like setup

eddyb (Sep 04 2019 at 13:44, on Zulip):

the logical conclusion of the data dependencies of SSA is encoding state dependencies as linear data dependencies

eddyb (Sep 04 2019 at 13:44, on Zulip):

then things are more like functional programs, but without higher-order CPS headaches

eddyb (Sep 04 2019 at 13:44, on Zulip):

linear state > CPS, effectively

nikomatsakis (Sep 04 2019 at 13:45, on Zulip):

still, it feels like we should try to get some working, shipping optimizations -- doing the refactorings we feel are needed for that, but not a lot more. I guess it just seems like we can leave the "heavy lifting" to LLVM for now, but even just doing relatively few, simple optimizations would get us the majority of benefit. And those are not that hard to express on a CFG

eddyb (Sep 04 2019 at 13:45, on Zulip):

(but we don't have to do any one thing in particular - I'd rather we find the changes we need keeping the advanced models in mind)

nikomatsakis (Sep 04 2019 at 13:45, on Zulip):

And then we'd have a better idea what our needs are

eddyb (Sep 04 2019 at 13:46, on Zulip):

one thing that is super easy, for example, is (R)VSDG intra-BB

eddyb (Sep 04 2019 at 13:47, on Zulip):

mostly means your statement list is replaced with a chain of stateful statements but anything "shallowly pure" forms more of a DAG around that

eddyb (Sep 04 2019 at 13:47, on Zulip):

which could be a dataflow performance boost, for example

nikomatsakis (Sep 04 2019 at 13:48, on Zulip):

jumping back to implementing and enabling optimizations, to what extent do you think that is blocked on refactoring MIR (and which refactors)

nikomatsakis (Sep 04 2019 at 13:48, on Zulip):

ignoring debuginfo

eddyb (Sep 04 2019 at 13:48, on Zulip):

SROA is easy, NRVO is annoying to analyze but also easy once you have the interference predicate

eddyb (Sep 04 2019 at 13:48, on Zulip):

both of them are ready (minus debuginfo), NRVO may require a heuristic threshold (it's quadratic so a lot of locals slows it down)

nikomatsakis (Sep 04 2019 at 13:49, on Zulip):

I'm thinking of things like:

eddyb (Sep 04 2019 at 13:49, on Zulip):

(the NRVO PR actually has some log of me doing speedups of many orders of magnitude or something just because I was optimizing the term under the quadratic :P)

nikomatsakis (Sep 04 2019 at 13:50, on Zulip):

SROA is easy, NRVO is annoying to analyze but also easy once you have the interference predicate

ok -- so they're not blocked on those kinds of changes. To what extent would they benefit from them?

nikomatsakis (Sep 04 2019 at 13:50, on Zulip):

I'm thinking of things like:

also, are there other things you would add to this list

eddyb (Sep 04 2019 at 13:50, on Zulip):

a flatter/simplest Place makes it easier to run dataflow, do replacements, etc.

eddyb (Sep 04 2019 at 13:51, on Zulip):

so it would be 1. slightly cleaner implementation 2. may run a bit faster

eddyb (Sep 04 2019 at 13:52, on Zulip):

also, are there other things you would add to this list

well, I don't know the full implications of it, wrt @RalfJ's models, but some sort of "alias" might be interesting

eddyb (Sep 04 2019 at 13:53, on Zulip):

like a raw pointer except not specifically raw or reference or Box

eddyb (Sep 04 2019 at 13:54, on Zulip):

and it would need to be borrowed to actually be turned into something you can access "dynamically"

eddyb (Sep 04 2019 at 13:54, on Zulip):

could just as well significantly complicate analysis though so I'm not sure

nikomatsakis (Sep 04 2019 at 13:54, on Zulip):

well, I don't know the full implications of it, wrt RalfJ's models, but some sort of "alias" might be interesting

can you elaborate a bit on that? is this intended as an intermediate value when desugaring complex places?

nikomatsakis (Sep 04 2019 at 13:55, on Zulip):

e.g., I've thought about -- as (maybe?) an intermediate step -- convering the MIR after borrowck so that complex places create intermediate values.

nikomatsakis (Sep 04 2019 at 13:55, on Zulip):

anyway, we're running up against my quota for this conversation. this was helpful.

eddyb (Sep 04 2019 at 13:56, on Zulip):

maybe, yeah

eddyb (Sep 04 2019 at 13:58, on Zulip):

a really cool thing is eqsat, but it requires something more advanced than CFG+SSA, (R)VSDG being one of the possible choices (you need to be able to share a lot of IR subgraphs)

it lets you run many optimizations on a "pick all the things that you can do, then do them" basis, without forcing an ordering between them, and assuming your rules are sound, it keeps making derivations (and you can have thresholds so you're not generating an infinity of equally useless equivalent programs)

RalfJ (Sep 15 2019 at 20:50, on Zulip):

@eddyb what do you mean?

Last update: Nov 17 2019 at 07:20UTC