Stream: t-compiler/wg-polonius

Topic: fixing the `Location::All` OOMs during fact generation


lqd (Aug 04 2020 at 00:04, on Zulip):

so I've pushed part of my prototype for this simple rustc change (points to an outdated rev of the polonius branch, it's just to show the difference in fact generation), polonius branch

lqd (Aug 04 2020 at 00:05, on Zulip):

instead of materializing each such constraint at all points of the CFG here, we split those into a dedicated relation, outlives_everywhere (naming TBD) and duplicate the rules, while taking care of hopefully not materializing the same data in polonius itself

lqd (Aug 04 2020 at 00:06, on Zulip):

(the rustc branch deletes the same 2 tests which used to OOM during fact gen as they currently OOM in polonius but that's not a big problem, it's mostly to show the approach)

lqd (Aug 04 2020 at 00:10, on Zulip):

here's an example of the rules duplication in the location-insensitive variant
while there are 2 cases in the naive variant: for the subsets & requires relations

lqd (Aug 04 2020 at 00:11, on Zulip):

I'll push the more complicated changes, for the Opt variant, tomorrow

lqd (Aug 04 2020 at 00:15, on Zulip):

rustc tests are still running, and since it's also been a long time since I've updated the test expectations I'll need to verify the baseline on nightly, but that reminds me of the earlier question about the duration of the tests using the Naive variant (so, the worst case). This run had a lot more "this test has been running for more than 60 seconds" than usual, there were also more really high memory usage (and possibly one or two OOMs) but overall it looked like the ui tests were twice as slow

lqd (Aug 04 2020 at 00:17, on Zulip):

(let's be clear, a lot of those tests would be fast-tracked by the location-insensitive analysis in a realistic situation)

lqd (Aug 04 2020 at 01:18, on Zulip):

tests are done, the failures are the same in both cases, with and without the OOM fix

nikomatsakis (Aug 04 2020 at 14:17, on Zulip):

"twice as slow" for UI tests doesn't seem that bad to me

nikomatsakis (Aug 04 2020 at 14:17, on Zulip):

particularly if the location insensitive can do better

nikomatsakis (Aug 04 2020 at 14:17, on Zulip):

still, given that I think we are not going to do the equality variant for now

nikomatsakis (Aug 04 2020 at 14:18, on Zulip):

I'm feeling a bit less "ardor" for removing the optimized variant :P

nikomatsakis (Aug 04 2020 at 14:19, on Zulip):

I'm looking at this now

lqd (Aug 04 2020 at 14:20, on Zulip):

ah :)

lqd (Aug 04 2020 at 14:21, on Zulip):

there's a couple cases in there which are very slow but it's probably move errors in another test, I'm not worried even if it skews the duration

lqd (Aug 04 2020 at 14:21, on Zulip):

it'll get better once we fix move errors, and of course once we finish adding subset errors to the opt variant we should be back in business

lqd (Aug 04 2020 at 14:22, on Zulip):

I'd like to get that done today, just so we can have numbers knowing that move errors haven't changed compared to that 2x run

nikomatsakis (Aug 04 2020 at 14:22, on Zulip):

the duplication for the naive rules is not so bad

nikomatsakis (Aug 04 2020 at 14:22, on Zulip):

i'm not sure what it will take for the optimized rules

lqd (Aug 04 2020 at 14:23, on Zulip):

it's worse there

lqd (Aug 04 2020 at 14:23, on Zulip):

and it multiplies when you combine subset errors + subset_everywhere :3

nikomatsakis (Aug 04 2020 at 14:23, on Zulip):

I feel like one of the my "jobs" that I've never quite completed is to write out the optimized rules

nikomatsakis (Aug 04 2020 at 14:23, on Zulip):

yeah

nikomatsakis (Aug 04 2020 at 14:23, on Zulip):

I wonder if there would be a way to extend datafrog instead

lqd (Aug 04 2020 at 14:24, on Zulip):

it's not terribly worse, but that variant is already complex so yeah

nikomatsakis (Aug 04 2020 at 14:24, on Zulip):

i.e., to support sets of tuples that are not directly stored in vectors

lqd (Aug 04 2020 at 14:24, on Zulip):

yeah we probably could/should, or generate more of them

lqd (Aug 04 2020 at 14:25, on Zulip):

(we'd still probably need to augment some joins to ask for a leapjoin to be generated, and the likes, but the rules themselves are generally more palatable)

nikomatsakis (Aug 04 2020 at 14:26, on Zulip):

I'm trying to thikn exactly what I mean

lqd (Aug 04 2020 at 14:26, on Zulip):

but the Opt one is already complex as it is (and I'm not sure I'm aware of all its subtlety :scream:)

nikomatsakis (Aug 04 2020 at 14:26, on Zulip):

if this were prolog, not datalog, I suspect we'd have a rule like

nikomatsakis (Aug 04 2020 at 14:26, on Zulip):
subset_base(O1, O2, N) :- subset_base_at(O1, O2, N); subset_base_everywhere(O1, O2)
nikomatsakis (Aug 04 2020 at 14:27, on Zulip):

(but that rule doesn't fit datalog because N is unbound in the second half)

lqd (Aug 04 2020 at 14:28, on Zulip):

if this were prolog, higher-ranked sub origins would love it

nikomatsakis (Aug 04 2020 at 14:28, on Zulip):

I do wonder, in terms of performance

nikomatsakis (Aug 04 2020 at 14:28, on Zulip):

i.e., if we were working backwards --

nikomatsakis (Aug 04 2020 at 14:29, on Zulip):

starting with each invalidated loan that also appears in the location insensitive errors --

nikomatsakis (Aug 04 2020 at 14:29, on Zulip):

who knows...

lqd (Aug 04 2020 at 14:29, on Zulip):

that's also why I was interested in demand transformation as it kinda simulates top-down evaluation from bottom-up rules

nikomatsakis (Aug 04 2020 at 14:29, on Zulip):

ultimately propagated forwards is generally better if you can do it and you can narrow the targets enough (as we've talked about)

nikomatsakis (Aug 04 2020 at 14:29, on Zulip):

right

nikomatsakis (Aug 04 2020 at 14:30, on Zulip):

I could take a stab at augmenting the optimized variant

nikomatsakis (Aug 04 2020 at 14:30, on Zulip):

I'm kind of at a loss for what to do besides distract y'all right now :)

lqd (Aug 04 2020 at 14:30, on Zulip):

filtering based on this idea is also a big win (for cases which are already inefficient at least)

lqd (Aug 04 2020 at 14:31, on Zulip):

augmenting to handle the subset_everywhere relation you mean ?

nikomatsakis (Aug 04 2020 at 14:31, on Zulip):

right

nikomatsakis (Aug 04 2020 at 14:31, on Zulip):

and/or trying to write-up how it works

nikomatsakis (Aug 04 2020 at 14:31, on Zulip):

...I mean the key idea is to delay computing TC until we remove something...

nikomatsakis (Aug 04 2020 at 14:31, on Zulip):

...which brings back to mind another thought I've had

lqd (Aug 04 2020 at 14:31, on Zulip):

so do you find splitting the relation in two an acceptable way forward ?

nikomatsakis (Aug 04 2020 at 14:31, on Zulip):

it'd be nice if datafrog were a bit more "independent" from vectors and slices

nikomatsakis (Aug 04 2020 at 14:31, on Zulip):

I could imagine having a specialized bit of code tailored to transitive relations, in particular

nikomatsakis (Aug 04 2020 at 14:32, on Zulip):

lqd said:

so do you find splitting the relation in two an acceptable way forward ?

I think it makes sense, although it definitely "messes with" the canonical rules

lqd (Aug 04 2020 at 14:32, on Zulip):

right right

nikomatsakis (Aug 04 2020 at 14:32, on Zulip):

i.e., even the naive rules start to get a bit messier than one might like

lqd (Aug 04 2020 at 14:32, on Zulip):

if we can find a cleaner alternative I'm all for it

nikomatsakis (Aug 04 2020 at 14:32, on Zulip):

but I still think it makes sense

nikomatsakis (Aug 04 2020 at 14:33, on Zulip):

right

lqd (Aug 04 2020 at 14:33, on Zulip):

it's not like we have to fix the 2 OOMs today heh

nikomatsakis (Aug 04 2020 at 14:33, on Zulip):

maybe we'll come up with another idea but in the meantime reproducing ALL those tuples for every CFG seems silly

nikomatsakis (Aug 04 2020 at 14:33, on Zulip):

one thing I do remember is that

nikomatsakis (Aug 04 2020 at 14:33, on Zulip):

at somep oint I was refactoring NLL to not have "locations::All"

lqd (Aug 04 2020 at 14:33, on Zulip):

it was historically a hack but the spirit is correct

nikomatsakis (Aug 04 2020 at 14:33, on Zulip):

and instead to enforce them at the start point

lqd (Aug 04 2020 at 14:33, on Zulip):

yeah I remember

nikomatsakis (Aug 04 2020 at 14:33, on Zulip):

or something like that

nikomatsakis (Aug 04 2020 at 14:33, on Zulip):

this would be I guess a good alternative

lqd (Aug 04 2020 at 14:33, on Zulip):

so to sum up:

nikomatsakis (Aug 04 2020 at 14:33, on Zulip):

I don't remember enough about that work but just not having the problem seems good too :)

lqd (Aug 04 2020 at 14:35, on Zulip):
nikomatsakis (Aug 04 2020 at 14:36, on Zulip):

Now I'm thinking it'd be better to look at the NLL code and see if I think there's a way to just remove the need for Locations::All to begin with...

lqd (Aug 04 2020 at 14:36, on Zulip):
nikomatsakis (Aug 04 2020 at 14:36, on Zulip):

(I don't have any other idea for an alternative beyond removing the "concept")

lqd (Aug 04 2020 at 14:37, on Zulip):
lqd (Aug 04 2020 at 14:37, on Zulip):
lqd (Aug 04 2020 at 14:38, on Zulip):

there was an issue for the Location::All topic btw https://github.com/rust-lang/polonius/issues/24 with a link to the rustc PR and a bit more

nikomatsakis (Aug 04 2020 at 14:41, on Zulip):

ah nice

nikomatsakis (Aug 04 2020 at 14:44, on Zulip):

(Yes, those notes help a lot)

lqd (Aug 04 2020 at 14:45, on Zulip):
lqd (Aug 04 2020 at 14:50, on Zulip):
nikomatsakis (Aug 04 2020 at 15:02, on Zulip):

Yeah, for sure.

nikomatsakis (Aug 04 2020 at 15:07, on Zulip):

so I looked a bit into this

nikomatsakis (Aug 04 2020 at 15:08, on Zulip):

I feel like all of the Location::All cases come from user type annotations

nikomatsakis (Aug 04 2020 at 15:09, on Zulip):

it seems like from my write-up that the problem comes from the liveness rules

nikomatsakis (Aug 04 2020 at 15:10, on Zulip):

which makes me wonder if the idea should be to enforce those user-type annotations at each assignment to the variable in question

nikomatsakis (Aug 04 2020 at 15:10, on Zulip):

the "return place" can be considered a special case of this

nikomatsakis (Aug 04 2020 at 15:10, on Zulip):

i.e., we desugar return foo to _0 = foo, and so it's an assignment to the local variable _0, and that might mean that this is a point to enforce the user-type annotation on _0 (i.e., the declared return type)

nikomatsakis (Aug 04 2020 at 15:11, on Zulip):

this also handles the case I cited here:

nikomatsakis (Aug 04 2020 at 15:11, on Zulip):
fn foo(mut x: &'static u32) {
  let y = 22;
  x = &y; // OK
  // ... here, `x` is not considered `'static` ..
}
nikomatsakis (Aug 04 2020 at 15:11, on Zulip):

i.e., x = &y would trigger us to enforce the user-type annotation again

nikomatsakis (Aug 04 2020 at 15:11, on Zulip):

and hence x would be considered 'static

nikomatsakis (Aug 04 2020 at 15:14, on Zulip):

I guess one example that doesn't fit this category, or maybe only "partly fits" it, is the "applied member constraints"

lqd (Aug 04 2020 at 15:15, on Zulip):

right there was also this topic

nikomatsakis (Aug 04 2020 at 15:19, on Zulip):

I guess we also create some constraints early on as part of the parameters

nikomatsakis (Aug 04 2020 at 15:19, on Zulip):

anyway, I think I'm semi-convinced that we could remove Locations::All in favor of a better handling of user-given type annotations, but it seems 'not entirely trivial' to do

nikomatsakis (Aug 04 2020 at 15:19, on Zulip):

it's also true that this will result in more tuples

lqd (Aug 04 2020 at 15:20, on Zulip):

would the only benefit be that we don't have to handle these cases specially ?

nikomatsakis (Aug 04 2020 at 15:20, on Zulip):

i.e., we'll be converting Location::All into regular tuples, often inserted close to the start of the graph, so they will propagate forward

nikomatsakis (Aug 04 2020 at 15:20, on Zulip):

I think that would be the only benefit

nikomatsakis (Aug 04 2020 at 15:21, on Zulip):

basically cleaner and simpler

nikomatsakis (Aug 04 2020 at 15:21, on Zulip):

but only on the polonius side, perhaps :)

lqd (Aug 04 2020 at 15:22, on Zulip):

maybe hard to justify then

nikomatsakis (Aug 04 2020 at 15:22, on Zulip):

probably worth keeping notes on it

nikomatsakis (Aug 04 2020 at 15:23, on Zulip):

as a possible future change

lqd (Aug 04 2020 at 15:23, on Zulip):

if it's non trivial and only helps this specific case, which is annoying sure but maybe less so than doing the NLL rework

nikomatsakis (Aug 04 2020 at 15:23, on Zulip):

yeah I'm trying to think if there would be a 'simple way' to do it

nikomatsakis (Aug 04 2020 at 15:25, on Zulip):

I could imagine for example

nikomatsakis (Aug 04 2020 at 15:25, on Zulip):

changing from Location::All

nikomatsakis (Aug 04 2020 at 15:26, on Zulip):

to something a bit more specific, e.g., Location::UserVariable(X)

nikomatsakis (Aug 04 2020 at 15:26, on Zulip):

and then, when we are generating nll-facts, we can find the right statements that correspond to writes of that variable

nikomatsakis (Aug 04 2020 at 15:26, on Zulip):

things like that

nikomatsakis (Aug 04 2020 at 15:26, on Zulip):

this might actually not be that hard to do

lqd (Aug 04 2020 at 15:26, on Zulip):

interesting yeah

nikomatsakis (Aug 04 2020 at 15:26, on Zulip):

and it wouldn't interfere with normal NLL

nikomatsakis (Aug 04 2020 at 15:27, on Zulip):

I guess I have to go and see if I could easily find a suitable Location value for each location

lqd (Aug 04 2020 at 15:29, on Zulip):

I thought we had some of this information more or less easily available but maybe it was for optimization purposes in later passes

lqd (Aug 04 2020 at 15:31, on Zulip):

at the same time I think I remember we did such similar mir walks during borrowck :thinking:

nikomatsakis (Aug 04 2020 at 15:36, on Zulip):

yes, so, looking into it, the main thing I don't know how to fit into this idea is the member-constraints

nikomatsakis (Aug 04 2020 at 15:36, on Zulip):

I suspect that actually they fit just fine but

nikomatsakis (Aug 04 2020 at 15:37, on Zulip):

the code which creates them doesn't have easy access to the right info

nikomatsakis (Aug 04 2020 at 15:37, on Zulip):

although maybe it can be passed down

nikomatsakis (Aug 04 2020 at 15:40, on Zulip):

oh

nikomatsakis (Aug 04 2020 at 15:40, on Zulip):

the code I was staring at is actually part of error reporting I think

lqd (Aug 04 2020 at 15:42, on Zulip):

and we might not even need to update it ?

nikomatsakis (Aug 04 2020 at 16:25, on Zulip):

sorry, got distracted eating lunch and other things

nikomatsakis (Aug 04 2020 at 16:26, on Zulip):

but yeah I am thinking I might attempt a change to introduce a parameter that ties Location::All to a local and see if I can actually find a viable choice for each case

lqd (Aug 05 2020 at 09:29, on Zulip):

so for this I won't open PRs of the subset-splitting branches above until we're sure Location::All wouldn't be better handled by the changes described right above this message

nikomatsakis (Aug 05 2020 at 09:30, on Zulip):

I'm actually about to start experimenting here

lqd (Aug 05 2020 at 09:31, on Zulip):

awesome :)

nikomatsakis (Aug 05 2020 at 09:55, on Zulip):

ok, well, I'm not sure if this will really work

nikomatsakis (Aug 05 2020 at 09:55, on Zulip):

some challenges I encountered:

nikomatsakis (Aug 05 2020 at 09:56, on Zulip):

well, idk, maybe it's ok

nikomatsakis (Aug 05 2020 at 09:56, on Zulip):

but basically there are some user-type annotations in places I had forgotten about

nikomatsakis (Aug 05 2020 at 09:56, on Zulip):

although I guess all of those have pretty clear locations, now that I think about it

nikomatsakis (Aug 05 2020 at 09:56, on Zulip):

e.g., constants, expressions like x: ty

nikomatsakis (Aug 05 2020 at 09:57, on Zulip):

indeed, in some of those cases where we are using Locations::All the behavior might even be confusing

lqd (Aug 05 2020 at 09:59, on Zulip):

in that the constraints can apply even before the annotations appear in the code ?

nikomatsakis (Aug 05 2020 at 09:59, on Zulip):

right

nikomatsakis (Aug 05 2020 at 09:59, on Zulip):

I have to look at the MIR we get I guess but

nikomatsakis (Aug 05 2020 at 10:00, on Zulip):

I am imagining something like

nikomatsakis (Aug 05 2020 at 10:00, on Zulip):
let mut x = 22;
let mut p = &x;

if false {
    p = &44;
    let _ = p: &'static i32;
}

drop(p);
nikomatsakis (Aug 05 2020 at 10:01, on Zulip):

ok it's a really weird example

nikomatsakis (Aug 05 2020 at 10:01, on Zulip):

but then all of these are :)

nikomatsakis (Aug 05 2020 at 10:01, on Zulip):

I do indeed get a compilation error from that

nikomatsakis (Aug 05 2020 at 10:02, on Zulip):

I guess that's not surprising

nikomatsakis (Aug 05 2020 at 10:02, on Zulip):

NLL isn't this flow sensitive

nikomatsakis (Aug 05 2020 at 10:02, on Zulip):

playground

nikomatsakis (Aug 05 2020 at 10:03, on Zulip):

let me look at what MIR we generate etc though

lqd (Aug 05 2020 at 10:04, on Zulip):

(gotta go for a bit I'll be back after lunch)

nikomatsakis (Aug 05 2020 at 10:14, on Zulip):

Yeah, I have to stop and go for my morning run :)

nikomatsakis (Aug 05 2020 at 10:14, on Zulip):

I'm still pursuing this a bit

nikomatsakis (Aug 05 2020 at 10:14, on Zulip):

I haven't quite given up yet ;)

nikomatsakis (Aug 05 2020 at 10:15, on Zulip):

I feel like there may be a PR here even if we kept Locations::All

lqd (Aug 05 2020 at 10:22, on Zulip):

nice :)

nikomatsakis (Aug 05 2020 at 13:38, on Zulip):

ok, I looked a bit more into it. There are a few annoying cases I've not yet fully understood.

nikomatsakis (Aug 05 2020 at 13:38, on Zulip):

One of the easier ones if the yield type

nikomatsakis (Aug 05 2020 at 13:39, on Zulip):

The way we handle user-type annotations is a bit complicated, especially stuff like

let (a: T1, b: T2): T3 = ...
nikomatsakis (Aug 05 2020 at 13:39, on Zulip):

but really I think all of the locations have a kind of location that can be ascribed

Last update: Jun 20 2021 at 00:45UTC