Stream: t-compiler/wg-polonius

Topic: liveness polonius#104


nikomatsakis (Apr 19 2019 at 20:25, on Zulip):

@Albin Stjerna I see in my to do list that I was supposed to provide some more notes about the next steps for polonius#104 -- I'm not 100% sure where you are

nikomatsakis (Apr 19 2019 at 20:25, on Zulip):

If you happen to be around, I have a few questions for you, otherwise I'll take a best guess

nikomatsakis (Apr 19 2019 at 20:54, on Zulip):

@Albin Stjerna ok you can see the notes here -- please leave any questions and I'll try to answer, or we can discuss on Tuesday

Albin Stjerna (Apr 20 2019 at 06:16, on Zulip):

I was just going to bed when this happened but I will have a look later! Thanks! And at a first glance this was roughly what I needed to get un-stuck, thank you!

Albin Stjerna (Apr 23 2019 at 12:23, on Zulip):

For what it's worth, these are my notes from when I tried to figure this out myself. I wasn't that far off!:

## How Calculation Works
1. all_factsis initialised in nll/mod.rs, in compute_regions()
2. the variable constraintsof type MirTypeckRegionConstraintsis extracted from type_check::type_check()’s return value
3. its field liveness_constraintscontains LivenessValues, and seems to be filled with live locals in liveness::generate(), which delegates the computation to trace::trace(), as filtered through regions_that_outlive_free_regions() and then compute_live_locals() (used to filter the set of variables to actually bother tracing)
## Notes
- The extraction from MIR happens in the LocalUseMap, which we can probably just steal and use.

## Questions
- should we still compute live locals, or should that pre-filtering too happen in Polonius? It looks very linear to me so maybe that’s fine?
- what happens with variable (“point”) IDs in translation? The important thing I guess is that it is the same as the ones used in the CFG?
- If we need to run both NLL and Polonius separately, how do we architect the liveness computation-ectomy, given that NLL needs it?
- What should we actually use for the Variable ID in rustc? How do we deal with the fact that variables seems to be unique to regions? Do we need to generate new IDs? Can we just use Locals?

Albin Stjerna (Apr 24 2019 at 15:23, on Zulip):

Ok, so I have a question now: I have something that seems to work and produces used/defined facts and serialises them to files in rustc, but I have no idea if they are correct as I don't really understand the output. Is there any obvious way to verify it?

Also, I am using the start_index of the Locations for the variables, is that reasonable? I'm not sure about the difference.

Albin Stjerna (Apr 24 2019 at 15:24, on Zulip):

(Ping @lqd and @nikomatsakis )

Albin Stjerna (Apr 24 2019 at 15:24, on Zulip):

(none of this is pushed anywhere)

lqd (Apr 24 2019 at 16:26, on Zulip):

I was thinking you might be able to compare your rustc output to your test inputs ?

lqd (Apr 24 2019 at 16:26, on Zulip):

(but maybe they’re only in the program facts form so this wouldn’t help)

lqd (Apr 24 2019 at 16:30, on Zulip):

I was also thinking this data could be related to the other facts, eg region_live_at, and comparing that relationship à la metamorphic testing

lqd (Apr 24 2019 at 16:32, on Zulip):

other than eyeballing and trying to compare to the existing NLL liveness (but I’m not sure they’re identical yet) :(

lqd (Apr 24 2019 at 16:33, on Zulip):

maybe @Matthew Jasper has ideas in addition to niko

nikomatsakis (Apr 24 2019 at 18:42, on Zulip):

Is there any obvious way to verify it?

Well, sort of maybe -- polonius had some kind of graphviz mode that we used for visualization, which might let you at least inspect it more naturally. But I don't remember the details and I kind of doubt it will help. Of course, when we get things far enough along, we'll be able to compare the behavior with the existing solver.

Also, I am using the start_index of the Locations for the variables, is that reasonable? I'm not sure about the difference.

So, the start point is where we enter the statement, and the mid point is kind of where the statement "takes effect". (To be honest, it would be good to revisit why we have these two points, and some other specifics of polonius -- I had hoped we'd be doing more regular video chats for that purpose.)

nikomatsakis (Apr 24 2019 at 18:43, on Zulip):

Anyway, to answer your question -- I think that we should be using the mid point

Albin Stjerna (Apr 25 2019 at 13:37, on Zulip):

Anyway, to answer your question -- I think that we should be using the mid point

I had a vague memory of this which is why I thought to ask; I'll change it to use the mid-points

Albin Stjerna (Apr 25 2019 at 14:44, on Zulip):

Ok, eyeballing the output and comparing it to the Mir for one of the test inputs, I think it's correct. Should I add the updated inputs with the var_used and var_defined facts to the liveness PR (polonius#105)?

Albin Stjerna (Apr 25 2019 at 14:45, on Zulip):

I have also opened rust#60266 and marked it as WIP

nikomatsakis (Apr 25 2019 at 14:56, on Zulip):

rust#60266

nikomatsakis (Apr 25 2019 at 14:56, on Zulip):

(I added a new linkifier for that format)

nikomatsakis (Apr 26 2019 at 21:53, on Zulip):

So @Albin Stjerna this looks basically right -- I guess we need to start laying out next steps, eh?

Albin Stjerna (Apr 27 2019 at 07:09, on Zulip):

@nikomatsakis yep!

nikomatsakis (Apr 30 2019 at 19:12, on Zulip):

OK, I added some notes onto polonius#104

nikomatsakis (Apr 30 2019 at 19:12, on Zulip):

they're kind of high-level, @Albin Stjerna, so feel free to fire questions

nikomatsakis (Apr 30 2019 at 19:12, on Zulip):

I think the order I would do things is to:

nikomatsakis (Apr 30 2019 at 19:13, on Zulip):

this way we should be able to test as we go

nikomatsakis (Apr 30 2019 at 19:17, on Zulip):

Also, maybe we should try to schedule some time to discuss and plan out again?

Albin Stjerna (Apr 30 2019 at 20:01, on Zulip):

Also, maybe we should try to schedule some time to discuss and plan out again?

Sounds good to me! I'll read through your instructions tomorrow and see where that gets me

Albin Stjerna (May 01 2019 at 06:49, on Zulip):

This is a great time to write the section of my thesis that explains what regions are, because I just realised I'm not 100% clear on that

nikomatsakis (May 01 2019 at 09:22, on Zulip):

Indeed, a good question. Polonius's take is that a "region" is itself sort of meaningless. It is a unique identifier for a particular reference, essentially. However, when you combine this region with subset relations and other things, we determine the relationship of regions to other regions -- basically, the aliasing relationships between references.

Albin Stjerna (May 03 2019 at 06:27, on Zulip):

@nikomatsakis I have added the logic to do drop-live calculations to polonius and rustc (see polonius#105 for the logic and rust#60266 for the fact generation). Was this what you had in mind?

nikomatsakis (May 03 2019 at 14:45, on Zulip):

@Albin Stjerna looking

nikomatsakis (May 06 2019 at 16:22, on Zulip):

@Albin Stjerna I answered you on GH -- sorry, it's hard for me to monitor GH notifications.

nikomatsakis (May 06 2019 at 16:22, on Zulip):

Hopefully that makes sense to you.

Albin Stjerna (May 06 2019 at 16:23, on Zulip):

I think so, but was I correct in my assumption that we do need to track drops for points and then do the same point-to-region translation for drops as we do for uses?

Albin Stjerna (May 06 2019 at 16:23, on Zulip):

Or have I gotten these mixed up completely?

Albin Stjerna (May 06 2019 at 16:24, on Zulip):

Albin Stjerna I answered you on GH -- sorry, it's hard for me to monitor GH notifications.

I understand; should I ask questions here or ping you in my questions in GH? What's the best way for you?

nikomatsakis (May 06 2019 at 16:24, on Zulip):

I have added the logic to do drop-live calculations to polonius and rustc (see polonius#105 for the logic and rust#60266 for the fact generation). Was this what you had in mind?

also, these commits look good

nikomatsakis (May 06 2019 at 16:24, on Zulip):

I understand; should I ask questions here or ping you in my questions in GH? What's the best way for you?

I'd prefer Zulip

Albin Stjerna (May 06 2019 at 16:24, on Zulip):

I understand; should I ask questions here or ping you in my questions in GH? What's the best way for you?

I'd prefer Zulip

Noted!

nikomatsakis (May 06 2019 at 16:25, on Zulip):

I think so, but was I correct in my assumption that we do need to track drops for points and then do the same point-to-region translation for drops as we do for uses?

Yes -- so polonius#105 is figuring out where things are "drop-live"

Albin Stjerna (May 06 2019 at 16:25, on Zulip):

Also, I'll try to be better at pinging you :)

nikomatsakis (May 06 2019 at 16:25, on Zulip):

and your previous commits figured out where things are "use-live"

nikomatsakis (May 06 2019 at 16:25, on Zulip):

er, to be clear, where variables are drop/use-live

nikomatsakis (May 06 2019 at 16:25, on Zulip):

the next step is to translate that into the points where regions are live

nikomatsakis (May 06 2019 at 16:26, on Zulip):

the rules are roughly like this:

region_live_at(R, P) :-
    var_live_at(V, P),
    var_uses_region(V, R).

region_live_at(R, P) :-
    var_drop_live_at(V, P),
    var_drops_region(V, R).
Albin Stjerna (May 06 2019 at 16:27, on Zulip):

And the var_uses_region is a fact, but where does it come from? Fixpoint iteration on subtype relationships from assignments?

nikomatsakis (May 06 2019 at 16:27, on Zulip):

no, it has nothing to do with that

nikomatsakis (May 06 2019 at 16:27, on Zulip):

it comes by enumerating the regions in the variable's type

Albin Stjerna (May 06 2019 at 16:27, on Zulip):

Aaah, ok

nikomatsakis (May 06 2019 at 16:27, on Zulip):

i.e., if the type of V is Foo<'1>, then we have var_uses_region(V, '1)

nikomatsakis (May 06 2019 at 16:28, on Zulip):

the "drops" relation is a bit more complex

nikomatsakis (May 06 2019 at 16:28, on Zulip):

in that it is a subset of the regions in the variable's type

nikomatsakis (May 06 2019 at 16:28, on Zulip):

though we could start by generating the same sets (it'd be an overapproximation)

nikomatsakis (May 06 2019 at 16:28, on Zulip):

that might be worthwhile because I think that the rustc analysis is more precise in some other ways, when it comes to drops

nikomatsakis (May 06 2019 at 16:28, on Zulip):

so we can focus on getting the polonius rules right, and have rustc generate too many tuples

nikomatsakis (May 06 2019 at 16:29, on Zulip):

then come back to this

Albin Stjerna (May 06 2019 at 16:33, on Zulip):

Ah, and it would be an over-approximation because not all those uses will actually end in a drop?

nikomatsakis (May 06 2019 at 16:37, on Zulip):

@Albin Stjerna what I meant by 'overapprox' is that the var_drops_region would have more tuples that we need

nikomatsakis (May 06 2019 at 16:37, on Zulip):

a simple example would be x: &'a u32, which is a no-op when dropped (and NLL knows that)

Albin Stjerna (May 06 2019 at 16:37, on Zulip):

Ah, ok you mean like that

Albin Stjerna (May 06 2019 at 16:37, on Zulip):

Ok, that makes sense

nikomatsakis (May 06 2019 at 16:37, on Zulip):

in fact, we'll actually handle that correctly anyway, for other reasons (MIR would have no drop for this type)

nikomatsakis (May 06 2019 at 16:41, on Zulip):

@Albin Stjerna I was trying to get you a few tips of where to edit

Albin Stjerna (May 06 2019 at 16:41, on Zulip):

Yay

nikomatsakis (May 06 2019 at 16:42, on Zulip):

If I'm not mistaken, the next step is to generate the var_uses_region and var_drops_region facts in rustc?

nikomatsakis (May 06 2019 at 16:53, on Zulip):

@Albin Stjerna Let me know if these instructions make sense to you

Albin Stjerna (May 06 2019 at 16:57, on Zulip):

Let me read through them

Albin Stjerna (May 06 2019 at 16:57, on Zulip):

But yes, I think that's the next step as well

Albin Stjerna (May 06 2019 at 17:05, on Zulip):

So I think what you want to do in your branch is to extend the "popular var liveness function" so that it iterates over the variables declared in mir (the mir.local_decls field).

First I thought this would be a bad idea because it would mean that the logic would get intermingled with the old logic for calculating liveness/regions, but I don't think that's a problem given that we are planning to remove those when the Polonius logic works, right?

Albin Stjerna (May 06 2019 at 17:06, on Zulip):

So basically, we'd just drop the region_live_at logic

Albin Stjerna (May 06 2019 at 17:07, on Zulip):

Which is interesting, because I guess we could have the compiler save the old values for region_live_at and compare it to the one gotten from Polonius to ensure it's consistent

nikomatsakis (May 06 2019 at 17:13, on Zulip):

Yeah, we could do that, and yes, that logic will eventually go away

nikomatsakis (May 06 2019 at 17:14, on Zulip):

Keeping the ability to compare is probably a good idea

Albin Stjerna (May 06 2019 at 17:32, on Zulip):

Great! Then I have an idea of where to start tomorrow

Albin Stjerna (May 06 2019 at 17:53, on Zulip):

I'll try to make note of candidates for refactoring for when the precalculation-ectomy happens as well, maybe I can make the code less hard to follow and/or add some documentation

nikomatsakis (May 07 2019 at 16:16, on Zulip):

@Albin Stjerna how goes?

Albin Stjerna (May 07 2019 at 17:26, on Zulip):

@nikomatsakis I spent the morning writing the background section of my report/thesis that explains regions etc while reading through your instructions in the issue on GitHub and cross-referencing with your blog posts, and I think that was a good way to figure out what was going on, but I only got so far as to implement the facts for var_use region and var_drops_region to Polonius. So far things are going well, and I think I have a clear idea of what to do to generate the facts in Rust, but I never got to that before having a migraine attack, and I've been asleep since

Albin Stjerna (May 07 2019 at 17:31, on Zulip):

I did have some trouble figuring out how to add the facts to the polonius-parser lalrpop grammar though, because I'm not sure how to add facts that aren't connected to a point, but I just put that off until later

Albin Stjerna (May 08 2019 at 14:22, on Zulip):

@nikomatsakis var_drops_region and var_uses_region fact generation pushed. The code is more opaque than I would have liked it as I had to work around ownership of some of the essentially global variables passed around between functions, but it appears to be working™

nikomatsakis (May 08 2019 at 17:16, on Zulip):

@Albin Stjerna can you send me a link to the PR(s)?

lqd (May 08 2019 at 17:40, on Zulip):

we should probably put them both in the topic title: I think it's polonius#105 and #60266 for the rustc side

Albin Stjerna (May 08 2019 at 21:00, on Zulip):

we should probably put them both in the topic title: I think it's polonius#105 and #60266 for the rustc side

That sounds like a good idea, but I also think it's just the Datafrog code left before the whole liveness logic is done, more or less?

Albin Stjerna (May 08 2019 at 21:02, on Zulip):

Also, @nikomatsakis: @lqd was right, it's polonius#105 and rust#60266

Albin Stjerna (May 08 2019 at 21:03, on Zulip):

I'll get to that tomorrow!

Albin Stjerna (May 09 2019 at 09:53, on Zulip):

@nikomatsakis I just realised that it seems that region live uses seems to be generated for both the start and mid-points. Does that mean I should emit a var_live at both the start and mid-points? Does this matter?

Albin Stjerna (May 09 2019 at 09:56, on Zulip):

I...think this should just work out as long as the variable liveness calculations are correct?

nikomatsakis (May 09 2019 at 09:57, on Zulip):

I think that should just fall out naturally, @Albin Stjerna --

nikomatsakis (May 09 2019 at 09:57, on Zulip):

we propagate liveness backwards through the CFG

Albin Stjerna (May 09 2019 at 09:57, on Zulip):

Hmh

nikomatsakis (May 09 2019 at 09:57, on Zulip):

So if you have a "use" at the mid point, it will be "live" at the entry point

Albin Stjerna (May 09 2019 at 09:57, on Zulip):

Why is the region_live_at relation a 3-tuple?

Albin Stjerna (May 09 2019 at 09:58, on Zulip):

Err, I meant variable

nikomatsakis (May 09 2019 at 09:58, on Zulip):

What are the 3 things? I forget

nikomatsakis (May 09 2019 at 09:58, on Zulip):

it could be a mistake :)

Albin Stjerna (May 09 2019 at 09:58, on Zulip):

Region, Point, and...the empty tuple

nikomatsakis (May 09 2019 at 09:58, on Zulip):

oh

nikomatsakis (May 09 2019 at 09:58, on Zulip):

that's a "technicality"

nikomatsakis (May 09 2019 at 09:58, on Zulip):

sometimes datafrog wants things to have (key, value) form

Albin Stjerna (May 09 2019 at 09:58, on Zulip):

Aah, ok

nikomatsakis (May 09 2019 at 09:58, on Zulip):

but in this case the (region, point) is the key

Albin Stjerna (May 09 2019 at 10:00, on Zulip):

I ran into some trouble converting the region_live_at back to the relation copy, which I presume must be done at each iteration to keep up with changes

Albin Stjerna (May 09 2019 at 10:00, on Zulip):

But then it's just a matter of adapting it

nikomatsakis (May 09 2019 at 10:00, on Zulip):

uh

nikomatsakis (May 09 2019 at 10:01, on Zulip):

I'm confused :)

nikomatsakis (May 09 2019 at 10:01, on Zulip):

oh, well, we should be removing the region-live-at relation I thnk

nikomatsakis (May 09 2019 at 10:01, on Zulip):

it would just become a variable

Albin Stjerna (May 09 2019 at 10:01, on Zulip):

I thought it was technically necessary for some Datafrog shenanigans?

nikomatsakis (May 09 2019 at 10:01, on Zulip):

(maybe rename the existing one)

nikomatsakis (May 09 2019 at 10:01, on Zulip):

hmm, oh, because we're using leapjoin?

nikomatsakis (May 09 2019 at 10:02, on Zulip):

we probably need to stop using leapjoin then

Albin Stjerna (May 09 2019 at 10:02, on Zulip):

I guess that makes sense

nikomatsakis (May 09 2019 at 10:02, on Zulip):

relations are supposed to be "fixed sets that are not varying"

nikomatsakis (May 09 2019 at 10:02, on Zulip):

I think we should stick to that :)

nikomatsakis (May 09 2019 at 10:02, on Zulip):

otoh

nikomatsakis (May 09 2019 at 10:02, on Zulip):

wait :)

Albin Stjerna (May 09 2019 at 10:02, on Zulip):

Err yes now that you mention that, it sounds like a bad idea to work around the fundamentals of the system

nikomatsakis (May 09 2019 at 10:02, on Zulip):

we could do the liveness computation as a pre-step

nikomatsakis (May 09 2019 at 10:02, on Zulip):

and create the relation only at the end

nikomatsakis (May 09 2019 at 10:02, on Zulip):

i.e., we can have two iterations

Albin Stjerna (May 09 2019 at 10:03, on Zulip):

I considered that as well, but I'm not sure if it's a good idea

nikomatsakis (May 09 2019 at 10:03, on Zulip):

I think it would be

nikomatsakis (May 09 2019 at 10:03, on Zulip):

among other things, it's probably better locality

Albin Stjerna (May 09 2019 at 10:03, on Zulip):

Hmm yes probably because propagation is always additive

nikomatsakis (May 09 2019 at 10:03, on Zulip):

in general, it makes sense for us to "stratify" when we can I think

Albin Stjerna (May 09 2019 at 10:03, on Zulip):

So there's no such thing as having partially completed solutions eliminate parts of the search space for other variables

nikomatsakis (May 09 2019 at 10:04, on Zulip):

not sure if I understand

nikomatsakis (May 09 2019 at 10:04, on Zulip):

oh, I see, you're saying "beacuse it's purely additive, that doesn't happen"

nikomatsakis (May 09 2019 at 10:04, on Zulip):

yeah, all of the relations are monotonic -- ever growing

Albin Stjerna (May 09 2019 at 10:05, on Zulip):

I'm still too stuck in discrete optimisation where it's the opposite: variables start out enormous and loose values

Albin Stjerna (May 09 2019 at 10:05, on Zulip):

But yes, I guess the liveness is completely orthogonal

nikomatsakis (May 09 2019 at 10:05, on Zulip):

I see. Still, even there, if you have two variables X and Y, where X depends on Y but not vice versa, it's probably better to iterate and find the minimal Y first..? Anyway, I don't know much about that.

Albin Stjerna (May 09 2019 at 10:05, on Zulip):

And having it separate would make it easier to share code between the various implementations

nikomatsakis (May 09 2019 at 10:06, on Zulip):

Yes

nikomatsakis (May 09 2019 at 10:06, on Zulip):

I was thinking about that too

nikomatsakis (May 09 2019 at 10:06, on Zulip):

this would probably be a separate file that naive + opt etc can all share

Albin Stjerna (May 09 2019 at 10:06, on Zulip):

Not necessarily, because you could learn things about the relations between X and Y such that you can use a partial assignment of Y to immediately shrink the search space of X

Albin Stjerna (May 09 2019 at 10:07, on Zulip):

Which might prevent you from exploring large amounts of non-solutions, or worse solutions if you are optimising

Albin Stjerna (May 09 2019 at 10:07, on Zulip):

But as this is just fix-point set membership, it shouldn't matter. I'll get right to extracting the liveness logic

Albin Stjerna (May 09 2019 at 11:36, on Zulip):

Ok, I have something that seems to work now, but doesn't pass tests. I'll investigate closer later :)

Albin Stjerna (May 09 2019 at 11:37, on Zulip):

It also doesn't dump the tuples internal to the liveness calculations when dump is enabled

Albin Stjerna (May 09 2019 at 11:37, on Zulip):

And I'm not using the drop-liveness for anything yet, which makes me think that something's still missing

nikomatsakis (May 09 2019 at 12:05, on Zulip):

@Albin Stjerna you need a rule like

region_live_at(R, P) :- drop_live_at(V, P), var_drops_region(V, R)
nikomatsakis (May 09 2019 at 12:06, on Zulip):

could that be what's missing?

Albin Stjerna (May 09 2019 at 14:16, on Zulip):

That shouldn’t make any difference should it? Isn’t the over-estimates drop_live_at identical to var_live_at:

for ... {
    add_regions(local, local_decl.ty, &mut all_facts.var_uses_regions);
    add_regions(local, local_decl.ty, &mut all_facts.var_drops_regions); // FIXME(polonius#104) overapproximation
}
Albin Stjerna (May 09 2019 at 14:18, on Zulip):

Also I had the impression that the idea was to have a separate region logic for the dropped-live relation?

nikomatsakis (May 09 2019 at 16:19, on Zulip):

And I'm not using the drop-liveness for anything yet, which makes me think that something's still missing

you said you are not using drop-liveness for anything

nikomatsakis (May 09 2019 at 16:19, on Zulip):

I was saying where I expected the drop_live_at relation to be used

nikomatsakis (May 09 2019 at 16:20, on Zulip):

Note though that even though var_drops_region and var_uses_region will indeed be the same, the drop_live_at relation is different -- it is "generated" from each drop, but the use_live_at relation is generated from each use

Albin Stjerna (May 10 2019 at 08:25, on Zulip):

Note though that even though var_drops_region and var_uses_region will indeed be the same, the drop_live_at relation is different -- it is "generated" from each drop, but the use_live_at relation is generated from each use

Ah, of course. So a region can be made live by either being drop-live or use-live, and the lookup in var_drops_region is just that, a lookup, which is why it's safe to over-estimate it. For some reason I had it in my mind that we were going to separate drops and uses all the way, so that we calculated a region_live_at and a region_drop_live_at, which is why I was confused.

Albin Stjerna (May 10 2019 at 11:08, on Zulip):

Hm, ok all test inputs except the foo functions of vec-push-ref produces identical output to the one from rustc for region_live_at. They generate a relatively large set of relations (in the range of hundreds), so diffing them is not easy, but I'll look into it

Albin Stjerna (May 10 2019 at 13:39, on Zulip):

Ok, the difference seems to be the following:

"Start(bb0[11])": input region_live_at "\'_#14r" != calculated "\'_#15r", "\'_#14r"
"Mid(bb0[11])": input region_live_at "\'_#14r" != calculated "\'_#15r", "\'_#14r"
"Start(bb4[0])": input region_live_at "\'_#13r" != calculated "\'_#14r", "\'_#13r"
"Mid(bb4[0])": input region_live_at "\'_#13r" != calculated "\'_#14r", "\'_#13r"
"Start(bb13[3])": input region_live_at "\'_#21r" != calculated "\'_#13r", "\'_#21r"
"Mid(bb13[3])": input region_live_at "\'_#21r" != calculated "\'_#13r", "\'_#21r"

I'm trying to look into what is happening now

signed, albin --verbose

Albin Stjerna (May 10 2019 at 13:39, on Zulip):

Also, I'm experimenting with adding a flag to do comparisons for region_live_at to the polonius binary

Albin Stjerna (May 10 2019 at 14:09, on Zulip):
 RUST_LOG=info ./target/debug/polonius -a Naive -v -c inputs/vec-push-ref/nll-facts/foo1
 INFO 2019-05-10T14:09:11Z: polonius::cli: region facts compare mode enabled: ignoring provided region_live_at
 INFO 2019-05-10T14:09:11Z: polonius_engine::output::liveness: compute_liveness() completed: 130 tuples, 3.22311ms
 INFO 2019-05-10T14:09:11Z: polonius_engine::output::naive: errors is complete: 1 tuples, 5.607188ms
--------------------------------------------------
Directory: inputs/vec-push-ref/nll-facts/foo1
Time: 0.010s
MISMATCH at location "Start(bb16[2])": calculated has +["\'_#13r"], -[]
MISMATCH at location "Mid(bb4[0])": calculated has +["\'_#14r"], -[]
MISMATCH at location "Mid(bb14[0])": calculated has +["\'_#21r", "\'_#13r"], -[]
MISMATCH at location "Mid(bb0[11])": calculated has +["\'_#15r"], -[]
MISMATCH at location "Mid(bb7[0])": calculated has +["\'_#13r"], -[]
MISMATCH at location "Mid(bb16[2])": calculated has +["\'_#13r"], -[]
MISMATCH at location "Start(bb16[0])": calculated has +["\'_#13r"], -[]
MISMATCH at location "Start(bb16[3])": calculated has +["\'_#13r"], -[]
MISMATCH at location "Mid(bb15[0])": calculated has +["\'_#13r", "\'_#21r"], -[]
MISMATCH at location "Start(bb13[3])": calculated has +["\'_#13r"], -[]
MISMATCH at location "Start(bb2[0])": calculated has +["\'_#14r"], -[]
MISMATCH at location "Start(bb7[0])": calculated has +["\'_#13r"], -[]
MISMATCH at location "Mid(bb16[3])": calculated has +["\'_#13r"], -[]
MISMATCH at location "Mid(bb16[0])": calculated has +["\'_#13r"], -[]
MISMATCH at location "Mid(bb13[3])": calculated has +["\'_#13r"], -[]
MISMATCH at location "Start(bb16[1])": calculated has +["\'_#13r"], -[]
MISMATCH at location "Start(bb14[0])": calculated has +["\'_#21r", "\'_#13r"], -[]
MISMATCH at location "Mid(bb16[1])": calculated has +["\'_#13r"], -[]
MISMATCH at location "Start(bb4[0])": calculated has +["\'_#14r"], -[]
MISMATCH at location "Start(bb15[0])": calculated has +["\'_#13r", "\'_#21r"], -[]
MISMATCH at location "Mid(bb2[0])": calculated has +["\'_#14r"], -[]
MISMATCH at location "Start(bb0[11])": calculated has +["\'_#15r"], -[]
Albin Stjerna (May 10 2019 at 14:10, on Zulip):

Ok, so there's an over-approximation of the liveness of some regions

lqd (May 10 2019 at 22:41, on Zulip):

there's also a Compare algorithm for debugging/testing (initially to check for differences between Naive and DatafrogOpt), maybe it can be extended for these comparisons as well (although I believe the goal is to eventually remove the rustc-exported region_live_at facts, so it would maybe only be useful for a short while ?)

Albin Stjerna (May 13 2019 at 08:44, on Zulip):

@lqd Yes, I think this is all very temporary I guess.

Albin Stjerna (May 13 2019 at 08:47, on Zulip):

Update: I have updated the graphviz output to account for all the liveness facts and outputs (including both region liveness as input and output), and am going through the discrepancies. So far it has been the case every time that a live region in Polonius output that wasn't in the rustc input was derived from a drop-live variable that was correctly calculated as live from the input, so I think this is just an over-approximation

Albin Stjerna (May 13 2019 at 09:46, on Zulip):

Yes, I can confirm this is the case for all mismatches in foo1 of push-vec-ref

Albin Stjerna (May 13 2019 at 09:51, on Zulip):

...and some samples from foo2 indicates the same problem

nikomatsakis (May 13 2019 at 18:27, on Zulip):

@Albin Stjerna seems like trying to get drop-live correct is a good next step

Albin Stjerna (May 13 2019 at 18:28, on Zulip):

Albin Stjerna seems like trying to get drop-live correct is a good next step

Yup! Also, I tried removing the generation of region_live_at in rustc and it seems to cause a panic in the compiler due to an unwrap() in some compiletest cases, but I haven't managed to look into it.

Albin Stjerna (May 13 2019 at 18:29, on Zulip):

(I --bless:ed the output from before I removed the logic and used that for comparison)

nikomatsakis (May 13 2019 at 18:37, on Zulip):

OK, @Albin Stjerna --

nikomatsakis (May 13 2019 at 18:37, on Zulip):

I guess I'll try dumping some notes here?

nikomatsakis (May 13 2019 at 18:37, on Zulip):

Or would you rather I leave them in Github

Albin Stjerna (May 13 2019 at 18:38, on Zulip):

Whichever is the most convenient for you works for me!

nikomatsakis (May 13 2019 at 18:39, on Zulip):

First off, you understand why the "drop live" set of regions is a smaller set than the "use live"?

Albin Stjerna (May 13 2019 at 18:39, on Zulip):

@nikomatsakis Only very vaguely; I do realise that it might not always be necessary to actually have the referenced value in order to perform a drop, is that correct?

nikomatsakis (May 13 2019 at 18:40, on Zulip):

Yes. The most obvious example is &T -- dropping a reference is a no-op

nikomatsakis (May 13 2019 at 18:40, on Zulip):

that's kind of the "definition" of borrowing--

nikomatsakis (May 13 2019 at 18:40, on Zulip):

you don't free the memory when your borrowed reference goes out of scope ;)

Albin Stjerna (May 13 2019 at 18:41, on Zulip):

Ah, no that would be...bad

nikomatsakis (May 13 2019 at 18:41, on Zulip):

In any case, we have a bunch of existing infrastructure around this, mostly to support cyclic data structures in some cases.

nikomatsakis (May 13 2019 at 18:41, on Zulip):

To some extent that's historical (pre-exists the move to NLL, anyway)

nikomatsakis (May 13 2019 at 18:42, on Zulip):

the summary is roughly this:

nikomatsakis (May 13 2019 at 18:42, on Zulip):

so e.g. we know that a Vec<&T> does not, in its destructor, use those references

nikomatsakis (May 13 2019 at 18:42, on Zulip):

this permits you to create cycles in some cases

nikomatsakis (May 13 2019 at 18:43, on Zulip):

e.g., you can have an arena that allocates memory

nikomatsakis (May 13 2019 at 18:43, on Zulip):

and you can allocate a vector in that arena

nikomatsakis (May 13 2019 at 18:43, on Zulip):

which contains references to other things allocated from that arena

Albin Stjerna (May 13 2019 at 18:43, on Zulip):

Ah, and the vector of course doesn't bother managing the actual data it saves references to, makes sense

nikomatsakis (May 13 2019 at 18:43, on Zulip):

this is a "cycle" of sorts because we cannot really tell whether the vector or the things it is referencing will be freed first

nikomatsakis (May 13 2019 at 18:43, on Zulip):

(but we don't care)

nikomatsakis (May 13 2019 at 18:43, on Zulip):

(since we know that the vector won't be accessing those things)

nikomatsakis (May 13 2019 at 18:44, on Zulip):

so that's kind of the "background"

nikomatsakis (May 13 2019 at 18:44, on Zulip):

what we basically want to do is to leverage this already existing code

nikomatsakis (May 13 2019 at 18:44, on Zulip):

which essentially, given a type T, will figure out what set of other types have to be live when T is dropped

Albin Stjerna (May 13 2019 at 18:45, on Zulip):

Hm, this sounds like it might translate to some sort of transitive logic

nikomatsakis (May 13 2019 at 18:45, on Zulip):

that is what this compute_drop_data function is doing

nikomatsakis (May 13 2019 at 18:45, on Zulip):

well, polonius doesn't really have to do this bit

Albin Stjerna (May 13 2019 at 18:45, on Zulip):

I figured it sounded very much like something that must happen in rustc

Albin Stjerna (May 13 2019 at 18:46, on Zulip):

Or at least should

nikomatsakis (May 13 2019 at 18:46, on Zulip):

YEah

nikomatsakis (May 13 2019 at 18:46, on Zulip):

I guess the real function that you want to start from is add_drop_live_facts_for

Albin Stjerna (May 13 2019 at 18:47, on Zulip):

But can this just be done in rustc, i.e. by not emitting drop-use facts in the right places, or is there more to this?

nikomatsakis (May 13 2019 at 18:48, on Zulip):

yeah, it can just be done in rustc, the idea is that we will emit the var_drops_region facts

nikomatsakis (May 13 2019 at 18:48, on Zulip):

to account for this

nikomatsakis (May 13 2019 at 18:48, on Zulip):

i.e., if you have a variable x with a type like Vec<&'a u32>

nikomatsakis (May 13 2019 at 18:48, on Zulip):

you would have a var_uses_region(x, 'a) fact

nikomatsakis (May 13 2019 at 18:48, on Zulip):

but you would not have a var_drops_region(x, 'a) fact

nikomatsakis (May 13 2019 at 18:49, on Zulip):

thus, if x is use-live at some point, the region 'a must be live

nikomatsakis (May 13 2019 at 18:49, on Zulip):

but if x is only drop-live at some point, region 'a may not be live

Albin Stjerna (May 13 2019 at 18:49, on Zulip):

but it won't drop the u32 when it's deallocated

Albin Stjerna (May 13 2019 at 18:50, on Zulip):

Hm, but isn't that the case for all references? I thought the owning data structure was responsible for the deallocation?

Albin Stjerna (May 13 2019 at 18:51, on Zulip):

err, ok but of course not the liveness, ok I get it

Albin Stjerna (May 13 2019 at 18:51, on Zulip):

(I think)

nikomatsakis (May 13 2019 at 18:51, on Zulip):

it's not really about the u32

nikomatsakis (May 13 2019 at 18:51, on Zulip):

it's about the reference &'a u32 -- is somebody going to potentially dereference it in the future?

nikomatsakis (May 13 2019 at 18:52, on Zulip):

if the vector is "use-live", then we are accessing it in some ordinary way

nikomatsakis (May 13 2019 at 18:52, on Zulip):

e.g., invoking a methed like pop or whatever

nikomatsakis (May 13 2019 at 18:52, on Zulip):

so we have to assume that whatever references are in it may get used

nikomatsakis (May 13 2019 at 18:52, on Zulip):

but if we know the vector is just going to get dropped, we know they won't (because of this pre-existing drop analysis)

Albin Stjerna (May 13 2019 at 18:52, on Zulip):

oh, ok, so the point is that the deallocator for the vec might need what the reference is referencing?

Albin Stjerna (May 13 2019 at 18:53, on Zulip):

or in this case, that it won't

nikomatsakis (May 13 2019 at 18:53, on Zulip):

right. it doesn't, but other drop code could

Albin Stjerna (May 13 2019 at 18:53, on Zulip):

ok, I see, and if something implements Drop then it could do anything and all bets are off

nikomatsakis (May 13 2019 at 18:53, on Zulip):

e.g., maybe you have

struct Foo<'a> {
    data: &'a u32
}

impl<'a> Drop for Foo<'a> {
  fn drop(&mut self) {
    /* we assume this could do something like `*self.data;` */
  }
}
nikomatsakis (May 13 2019 at 18:53, on Zulip):

in that case, if you have y: Foo<'a> you would have var_drops_region(y, 'a)

nikomatsakis (May 13 2019 at 18:54, on Zulip):

(though really we're more conservative than that -- i.e., as you say, we don't examine the body of the drop closely)

nikomatsakis (May 13 2019 at 18:55, on Zulip):

(where is your branch again, @Albin Stjerna ?)

nikomatsakis (May 13 2019 at 18:56, on Zulip):

Anyway, the relevant bit of the older code is this for loop here

nikomatsakis (May 13 2019 at 18:56, on Zulip):
 // All things in the `outlives` array may be touched by
        // the destructor and must be live at this point.
        for &kind in &drop_data.dropck_result.kinds {
            Self::make_all_regions_live(
                self.elements,
                &mut self.typeck,
                kind,
                live_at,
                self.location_table,
            );
}
nikomatsakis (May 13 2019 at 18:56, on Zulip):

this "drop data" is cached from the function compute_drop_data here

Albin Stjerna (May 13 2019 at 18:56, on Zulip):

For rustc, it's here

nikomatsakis (May 13 2019 at 18:57, on Zulip):

basically the drop_data.dropck_result.kinds is a list of types/regions things that the Drop code "may" access

nikomatsakis (May 13 2019 at 18:57, on Zulip):

so what we do is we iterate over each of those and make sure they are live

nikomatsakis (May 13 2019 at 18:57, on Zulip):

I imagine your code would want to do something similar, but emitted var_drops_region facts for each thing contained within

Albin Stjerna (May 13 2019 at 18:58, on Zulip):

ok, but doesn't that mean we will emot more var_drops_regionand not fewer?

nikomatsakis (May 13 2019 at 18:58, on Zulip):

( now, before that for loop, we do some other stuff -- this actually isn't, I don't think, relevant to what we are trying to do right now. It's a separate consideration. )

Albin Stjerna (May 13 2019 at 18:58, on Zulip):

Or do you mean we do this in stead of the previous logic?

nikomatsakis (May 13 2019 at 18:58, on Zulip):

Or do you mean we do this in stead of the previous logic?

that

Albin Stjerna (May 13 2019 at 18:59, on Zulip):

Ok!

nikomatsakis (May 13 2019 at 19:00, on Zulip):

I'm not sure how difficult it will be to re-use that existing stuff

Albin Stjerna (May 13 2019 at 19:00, on Zulip):

So far my strategy of trying something and then fixing the compiler errors has worked out quite well

nikomatsakis (May 13 2019 at 19:01, on Zulip):

from a quick glance I suspect you can just kind of "copy-paste" what you need ..

nikomatsakis (May 13 2019 at 19:01, on Zulip):

ok, so, I think that's it for the var-drop-region business

nikomatsakis (May 13 2019 at 19:01, on Zulip):

I thought there was more to it, but I realize now I was misremembering

Albin Stjerna (May 13 2019 at 19:01, on Zulip):

Well, let's start there tomorrow and see what breaks

nikomatsakis (May 13 2019 at 19:02, on Zulip):

sounds good

Albin Stjerna (May 13 2019 at 19:02, on Zulip):

I mean, spending a day looking at the graphviz outputs and tracing liveness etc was an excellent exercise

Albin Stjerna (May 13 2019 at 19:03, on Zulip):

By the way, these inputs, are they from NLL's borrowcheck?

Albin Stjerna (May 13 2019 at 19:04, on Zulip):

I'm trying to get a feel for which parts would need to be re-engineered when it's possible to run Polonius without NLL (and also just how things fit together in general), and so far my hunch is "virtually everything"

nikomatsakis (May 13 2019 at 21:06, on Zulip):

By the way, these inputs, are they from NLL's borrowcheck?

the NLL "borrow check" includes a "type check"

nikomatsakis (May 13 2019 at 21:06, on Zulip):

the liveness constraint code you are editing is part of that

nikomatsakis (May 13 2019 at 21:06, on Zulip):

this type check generates the base constraints (the subset relations, in polonius speak)

nikomatsakis (May 13 2019 at 21:07, on Zulip):

actually most of the NLL code is still needed

nikomatsakis (May 13 2019 at 21:07, on Zulip):

it's only a relatively small part -- the "region inference" -- that goes away completely

Albin Stjerna (May 14 2019 at 13:03, on Zulip):

Ah, ok I see

Albin Stjerna (May 14 2019 at 15:51, on Zulip):

@nikomatsakis I have modified rustc to output var_drops_region according to your instructions; it compiles and the resulting liveness analysis is the same for all the inputs except clap-rs, which I haven't tested. But none of those actually produce any tuples for var_drops_region, so I have no positive examples

Albin Stjerna (May 14 2019 at 15:51, on Zulip):

(I'm going away for a while now, but I'll be back for the meeting)

Albin Stjerna (May 14 2019 at 16:33, on Zulip):

...nope, clap-rs doesn't either. Apparently, drop-liveness isn't that common

nikomatsakis (May 14 2019 at 17:37, on Zulip):

nikomatsakis I have modified rustc to output var_drops_region according to your instructions; it compiles and the resulting liveness analysis is the same for all the inputs except clap-rs, which I haven't tested. But none of those actually produce any tuples for var_drops_region, so I have no positive examples

lol

nikomatsakis (May 14 2019 at 17:38, on Zulip):

@Albin Stjerna I can point you at some test cases

Albin Stjerna (May 14 2019 at 18:14, on Zulip):

Albin Stjerna I can point you at some test cases

That would be great! Thanks :)

Albin Stjerna (May 14 2019 at 18:22, on Zulip):

@nikomatsakis ok it seems to work for the example you gave above too, which does generate one var_drops_region fact

nikomatsakis (May 14 2019 at 18:23, on Zulip):

@Albin Stjerna try e.g. drop-may-dangle

nikomatsakis (May 14 2019 at 18:23, on Zulip):

and drop-no-may-dangle

nikomatsakis (May 14 2019 at 18:23, on Zulip):

basically any test in src/test/ui/nll/ with drop in the name :)

nikomatsakis (May 14 2019 at 18:24, on Zulip):

e.g. enum-drop-access

Albin Stjerna (May 15 2019 at 16:20, on Zulip):

I have added a set of tests that compares the region_live_at output to the input from rustc, and it seems that there is a slight discrepancy in some tests. I'll look into it later, but it seems that a region is live in Polonius input when it isn't in Rustc

Albin Stjerna (May 16 2019 at 08:20, on Zulip):

OK, I have looked into it, and it looks sort of reasonable to me. What seems to happen is that variable _2 is drop-used at Mid(bb16[1]), which makes it drop-live, and more importantly its region _#r13. However, that region isn't reported as live by rustc. It's all running this code:

struct Wrap<'p> { p: &'p mut i32 }

impl<'p> Drop for Wrap<'p> {
    fn drop(&mut self) {
        *self.p += 1;
    }
}

struct Foo<'p> { a: String, b: Wrap<'p> }

fn main() {
    let mut x = 0;
    let wrap = Wrap { p: &mut x };
    let s = String::from("str");
    let foo = Foo { a: s, b: wrap };
    std::mem::drop(foo.a);
    std::mem::drop(foo.b);
    x = 1; //~ ERROR cannot assign to `x` because it is borrowed [E0506]
    // FIXME ^ This currently errors and it should not.
}

The MIR of bb16and bb17 looks like this:

bb16: {
     StorageDead(_5);                 // bb16[0]: scope 4 at maybe-initialized-drop-with-uninitialized-fragments.rs:24:1: 24:2
     drop(_2) -> [return: bb17, unwind: bb1]; // bb16[1]: scope 2 at maybe-initialized-drop-with-uninitialized-fragments.rs:24:1: 24:2
 }

 bb17: {
     StorageDead(_2);                 // bb17[0]: scope 2 at maybe-initialized-drop-with-uninitialized-fragments.rs:24:1: 24:2
     StorageDead(_1);                 // bb17[1]: scope 0 at maybe-initialized-drop-with-uninitialized-fragments.rs:24:1: 24:2
     return;                          // bb17[2]: scope 0 at maybe-initialized-drop-with-uninitialized-fragments.rs:24:2: 24:2
 }

PDF of the graph with all facts and outputs from Polonius

Albin Stjerna (May 16 2019 at 08:23, on Zulip):

Is there any further optimisation in rustc that I'm not thinking of, or is this a bug?

nikomatsakis (May 16 2019 at 18:05, on Zulip):

Ah, @Albin Stjerna, right -- I knew there was a wrinkle

nikomatsakis (May 16 2019 at 18:06, on Zulip):

I somehow overlooked it though

nikomatsakis (May 16 2019 at 18:06, on Zulip):

So rustc is correct, but our analysis is a bit too simplistic as we've done it so far

nikomatsakis (May 16 2019 at 18:06, on Zulip):

This actually touches on the next step -- computing moves and initialization

nikomatsakis (May 16 2019 at 18:06, on Zulip):

I have to look at how this works in rustc, but the idea is that we know that foo.b has been moved

nikomatsakis (May 16 2019 at 18:07, on Zulip):

To be clear though, rustc is not giving an error here? But polonius is?

nikomatsakis (May 16 2019 at 18:07, on Zulip):

Because the FIXME etc above suggests rustc does

Albin Stjerna (May 16 2019 at 20:01, on Zulip):

@nikomatsakis

To be clear though, rustc is not giving an error here? But polonius is?

No they both give errors, but the difference is that Polonius gives a different region_live_at than rustc, with a region being live in more places than in the rustc output

Albin Stjerna (May 16 2019 at 20:04, on Zulip):

This happens for all the similar examples, so I think it's the same problem

nikomatsakis (May 16 2019 at 22:43, on Zulip):

@Albin Stjerna So, I think the crucial part of the code in rustc is this line here:

 if self.cx.initialized_at_terminator(location.block, mpi) {
    ...
}
nikomatsakis (May 16 2019 at 22:43, on Zulip):

the idea is that we only consider a Drop(x) instruction in the MIR to be "live" if x is initialized (i.e., has a value that has not been moved out in the interim)

nikomatsakis (May 16 2019 at 22:44, on Zulip):

this is because, in Rust, we have stuff like this:

{
    let x = vec![];
    send(x); // moves `x` away
} // `x` is not dropped here
nikomatsakis (May 16 2019 at 22:44, on Zulip):

in the MIR, we generate a Drop(x) either way to start, but then we later (in a phase called "drop elaboration") remove it

nikomatsakis (May 16 2019 at 22:44, on Zulip):

at the time when borrowck runs, the drop is still there, but it is considered to only be "active" if x is (maybe) initialized

nikomatsakis (May 16 2019 at 22:45, on Zulip):

note that in some cases this is not known until runtime, e.g.

{
  let x = vec![];
  if something { send(x); }
} // might or might not drop `x`, depending whether `something` was true
nikomatsakis (May 16 2019 at 22:45, on Zulip):

Anyway, as a temporary fix, we could add a similar call to the fact generation, basically only generating a DropUse(X, P) fact if X is initialized at P (using that same function to test)

nikomatsakis (May 16 2019 at 22:45, on Zulip):

But longer term I want polonius to be computing the initialization facts as well

Albin Stjerna (May 17 2019 at 06:51, on Zulip):

@nikomatsakis Hm, I'll have to chew on that one for a while, because it's not entirely trivial to add that logic to the current drop-live generation logic. Maybe it would be simpler to just piggy-back off of trace() in stead of trying to walk the MIR with a Visitor?

nikomatsakis (May 17 2019 at 14:03, on Zulip):

@Albin Stjerna quite likely yes. it would presumably only be temporary until we move that other logic into polonius (whihc I imagine we'll do next)

Albin Stjerna (May 20 2019 at 15:42, on Zulip):

@nikomatsakis

Albin Stjerna quite likely yes. it would presumably only be temporary until we move that other logic into polonius (whihc I imagine we'll do next)

I did as you suggested, but the output is still not the same. It still has region 13 live when it shouldn't be, but several other var_drop_used disappears. It seems that #13 is made live from the presence of "_2" "Mid(bb3[0])" in var_drop_used. However, when I simply remove that from the facts file, I get a mismatch saying that #13 should be live at "Start(bb3[0]) and Mid, but isn't. Honestly, I'm not sure what's going on. At bb3, _8, which is where _2 is moved, is already drop()ped? I don't see how rustc should output a fact then.

Just to be sure, I am recompiling rustc, but I'm sending this now in case you see something I don't.

Albin Stjerna (May 21 2019 at 07:35, on Zulip):

I did it like this though, and it seems to produce equivalent but less output:

if self.cx.initialized_at_terminator(location.block, mpi) {
                if let Some(facts) = self.cx.typeck.borrowck_context.as_mut().unwrap().all_facts {
                    facts
                        .var_drop_used
                        .push((local, self.cx.location_table.mid_index(location)));
                }
Albin Stjerna (May 21 2019 at 07:37, on Zulip):

Gotta love those five levels of wrapping data structures

Albin Stjerna (May 21 2019 at 13:32, on Zulip):

I have another question while I'm at it: I need to extend polonius-parser's grammar with var_uses_region and var_drops_region, but I'm not sure how to do that in the lalrpop syntax

Albin Stjerna (May 21 2019 at 13:33, on Zulip):

I guess it should work kind of like universal_regions, in that it's one global declaration?

Albin Stjerna (May 21 2019 at 13:42, on Zulip):

I figure I should do something like this to begin with:

VarRegionMappings = Comma<VarRegionMapping>;
VarRegionMapping: (String, String) = {
                 "(" <Variable> "," <Region> ")" => (<>),
};

VarUsesRegion = "var_uses_region" "{" <VarRegionMappings> "}";
VarDropsRegion = "var_drops_region" "{" <VarRegionMappings> "}";

But how do I add this in an order-independent way to Input?

lqd (May 21 2019 at 15:35, on Zulip):

don't add it in an order-independent way then :) just putting these after the universal_regions {} will probably be simpler ?

Albin Stjerna (May 21 2019 at 16:05, on Zulip):

don't add it in an order-independent way then :) just putting these after the universal_regions {} will probably be simpler ?

Could you expand on that? I mean, I can do something like:

pub Input: Input = {
    Comment* <universal_regions:UniversalRegions> Comment* <var_uses_region:VarUsesRegion> Comment* <var_drops_region:VarDropsRegion> Comment* <blocks:BlockDefn*> => Input { <> }
};

but that would both require them to be present and in that order.

Albin Stjerna (May 21 2019 at 16:17, on Zulip):

Ok, so I made them optional by adding ?:s, and then creating a constructor for Input like this:

impl Input {
    pub fn new(
        universal_regions: Vec<String>,
        var_uses_region: Option<Vec<(String, String)>>,
        var_drops_region: Option<Vec<(String, String)>>,
        blocks: Vec<Block>,
    ) -> Input {
        Input {
            universal_regions,
            var_uses_region: var_uses_region.unwrap_or(Vec::default()),
            var_drops_region: var_drops_region.unwrap_or(Vec::default()),
            blocks,
        }
    }
}

and using

pub Input: Input = {
    Comment* <universal_regions:UniversalRegions> Comment* <var_uses_region:VarUsesRegion?> Comment* <var_drops_region:VarDropsRegion?> Comment* <blocks:BlockDefn*> => Input::new(<>)
};

in the grammar.

It feels very clunky, is there a better way?

lqd (May 21 2019 at 16:32, on Zulip):

there probably is a better way, using lalrpop better, but even if the order is fixed like so, it's at least acceptable for now ? (and at least unblocks you until then, and at least these test programs have an always consistent syntax albeit inflexible)

lqd (May 21 2019 at 16:34, on Zulip):

there can probably exist a larlpop macro/combinator where we could pass these parts and it would combine them, with optional comments in-between (possibly in an order-independent way) as the program header before the cfg blocks

lqd (May 21 2019 at 16:35, on Zulip):

(similarly to the Comma combinator, at least if one ignores the order-independence part -- update: hmm maybe not)

lqd (May 21 2019 at 16:44, on Zulip):

a lot of the more flexible uses of lalrpop seem to involve a custom lexer eg https://github.com/lalrpop/lalrpop/issues/14 -- then again, I'm not super familiar with it

lqd (May 21 2019 at 17:32, on Zulip):

for an example of order-independence and optional relations, something like this could be done: 1) introduce an enum to unify the relations universal_regions { ... }, var_uses_region { ... }, and var_drops_region { ... } 2) make Input take a list of these "directives" instead of the universal_regions Vec (and either only have a list of directives in Input, or distribute them into your 3 inner Vecs in a new Input::new fn)

lqd (May 21 2019 at 17:32, on Zulip):

like

pub Input: Input = {
    <directives:ProgramDirective*> Comment* <blocks:BlockDefn*> => Input { <> }
};

ProgramDirective : ProgramDirective = {
    Comment* <UniversalRegions> => ProgramDirective::UniversalRegions(<>),
    Comment* <VarUsesRegion> => ProgramDirective::VarUsesRegion(<>),
    Comment* <VarDropsRegion> => ProgramDirective::VarDropsRegion(<>),
}
lqd (May 21 2019 at 17:35, on Zulip):

3) if needed, add some validation to the root parsing fn to ensure these program directives are valid, for example, say, "there must be a universal_regions directive even if it's empty" and the likes which could apply to the other 2 relations

Albin Stjerna (May 22 2019 at 07:13, on Zulip):

Hm, that sounds like a perfectly OK approach too, but I'm not sure it's sufficiently better than what I have to warrant the even more complicated implementation, so I think I'll let it be. At least now all the tests are passing, which is mostly what I was aiming for. :)

Albin Stjerna (May 22 2019 at 07:14, on Zulip):

Ugh, I swear this worked yesterday, but today I'm getting shift-reduce conflicts???

Albin Stjerna (May 22 2019 at 07:21, on Zulip):

Ah, Ok it's the interspersed Comment* between the statements

Albin Stjerna (May 22 2019 at 07:21, on Zulip):

I can...live without that

Albin Stjerna (May 26 2019 at 12:10, on Zulip):

@nikomatsakis Ok I have done a more thorough investigation of what's different between the output of rustc and polonius for liveness (after tweaking the output of which variables get drop-use-emitted when. First, what's going on in the test is that a moved value is explicitly dropped:

    let wrap = Wrap { p: &mut x }; // Wrap impls Drop
    let s = String::from("str");
    let foo = Foo { a: s, b: wrap }; // Foo does not
    std::mem::drop(foo.a);
    std::mem::drop(foo.b);

The interesting MIR blocks are, I think, these:

bb2: {
        FakeRead(ForLet, _5);            // bb2[0]: scope 4 at maybe-initialized-drop-with-uninitialized-fragments.rs:18:9: 18:10
        StorageLive(_6);                 // bb2[1]: scope 6 at maybe-initialized-drop-with-uninitialized-fragments.rs:19:9: 19:12
        StorageLive(_7);                 // bb2[2]: scope 6 at maybe-initialized-drop-with-uninitialized-fragments.rs:19:24: 19:25
        _7 = move _5;                    // bb2[3]: scope 6 at maybe-initialized-drop-with-uninitialized-fragments.rs:19:24: 19:25
        StorageLive(_8);                 // bb2[4]: scope 6 at maybe-initialized-drop-with-uninitialized-fragments.rs:19:30: 19:34
        _8 = move _2;                    // bb2[5]: scope 6 at maybe-initialized-drop-with-uninitialized-fragments.rs:19:30: 19:34
        _6 = Foo::<'_> { a: move _7, b: move _8 }; // bb2[6]: scope 6 at maybe-initialized-drop-with-uninitialized-fragments.rs:19:15: 19:36
        drop(_8) -> [return: bb6, unwind: bb5]; // bb2[7]: scope 6 at maybe-initialized-drop-with-uninitialized-fragments.rs:19:35: 19:36
    }


    bb3 (cleanup): {
        drop(_2) -> bb1;                 // bb3[0]: scope 2 at maybe-initialized-drop-with-uninitialized-fragments.rs:24:1: 24:2
    }

What rustc emits in terms of region liveness for _2/_8's region is that it gets drop-used at the move, Mid(bb2[5]). It's also live in the single line of the bb3 block, but this doesn't propagate to the other blocks that precede it as it should according to the rules.

In other words, I think there is something more advanced related to initialisation tracking going on here, and it's perhaps best to just let it be and fix initialisation tracking in stead? It's worth noting that leaving it as it is doesn't break any compile-tests.

Albin Stjerna (May 26 2019 at 12:14, on Zulip):

I should perhaps also mention that _8, which is the moved copy of _2gets a new region. This is then solved by a mutual outlives relationship that of course means is equal, but we don't know that when generating region_live_at.

Albin Stjerna (May 26 2019 at 12:16, on Zulip):

I'm not sure if there's a case to be made for inital de-duplication of these equivalent regions or if you're already doing that

Albin Stjerna (May 26 2019 at 14:23, on Zulip):

@nikomatsakis I also have a theoretical question. In the literature, there is a lot of talk about "context-sensitive analysis", but a "context" seems very broad. Given that Polonius takes literal lines of intermediate representation into account, does that make the analysis context-sensitive?

Albin Stjerna (May 26 2019 at 15:36, on Zulip):

@nikomatsakis Also, I am a bit unclear about what precisely happens when we generate var_uses_regionand var_drops_region. What precisely does it mean that a given region occurs in a referece's type? How does this interact with, say, indirect/nested references?

Matthew Jasper (May 26 2019 at 15:59, on Zulip):

A region occurs in a type if it

Matthew Jasper (May 26 2019 at 16:05, on Zulip):

In other words, I think there is something more advanced related to initialisation tracking going on here, and it's perhaps best to just let it be and fix initialisation tracking in stead? It's worth noting that leaving it as it is doesn't break any compile-tests

Liveness in NLL does use initialization results to reduce the amount of time that variable are live. I think that this can just be left for now.
We should have a test for this in the rustc test suite, but it's possible that polonius circumvents the test.

Matthew Jasper (May 26 2019 at 16:06, on Zulip):

Are you generating var_uses_region and var_drops_region facts at the moment?

Albin Stjerna (May 27 2019 at 10:50, on Zulip):

Liveness in NLL does use initialization results to reduce the amount of time that variable are live. I think that this can just be left for now.
We should have a test for this in the rustc test suite, but it's possible that polonius circumvents the test

I'm talking about the Polonius-internal tests that only verify that Polonius region_live_atis the same as the one given by Rustc (which it currently isn't always). I wrote those. :)

Albin Stjerna (May 27 2019 at 10:53, on Zulip):

@Matthew Jasper

Are you generating var_uses_region and var_drops_region facts at the moment?

Yes! In trace::add_drop_live_facts_for, I have:

       // All things in the `outlives` array may be touched by
        // the destructor and must be live at this point.
        for &kind in &drop_data.dropck_result.kinds {
            Self::make_all_regions_live(self.elements, &mut self.typeck, kind, live_at);

            polonius::add_var_drops_regions(&mut self.typeck, dropped_local, &kind);
        }

and then:

// For every potentially drop()-touched region `region` in `local`'s type
// (`kind`), emit a Polonius `var_drops_region(local, region)` fact.
pub(super) fn add_var_drops_regions(
    typeck: &mut TypeChecker<'_, '_, 'tcx>,
    local: Local,
    kind: &Kind<'tcx>,
) {
    let tcx = typeck.tcx();

    tcx.for_each_free_region(kind, |drop_live_region| {
        let region_vid = typeck.borrowck_context.universal_regions.to_region_vid(drop_live_region);
        if let Some(facts) = typeck.borrowck_context.all_facts.as_mut() {
            facts.var_drops_region.push((local, region_vid));
        };
    });
}
Albin Stjerna (May 27 2019 at 10:59, on Zulip):

A region occurs in a type if it

OK, so the latter one takes care of the reference to reference situation, where we might have a reference to a reference to a reference with region 'x. Got it!

nikomatsakis (May 28 2019 at 13:46, on Zulip):

@Albin Stjerna I'm not entirely clear on which questions remain unresolved here --

nikomatsakis (May 28 2019 at 13:46, on Zulip):

as @Matthew Jasper said, we do take initialization into account, I think you and I discussed this earlier. When it comes to polonius, I would probably prefer to wait on that part because computing "initialization" is in fact the next thing we should be doing (perhaps starting this week?) and so it would make sense to incorporate that afterwards.

nikomatsakis (May 28 2019 at 13:47, on Zulip):

Probably you and me should schedule some kind of (recorded) zoom session to talk over the next steps

nikomatsakis (May 28 2019 at 13:48, on Zulip):

(re: parsing, the way to do this would be to accept the directives in any order, and then enforce any "occurs at most once" restrictions in the action code, I suppose)

Albin Stjerna (May 28 2019 at 15:05, on Zulip):

(re: parsing, the way to do this would be to accept the directives in any order, and then enforce any "occurs at most once" restrictions in the action code, I suppose)

I pushed a stupid version that enforces ordering for now, but that's probably a better long-term solution.

Albin Stjerna (May 28 2019 at 15:10, on Zulip):

Albin Stjerna I'm not entirely clear on which questions remain unresolved here --

Errm, hardly surprising. I think you can treat most of this as resolved for now. What seems to have happened is that I figured that for some edge cases we will probably need proper initialisation analysis in Polonius to get the same region_live_at output as rustc.

I think we should be starting with initialisation this week, and I guess we should start merging the liveness logic as-is?

nikomatsakis (May 28 2019 at 15:14, on Zulip):

sounds correct to me

nikomatsakis (May 28 2019 at 15:14, on Zulip):

I guess we need to merge the polonius code first?

nikomatsakis (May 28 2019 at 15:15, on Zulip):

well, we can sync up later today I guess and figure this out, I'm kind of busy until then

Albin Stjerna (May 28 2019 at 17:25, on Zulip):

@nikomatsakis Sounds good; I was busy too

Albin Stjerna (May 28 2019 at 17:26, on Zulip):

Also, this is what the liveness graph output I'm working on looks like now

Albin Stjerna (May 28 2019 at 17:27, on Zulip):

I'm planning on giving the liveness edges different colours for each variable, but err this is a work-in-progress

Albin Stjerna (May 28 2019 at 17:27, on Zulip):

Also, reduction of non-interesting nodes is sorely needed as you can see

Albin Stjerna (May 30 2019 at 10:45, on Zulip):

now I have the facts as well

Albin Stjerna (May 31 2019 at 09:06, on Zulip):

Ok; now the full graph reduction thing works, which means that it joins nodes in the graph where liveness doesn't change and control flow just moves forward: example and other example

Albin Stjerna (May 31 2019 at 09:07, on Zulip):

I had to pull in the petgraph library for DFS though, and do a few...less than beautiful things

Albin Stjerna (May 31 2019 at 14:41, on Zulip):

Update: now with rainbows.

Albin Stjerna (May 31 2019 at 14:43, on Zulip):

Also, emoji to save space. :drop: = var_drop_used, :fixing: = var_used, :skull_and_crossbones:️ = var_defined

Albin Stjerna (Jun 01 2019 at 08:15, on Zulip):

@WG-polonius Would it be OK to pull in petgraph as a dependency in polonius-the-binary (not polonius-engine)?

lqd (Jun 01 2019 at 08:20, on Zulip):

for the binary I myself think it’d be ok

nikomatsakis (Jun 01 2019 at 11:31, on Zulip):

for the binary I myself think it’d be ok

yes, for polonius it's ok, not polonius-engine

Albin Stjerna (Jun 01 2019 at 11:51, on Zulip):

Good, that's what I thought

Albin Stjerna (Jun 01 2019 at 11:51, on Zulip):

Then I'll make a separate PR of those after the liveness PR is merged

Albin Stjerna (Jun 01 2019 at 11:52, on Zulip):

Because I genuinely think it might be useful for debugging

nikomatsakis (Jun 03 2019 at 21:16, on Zulip):

@Albin Stjerna is https://github.com/rust-lang/polonius/pull/105/ ready for review / ready to land?

nikomatsakis (Jun 03 2019 at 21:17, on Zulip):

hmm that's a painful one to review :)

lqd (Jun 03 2019 at 21:19, on Zulip):

is the +1,347,366 −1,345,277 expected btw ?

lqd (Jun 03 2019 at 21:20, on Zulip):

(seems a bit high, even if modifying many input facts)

nikomatsakis (Jun 03 2019 at 21:24, on Zulip):

not sure :)

lqd (Jun 04 2019 at 09:09, on Zulip):

yeah, seems like all the facts were regenerated, I'm surprised they are so different to cause such a big diff — but I guess it makes sense since the ones in the repo are fairly old (and the "NLL MIR" has changed since they were initially generated) but I hope they're not regressing. For example, as clap is not unit tested, it would be nice to be sure the new facts produce the same errors as the old ones. It's a bit of a chicken and egg situation, as rustc and polonius bootstrap each other here, creating big diffs for the facts especially over a longer period of time. I also wonder how much of rustc's tests should we integrate in Polonius in general (vs rustc's polonius compare mode), and under the form of facts in particular (vs, say, generating them just in time in the tests, in memory or .gitignored).

Albin Stjerna (Jun 04 2019 at 09:46, on Zulip):

@nikomatsakis It should be ready to merge, except for an entry in the Changelog, but I plan on writing that from the circulation desk later today. I'll let you know when it's done!

Albin Stjerna (Jun 04 2019 at 09:46, on Zulip):

is the +1,347,366 −1,345,277 expected btw ?

@lqd I'm also adding new input facts, but yes, in hindsight this maybe wasn't a great idea.

Albin Stjerna (Jun 04 2019 at 09:49, on Zulip):

yeah, seems like all the facts were regenerated, I'm surprised they are so different to cause such a big diff — but I guess it makes sense since the ones in the repo are fairly old (and the "NLL MIR" has changed since they were initially generated) but I hope they're not regressing.

Almost all of them, some of the facts caused hard-coded point IDs to change in the MIR and I kept the old facts. This also means that the var_used_atand all the other new facts are actually inconsistent with the other facts in the same folder, but it's fine as they are ignored and the liveness analysis bypassed when there is an existing region_live_atfact set.

Albin Stjerna (Jun 04 2019 at 09:50, on Zulip):

But yes, in the future I think I should keep the PRs smaller and the merge windows shorter than this, in particular for rustc

lqd (Jun 04 2019 at 09:51, on Zulip):

chicken and egg, sometimes you need new facts for tests, or to update an existing one such as clap (which is likely be the vast majority of the change here) it’s a bit unfortunate but I’m not sure it would have been easy/possible to avoid for you in any case

lqd (Jun 04 2019 at 09:53, on Zulip):

could be an interesting topic for tonight ?

Albin Stjerna (Jun 04 2019 at 09:55, on Zulip):

Yes!

Albin Stjerna (Jun 04 2019 at 11:40, on Zulip):

@nikomatsakis Ok DONE, I think. I just updated RELEASES on my branch

nikomatsakis (Jun 04 2019 at 15:51, on Zulip):

@Albin Stjerna thanks

Albin Stjerna (Jun 04 2019 at 20:21, on Zulip):

@nikomatsakis Should I do something similar to my commits to Rust?

Albin Stjerna (Jun 04 2019 at 20:22, on Zulip):

It's less messy than Polonius, but I didn't go out of my way to curate the history as the commits seem to get squashed on merge anyways?

nikomatsakis (Jun 04 2019 at 21:08, on Zulip):

yes please

nikomatsakis (Jun 04 2019 at 21:08, on Zulip):

@Albin Stjerna I haven't actually looked but a messy history can be a real pain

nikomatsakis (Jun 04 2019 at 21:08, on Zulip):

of course if I just view "all diffs" it's fine

nikomatsakis (Jun 04 2019 at 21:08, on Zulip):

it mostly matters if the full set of diffs is large

Albin Stjerna (Jun 05 2019 at 07:04, on Zulip):

@nikomatsakis I take that back, I think the commit log looks all right except that I still regret that rustfmt a bit

lqd (Jun 05 2019 at 07:58, on Zulip):

polonius#105 seems to be missing the liveness.rs file

nikomatsakis (Jun 05 2019 at 13:41, on Zulip):

@Albin Stjerna :point_up:

Albin Stjerna (Jun 05 2019 at 14:21, on Zulip):

Err wooops

Albin Stjerna (Jun 05 2019 at 14:23, on Zulip):

@nikomatsakis Fixed!

nikomatsakis (Jun 05 2019 at 14:24, on Zulip):

I :heart: the "boring" and "fun" commit btw :)

Albin Stjerna (Jun 05 2019 at 14:26, on Zulip):

The naming or the general concept?

nikomatsakis (Jun 07 2019 at 19:50, on Zulip):

@Albin Stjerna merged, published

Albin Stjerna (Jun 07 2019 at 19:50, on Zulip):

@nikomatsakis Yay! I'll finalise rustc now :)

lqd (Jun 07 2019 at 20:06, on Zulip):

about that, I was thinking, since we don't quite match rustc's region_live_at generation yet, maybe we should only remove this relation facts' generation from rustc after move/overwrite has been integrated, so we can then ensure we do match it ? I mean, maybe we don't do this commit yet, what do you think ? (or maybe it's not that easy/worthwhile ?)

Albin Stjerna (Jun 07 2019 at 20:13, on Zulip):

Hmm. It does make sense I guess, but we also have the recorded facts to compare with, but those could of course go stale

Albin Stjerna (Jun 07 2019 at 20:14, on Zulip):

Other than that, rust#60266 should be ready to merge

lqd (Jun 07 2019 at 20:14, on Zulip):

true; just a thought as it's hard to check correctness as you've no doubt seen :)

Albin Stjerna (Jun 07 2019 at 20:15, on Zulip):

Err, yes

Albin Stjerna (Jun 07 2019 at 20:15, on Zulip):

Ideally I'd like to have an option to turn it on, but I don't know the policy on that

lqd (Jun 07 2019 at 20:19, on Zulip):

initialization/move/overwrite shouldn't take close as long, so maybe the new recorded facts would be enough indeed

lqd (Jun 07 2019 at 20:21, on Zulip):

I think (but unsure about the procedure about triage) that the PR can also be marked S-waiting-on-review instead of S-waiting-on-author

simulacrum (Jun 07 2019 at 20:22, on Zulip):

Feel free to remark as waiting-on-review if the author feels that way

simulacrum (Jun 07 2019 at 20:22, on Zulip):

basically if the author is, in fact, waiting on review then that's the right status :)

lqd (Jun 07 2019 at 20:22, on Zulip):

:)

lqd (Jun 07 2019 at 20:23, on Zulip):

done @Albin Stjerna — thanks @simulacrum :)

nikomatsakis (Jun 10 2019 at 18:37, on Zulip):

Can we emit both sets of facts and have polonius configurable as to whether or not to use them?

nikomatsakis (Jun 10 2019 at 18:37, on Zulip):

e.g. by selecting which algorithm perhaps?

nikomatsakis (Jun 10 2019 at 18:37, on Zulip):

naive, naive-liveness, something like that

lqd (Jun 10 2019 at 19:25, on Zulip):

sounds feasible

nikomatsakis (Jun 10 2019 at 20:56, on Zulip):

Hmm, @Albin Stjerna, I just skimmed https://github.com/rust-lang/rust/pull/60266/ -- I'm trying to remember -- that code is (at least sort of) taking initialization into account when generating var_drop_used facts, though it's leveraging rustc to do that generation ... but we are still seeing discrepancies in the results? I can't remember if we tracked that down or not.

Albin Stjerna (Jun 10 2019 at 21:28, on Zulip):

e.g. by selecting which algorithm perhaps?

The easiest way is just to either not populate the region_live_at input or delete it to begin with I think, but yes, that's not difficult.

Albin Stjerna (Jun 10 2019 at 21:28, on Zulip):

Hmm, Albin Stjerna, I just skimmed https://github.com/rust-lang/rust/pull/60266/ -- I'm trying to remember -- that code is (at least sort of) taking initialization into account when generating var_drop_used facts, though it's leveraging rustc to do that generation ... but we are still seeing discrepancies in the results? I can't remember if we tracked that down or not.

Yes, and I know. It's weird. I did a run-down of the differences, and they really were minor, but I don't understand it either.

nikomatsakis (Jun 10 2019 at 21:29, on Zulip):

Hmm, ok. I wonder if we should try to dig into this a bit deeper.

nikomatsakis (Jun 10 2019 at 21:29, on Zulip):

(Have you taken a look at the other initialization-tracking code at all?)

Albin Stjerna (Jun 10 2019 at 21:30, on Zulip):

Not yet, I have been working on my presentation. And also just been very inefficient lately, I honestly don't know why

nikomatsakis (Jun 10 2019 at 21:31, on Zulip):

everything takes longer than you think...

Albin Stjerna (Jun 10 2019 at 21:34, on Zulip):

Anyway, my notes say that in the example I looked at, it seems that there was just one variable being needlessly live

Albin Stjerna (Jun 10 2019 at 21:36, on Zulip):

It was made live from a drop-use in an unwind block. If you in stead change the logic to consider it drop-used when it's moved, and then drop-live again only in the point where it was dropped but with no propagation (despite no reassignments killing it), results are the same

Albin Stjerna (Jun 10 2019 at 21:37, on Zulip):

In other words, I think the logic for region_live_at in rustc is not doing the flow calculations but rather something different

lqd (Jun 11 2019 at 10:18, on Zulip):

(I have not looked at the liveness PRs so the following may be wrong) this reminds me of #60840 and #61373 about adding (adding back?) more StorageDead for some locals in unwind paths: it may be related if our fact generation used to rely on those and an optimization has removed some ? (since it seems to be a big perf regression to add them everywhere and in those PRs they had to be limited to generators)

nikomatsakis (Jun 11 2019 at 18:15, on Zulip):

In other words, I think the logic for region_live_at in rustc is not doing the flow calculations but rather something different

let me double-check that

nikomatsakis (Jun 11 2019 at 19:21, on Zulip):

Anyway, my notes say that in the example I looked at, it seems that there was just one variable being needlessly live

can you say a bit more about this?

Albin Stjerna (Jun 11 2019 at 19:22, on Zulip):

Yes, let me dig up my notes

Albin Stjerna (Jun 11 2019 at 19:23, on Zulip):

Ok, so I looked at maybe-initialized-drop-with-uninitialized-fragments, main I think

nikomatsakis (Jun 11 2019 at 19:25, on Zulip):

OK

nikomatsakis (Jun 11 2019 at 19:26, on Zulip):

I'm looking at it :)

Albin Stjerna (Jun 11 2019 at 19:26, on Zulip):

And the difference is that provenance variable 13 is live at a number of blocks when it shouldn't be

Albin Stjerna (Jun 11 2019 at 19:26, on Zulip):

It's made live by a drop-use of _2 in bb3[0]

Albin Stjerna (Jun 11 2019 at 19:27, on Zulip):

Which is an unwind, I think

nikomatsakis (Jun 11 2019 at 19:27, on Zulip):

hmm

nikomatsakis (Jun 11 2019 at 19:27, on Zulip):

well let me dump the MIR locally

Albin Stjerna (Jun 11 2019 at 19:28, on Zulip):

Hm, no sorry it's a cleanup block which may or may not be an unwind

Albin Stjerna (Jun 11 2019 at 19:28, on Zulip):

I'm not sure

nikomatsakis (Jun 11 2019 at 19:28, on Zulip):

ok so _2 is the Wrap<'_>

nikomatsakis (Jun 11 2019 at 19:28, on Zulip):

the example:

fn main() {
    let mut x = 0;
    let wrap = Wrap { p: &mut x };
    let s = String::from("str");
    let foo = Foo { a: s, b: wrap };
    std::mem::drop(foo.a);
    std::mem::drop(foo.b);
    x = 1; //~ ERROR cannot assign to `x` because it is borrowed [E0506]
    // FIXME ^ This currently errors and it should not.
}
nikomatsakis (Jun 11 2019 at 19:28, on Zulip):

the interesting bit here: wrap is moved into foo

nikomatsakis (Jun 11 2019 at 19:29, on Zulip):

we then move out all the parts of foo, but we never move from foo itself

nikomatsakis (Jun 11 2019 at 19:29, on Zulip):

the current NLL fails to understand this

nikomatsakis (Jun 11 2019 at 19:29, on Zulip):

so it assumes that the foo dtor could still run

nikomatsakis (Jun 11 2019 at 19:29, on Zulip):

which makes the borrow of &mut x live

nikomatsakis (Jun 11 2019 at 19:32, on Zulip):

anyway yes ok so I see the unwind problem

Albin Stjerna (Jun 11 2019 at 19:32, on Zulip):

I noted that the following changes should give the same output in Polonius:
- mark _2 as drop-used and drop-live at the point of the move (Mid(bb2[5]))
- mark _2 as drop-live in bb3[0], but don't propagate that backwards (so, I guess, also kill/assign it it)
- mark the move as a kill (assignment)

nikomatsakis (Jun 11 2019 at 19:32, on Zulip):

basically _2 is potentially live at _3

Albin Stjerna (Jun 11 2019 at 19:32, on Zulip):

Ah

nikomatsakis (Jun 11 2019 at 19:32, on Zulip):

the idea is this:

nikomatsakis (Jun 11 2019 at 19:32, on Zulip):

if String::from should panic (say)

nikomatsakis (Jun 11 2019 at 19:33, on Zulip):

it would unwind and drop wrap

nikomatsakis (Jun 11 2019 at 19:33, on Zulip):

but now, later on, if std::mem::drop should panic

nikomatsakis (Jun 11 2019 at 19:33, on Zulip):

we unwind through the same frame

nikomatsakis (Jun 11 2019 at 19:33, on Zulip):

of course, at that point, wrap will have been moved (dynamically)

nikomatsakis (Jun 11 2019 at 19:33, on Zulip):

so now I have to remember where in the code

nikomatsakis (Jun 11 2019 at 19:33, on Zulip):

...this is sort of coming back to me..

nikomatsakis (Jun 11 2019 at 19:33, on Zulip):

do we account for this and where

Albin Stjerna (Jun 11 2019 at 19:34, on Zulip):

Ah

Albin Stjerna (Jun 11 2019 at 19:38, on Zulip):

(I'm currently staring at the liveness graph I generated by the extension I made to try to figure out what is going on)

Albin Stjerna (Jun 11 2019 at 19:39, on Zulip):

which makes the borrow of &mut x live

Hm, but the problem I'm having is that Polonius is reporting to many live variables, not too few

nikomatsakis (Jun 11 2019 at 19:39, on Zulip):

it looks @Albin Stjerna like we are integrating the initialization check deeply into the propagation

nikomatsakis (Jun 11 2019 at 19:39, on Zulip):

so I guess your original analysis was correct

nikomatsakis (Jun 11 2019 at 19:39, on Zulip):

notably

nikomatsakis (Jun 11 2019 at 19:40, on Zulip):

in this code, we decide to propagate back from a node N to its predecessors only if the path is initialized at the exit of the predecessor

nikomatsakis (Jun 11 2019 at 19:40, on Zulip):

one option, it occurs to me

nikomatsakis (Jun 11 2019 at 19:40, on Zulip):

if we wanted to remove region_live_at without loss of precision

nikomatsakis (Jun 11 2019 at 19:41, on Zulip):

would be to emit facts for initialized_at_exit

nikomatsakis (Jun 11 2019 at 19:41, on Zulip):

i.e., we can emit facts for the stuff that polonius will compute later

nikomatsakis (Jun 11 2019 at 19:42, on Zulip):

I guess that this logic would change:

var_drop_live(V, P) :-
    var_drop_live(V, Q),
    cfg_edge(P, Q),
    !var_defined(V, P).
nikomatsakis (Jun 11 2019 at 19:43, on Zulip):

we want something like

var_drop_live(V, P) :-
    var_drop_live(V, Q),
    cfg_edge(P, Q),
    !var_defined(V, P),
    var_initialized_on_exit(V, P).
lqd (Jun 11 2019 at 19:45, on Zulip):

that's an interesting step

Albin Stjerna (Jun 11 2019 at 19:47, on Zulip):

@nikomatsakis oh ok, so we cheat by emitting initialisation facts in rust, the same ones we will compute in the next step, and use those as a crutch to make sure we will eventually get the right output?

nikomatsakis (Jun 11 2019 at 19:48, on Zulip):

right

Albin Stjerna (Jun 11 2019 at 19:48, on Zulip):

oh, that's also good in a second way

Albin Stjerna (Jun 11 2019 at 19:48, on Zulip):

because that gives us free validation when we roll out initialisation in Polonius

Albin Stjerna (Jun 11 2019 at 19:49, on Zulip):

in precisely the same way as for region_live_at

Albin Stjerna (Jun 11 2019 at 19:49, on Zulip):

ok, sold!

nikomatsakis (Jun 11 2019 at 19:51, on Zulip):

yeah that's kind of nice

nikomatsakis (Jun 11 2019 at 19:51, on Zulip):

it would definitely be good to transition fully to polonius for liveness

nikomatsakis (Jun 11 2019 at 19:51, on Zulip):

and not have some intermediate state where we sometimes use rustc's liveness

Albin Stjerna (Jun 11 2019 at 19:54, on Zulip):

Yes, that would be nice. But initialisation tracking is the last step for that right?

nikomatsakis (Jun 11 2019 at 19:54, on Zulip):

I'm saying that if we add these facts for now

nikomatsakis (Jun 11 2019 at 19:55, on Zulip):

we are able to kill the old liveness completely

nikomatsakis (Jun 11 2019 at 19:55, on Zulip):

and then we transition from those to doing the init tracking ourselves

Albin Stjerna (Jun 11 2019 at 19:55, on Zulip):

Ah, ok

Albin Stjerna (Jun 11 2019 at 19:55, on Zulip):

I thought I already did kill the liveness tracking in my branch?

Albin Stjerna (Jun 11 2019 at 19:55, on Zulip):

Or was that what you were referring to?

nikomatsakis (Jun 11 2019 at 20:12, on Zulip):

yes, you did, it's just that we don't get the exact same results right now, and @lqd was concerned about that (for testing purposes)

Albin Stjerna (Jun 11 2019 at 20:13, on Zulip):

Ah ok, then I understand how everything fits. And I agree that it would feel much, much better to have the same output

Albin Stjerna (Jun 11 2019 at 20:13, on Zulip):

Until we have implemented this I think it's all just a hypothesis anyway

Albin Stjerna (Jun 11 2019 at 20:15, on Zulip):

I need to go to bed now, but I'll go through your notes on initialization and see where that gets me in terms of implementing var_initialized_on_exit

Albin Stjerna (Jun 13 2019 at 15:31, on Zulip):

@nikomatsakis Ok, the liveness-debug-graph-generation is now available as a mergable (hopefully) PR: polonius#108.

nikomatsakis (Jun 13 2019 at 16:22, on Zulip):

@Albin Stjerna cool!

nikomatsakis (Jun 13 2019 at 16:22, on Zulip):

Regarding the overall status, am I correct that you are attempting to integrate the init information into the liveness analysis as we discussed, to eliminate the remaining discrepancy?

Albin Stjerna (Jun 13 2019 at 16:24, on Zulip):

@nikomatsakis Yes! I started working on that today

Albin Stjerna (Jun 13 2019 at 16:27, on Zulip):

I'm not sure how to go from the maybe initialized information in rust to actually knowing if something is initialized or not, but I guess the fact should be emitted only if we absolutely know that a variable is initialized?

Albin Stjerna (Jun 13 2019 at 16:27, on Zulip):

(or have been deinitialized)

lqd (Jun 13 2019 at 17:31, on Zulip):

btw the description from #54468 mentions some limitation about the invalidation/drop facts, would that impact liveness (or our complete regenerating of the recorded facts) ? -- (of course, some of these limitations, the invalidation ones or 2PBs, are likely linked to the test failures we're seeing in compare mode)

Albin Stjerna (Jun 14 2019 at 10:02, on Zulip):

Ok, so I have an idea to inject the code to generate var_initialized_on_exit facts in trace() (where I know that information already exists), but it seems I would need to iterate over every location in the MIR and basically ask "is V initialized here?". Now, I can't find an easy way of iterating over all locations in the MIR, and also this feels like an unnecessarily slow way of going about this

Albin Stjerna (Jun 14 2019 at 10:18, on Zulip):

Oh no that seems to require a MIR visitor

Albin Stjerna (Jun 14 2019 at 10:40, on Zulip):

Hmm, or should we just emit the facts for the exit point of a given BasicBlock? Also, should this fact be emitted even on partial initialization?

Albin Stjerna (Jun 14 2019 at 10:43, on Zulip):

Or should var rather be a move path with a separate relation mapping variables to their move paths?

Albin Stjerna (Jun 14 2019 at 10:44, on Zulip):

Hm, but if we use this analysis only for drop-liveness (for now), then I guess we should track any initialisation at all, right?

nikomatsakis (Jun 14 2019 at 11:31, on Zulip):

I'm not sure how to go from the maybe initialized information in rust to actually knowing if something is initialized or not, but I guess the fact should be emitted only if we absolutely know that a variable is initialized?

well "maybe initialized" is actually what we want, I think? That is, we want to halt propagation if we know that something is definitely uninitialized, but not otherwise.

nikomatsakis (Jun 14 2019 at 11:32, on Zulip):

Hmm, or should we just emit the facts for the exit point of a given BasicBlock? Also, should this fact be emitted even on partial initialization?

in practice, we only care about the exit of a block at the moment, so that suffices I guess

nikomatsakis (Jun 14 2019 at 11:33, on Zulip):

Or should var rather be a move path with a separate relation mapping variables to their move paths?

I think we could just use variables because that is the basis of the liveness computation -- the real initialization logic will want to use sub-paths

nikomatsakis (Jun 14 2019 at 11:34, on Zulip):

(iterating over all locations in the MIR is certainly possible)

Albin Stjerna (Jun 14 2019 at 11:40, on Zulip):

(iterating over all locations in the MIR is certainly possible)

Ok, I'll keep looking for that in the documentation then, because I think this should work otherwise.

Albin Stjerna (Jun 14 2019 at 11:41, on Zulip):

What is the difference between something being initialised at a terminator compared to at the exit of a block? Shouldn't they be the same?

Albin Stjerna (Jun 14 2019 at 11:47, on Zulip):

Hmh, so it does some sort of recomputation:

// Compute the set of initialized paths at terminator of block
// by resetting to the start of the block and then applying
// the effects of all statements. This is the only way to get
// "just ahead" of a terminator.

Hm, basically both methods check flow_inits, to see if any part of the variable's move path is still initialised, but at_terminator re-performs the effects of each statement in the block, and at_exit just returns whatever's in flow_initsat the exit point.

Albin Stjerna (Jun 14 2019 at 11:50, on Zulip):

(feel free to throttle messages because I might be rubber-ducking you right now)

Albin Stjerna (Jun 14 2019 at 11:50, on Zulip):

that's also why I'm not @:ing anyone

Albin Stjerna (Jun 14 2019 at 12:35, on Zulip):

Ok I think I figured it out now

Albin Stjerna (Jun 14 2019 at 12:35, on Zulip):

Let's see if it works

Albin Stjerna (Jun 14 2019 at 12:36, on Zulip):

correction, let's see if it runs

nikomatsakis (Jun 14 2019 at 22:11, on Zulip):

@Albin Stjerna sorry was afk-ish :)

nikomatsakis (Jun 14 2019 at 22:11, on Zulip):

what is at_exit?

nikomatsakis (Jun 14 2019 at 22:11, on Zulip):

:)

nikomatsakis (Jun 14 2019 at 22:12, on Zulip):

oh, you mean the initialized_at_exit function

nikomatsakis (Jun 14 2019 at 22:12, on Zulip):

right so

nikomatsakis (Jun 14 2019 at 22:12, on Zulip):

here's the thing

nikomatsakis (Jun 14 2019 at 22:12, on Zulip):

the terminator of a MIR block can make assignments (in the case of a function call)

nikomatsakis (Jun 14 2019 at 22:12, on Zulip):

e.g.

nikomatsakis (Jun 14 2019 at 22:13, on Zulip):
BB0 {
    X = call foo(A, B, C) goto BB1 unwind BB2
}
nikomatsakis (Jun 14 2019 at 22:13, on Zulip):

well I forget how we pretty-print it

nikomatsakis (Jun 14 2019 at 22:13, on Zulip):

so the at_exit call is checking the effect after the terminator executes -- meaning that A, B, and C would have been consumed

nikomatsakis (Jun 14 2019 at 22:13, on Zulip):

but at_terminator is checking the state just before the terminator executes -- so A..C would not yet have been consumed

nikomatsakis (Jun 14 2019 at 22:14, on Zulip):

the subtlety around calls that at_exit is referring to is the fact that the variable X

nikomatsakis (Jun 14 2019 at 22:14, on Zulip):

is only going to be initialized if we enter BB1 -- i.e., on a normal return

nikomatsakis (Jun 14 2019 at 22:14, on Zulip):

if an unwind occurs (BB2) then it would not be initialized

nikomatsakis (Jun 14 2019 at 22:16, on Zulip):

I think at_exit will return a value where X is not yet initialized

Albin Stjerna (Jul 03 2019 at 11:56, on Zulip):

@nikomatsakis Ok, polonius#109 should be ready to merge now!

Albin Stjerna (Jul 03 2019 at 12:42, on Zulip):

Really love these two-hour merges every time

Albin Stjerna (Jul 03 2019 at 12:42, on Zulip):

Should be trivial, but never are

Albin Stjerna (Jul 03 2019 at 12:48, on Zulip):

Apparently, the rebase...somehow reverts changes that it never shows merge conflicts for

Albin Stjerna (Jul 03 2019 at 12:48, on Zulip):

I don't understand how that is even possible

Albin Stjerna (Jul 03 2019 at 12:55, on Zulip):

Specifically, commit 0d2b8d5cfd8seems to just change to undo upstream

Albin Stjerna (Jul 03 2019 at 13:54, on Zulip):

(deleted)

Albin Stjerna (Jul 03 2019 at 13:57, on Zulip):

Ah, ok, I see. It's a rustfmt run that also...replaces all the rustfmt:ed lines.

Albin Stjerna (Jul 03 2019 at 13:58, on Zulip):

It seems to be this commit: https://github.com/rust-lang/rust/pull/60266/commits/494b6914c5f61df180ce652752430e5ff4367048

Albin Stjerna (Jul 03 2019 at 13:59, on Zulip):

I actually have no idea how to solve this

Albin Stjerna (Jul 03 2019 at 14:47, on Zulip):

Yessss Ok I got it to work now

Albin Stjerna (Jul 03 2019 at 14:47, on Zulip):

After THREE HOURS

Albin Stjerna (Jul 03 2019 at 14:52, on Zulip):

Ok, waiting on the new Polonius release now!

Albin Stjerna (Jul 03 2019 at 17:50, on Zulip):

Haha, the newly rebased rustc will produce different basic block IDs

Albin Stjerna (Jul 03 2019 at 17:50, on Zulip):

In other words; almost entirely different input facts

Albin Stjerna (Jul 03 2019 at 17:51, on Zulip):

(still passing all tests though)

nikomatsakis (Jul 08 2019 at 18:34, on Zulip):

nikomatsakis Ok, polonius#109 should be ready to merge now!

awesome!

nikomatsakis (Jul 08 2019 at 18:35, on Zulip):

OK, I merged polonius#109, @Albin Stjerna I'll publish a new version

nikomatsakis (Jul 09 2019 at 10:11, on Zulip):

@Albin Stjerna so https://github.com/rust-lang/rust/pull/60266 is failing the PR checks, but I can't get the logs, maybe spurious?

Albin Stjerna (Jul 09 2019 at 10:25, on Zulip):

Albin Stjerna so https://github.com/rust-lang/rust/pull/60266 is failing the PR checks, but I can't get the logs, maybe spurious?

I think it's because I haven't adapted it to the newly released Polonius yet, I'll do that...now!

Albin Stjerna (Jul 09 2019 at 10:36, on Zulip):

@nikomatsakis Ok, now rust#60266 should build, I just pushed a commit upgrading Polonius!

lqd (Jul 09 2019 at 14:05, on Zulip):

the PR check is green :thumbs_up:

nikomatsakis (Jul 12 2019 at 13:57, on Zulip):

@Albin Stjerna needs rebase :(

nikomatsakis (Jul 12 2019 at 13:57, on Zulip):

also, maybe worth squashing the commits somewhat?

nikomatsakis (Jul 12 2019 at 13:57, on Zulip):

I found it kind of hard to re-review as is

Albin Stjerna (Jul 12 2019 at 13:58, on Zulip):

@nikomatsakis I saw, but too late unfortunately:(

Albin Stjerna (Jul 12 2019 at 13:58, on Zulip):

But yes it’s getting a bit messy

Albin Stjerna (Jul 12 2019 at 14:02, on Zulip):

I’ll try squashing it when I’m rebasing

Albin Stjerna (Jul 12 2019 at 20:56, on Zulip):

@nikomatsakis Ok I went whole-hog and just rewrote the entire history a lá my previous fun/boring commits to Polonius. I must say I enjoy having to learn more about git, and I have absolutely come to love magit at this point.

nikomatsakis (Jul 12 2019 at 20:56, on Zulip):

magit is life changing

nikomatsakis (Jul 12 2019 at 20:56, on Zulip):

it is also the most emacs of all interfaces

nikomatsakis (Jul 12 2019 at 20:56, on Zulip):

and I don't mean that in a good way

Albin Stjerna (Jul 12 2019 at 20:58, on Zulip):

Yeah it's slow for no reason and very obviously interacts flakily with a shell script

Albin Stjerna (Jul 12 2019 at 20:58, on Zulip):

I have/love emacs so much

Albin Stjerna (Jul 12 2019 at 20:58, on Zulip):

It's like LaTeX

Albin Stjerna (Jul 12 2019 at 20:58, on Zulip):

It's great at what it does and also awful at what it does and it should be replaced, but the path dependency on it is just awful so that will never happen

lqd (Jul 12 2019 at 22:57, on Zulip):

if it's just rebasing and doesn't need re-review, which I assume is the case, we can tell bors r=niko when CI is green (if so, I'll do that in the morning)

Albin Stjerna (Jul 13 2019 at 06:31, on Zulip):

@lqd It is and it shouldn't

Albin Stjerna (Jul 13 2019 at 06:31, on Zulip):

Can I do that?

Albin Stjerna (Jul 13 2019 at 06:31, on Zulip):

I'm still not sure what I can do

lqd (Jul 13 2019 at 07:40, on Zulip):

@Albin Stjerna I think you need bors rights, I did it

lqd (Jul 13 2019 at 19:11, on Zulip):

liveness landed @Albin Stjerna :tada:

Albin Stjerna (Jul 13 2019 at 19:49, on Zulip):

I saw! :tada:

Aaron Weiss (Jul 15 2019 at 18:10, on Zulip):

It's like LaTeX

@Albin Stjerna honestly, Emacs is much more tolerable than LaTeX in that the former at least mostly works.

lqd (Oct 10 2019 at 00:03, on Zulip):

I haven't kept up with the liveness work as much as I should have, so maybe this is normal, but in the polonius-imprecision/cycle_unification test case, I feel liveness has changed from when it was computed by rustc: it seems to start one point earlier, eg at the mid point of an assignment and not at the start point of the successor location (which is how it used to be IIRC); is that something you'd expect ?

lqd (Oct 10 2019 at 00:34, on Zulip):

ah but I think it's actually been fixed by Matthew's PR

Albin Stjerna (Oct 10 2019 at 06:46, on Zulip):

That is a bit weird, because I did extensive comparisons to the output of rustc and definitely had the same output

lqd (Oct 10 2019 at 08:39, on Zulip):

yeah I remember, that's why this was confusing; maybe it's subtle differences between nightlies which were misleading me (eg. the number of blocks changed recently making all my notes misnumbered :)

lqd (Oct 11 2019 at 14:46, on Zulip):

ok I now see what you mean about initialization and liveness dominating runtime in your benchmarks; eg in my latest test on clap:
- initialization: 474 ms
- liveness: 278 ms
- borrow checking: 42ms

lqd (Oct 12 2019 at 15:53, on Zulip):

alright #134 should help with your real world benchmarks (but not eg our clap ones, where code always passes the locationinsensitive pre-filter) to avoid computing init/liveness twice

Albin Stjerna (Oct 13 2019 at 06:20, on Zulip):

Yay!

lqd (Oct 13 2019 at 11:41, on Zulip):

I do believe it should be fairly easy to optimize the liveness computation by doing it on less variables — and similarly for initialization by computing var_maybe_initialized_on_exit on less paths (but I'm still fuzzy on its possible use outside of liveness, eg maybe ... initialization errors ?) and unsure about the ongoing work on that anyways, so likely TBD later; although on clap, initialization is dominating runtime as we mentioned before

lqd (Oct 13 2019 at 11:45, on Zulip):

(I'll try to prototype that for liveness, and run rustc's tests, hopefully by next week's meeting)

Albin Stjerna (Oct 13 2019 at 15:57, on Zulip):

Hm var maybe...is only needed for drop-liveness, but move path initialisation is needed for move errors

Albin Stjerna (Oct 13 2019 at 15:58, on Zulip):

That is, it is needed precisely when the variable holds a struct implementing a custom drop and that struct contains (transitively I guess) a reference

Albin Stjerna (Oct 13 2019 at 15:58, on Zulip):

Which is basically never according to my estimates

lqd (Oct 13 2019 at 20:41, on Zulip):

if what I have in mind works (we'll see with rustc's tests, as ours seem to pass) it should be helpful to reduce the number of tuples produced by quite a bit; even if the relation is almost never useful, we still have to compute it in case it is actually useful heh. but yeah we can detect and limit the work to compute it when it's not used, say. and it should have the same effect as the reduction I'm trying (and which seems to have some potential /me whistles until I can run the tests :)

lqd (Oct 14 2019 at 09:59, on Zulip):

(FWIW, ie not that much, but better than nothing: it did pass rustc's ui tests. With, on clap, 30x faster initialization -- that is completely unsuitable for computing move/initialization errors if we wanted to do that -- and 25x faster liveness, which I guess is sound IIUC, but we'll check with niko. Overall 10x on clap since it's early returned by LocationInsensitive)

Albin Stjerna (Oct 14 2019 at 11:57, on Zulip):

Cool!

Last update: Nov 15 2019 at 20:20UTC