Stream: t-compiler/wg-polonius

Topic: Albin's thesis and presentation(s?)


Albin Stjerna (Jun 01 2019 at 08:36, on Zulip):

@nikomatsakis Ok, so I have a question of formalism. We have this (* reads off of back of hand *) subtyping rule . But our analysis takes places into account. How do we formulate that? Should that be part of the rule as well? Also, I know the types should probably be some sort of generic parameters as well.

Albin Stjerna (Jun 01 2019 at 11:15, on Zulip):

Also, the analysis is both contest-sensitive (which I take to mean "respects function boundaries") and flow-sensitive (which I take to mean "analyses per program point"), right? I get my definitions from http://doi.acm.org/10.1145/1480881.1480911

nikomatsakis (Jun 01 2019 at 11:32, on Zulip):

I don't believe it is context sensitive in the usual sense, because it is intra-procedural (not inter-procedural).

nikomatsakis (Jun 01 2019 at 11:33, on Zulip):

That said, formulating Polonius as a type system is a bit tricky. You may want to check out the work that @Aaron Weiss has been doing for some inspiration in what that might look like.

nikomatsakis (Jun 01 2019 at 11:34, on Zulip):

(I'll fish out a link at some point, or perhaps someone else will supply it)

nikomatsakis (Jun 01 2019 at 11:34, on Zulip):

Certainly Polonius is flow-sensitive

nikomatsakis (Jun 01 2019 at 11:34, on Zulip):

I don't believe it is context sensitive in the usual sense, because it is intra-procedural (not inter-procedural).

I guess I think this is more of a N/A than anything else. The user kind of gives us, in the form of function signatures, the "summary" of the effects of our callees.

nikomatsakis (Jun 01 2019 at 11:35, on Zulip):

That said, formulating Polonius as a type system is a bit tricky. You may want to check out the work that Aaron Weiss has been doing for some inspiration in what that might look like.

the basic idea though is that you have an environment that is changing as you move through the control flow

nikomatsakis (Jun 01 2019 at 11:35, on Zulip):

your subtyping rule had no environment at all, that doesn't work

nikomatsakis (Jun 01 2019 at 11:36, on Zulip):

I feel like if this is going to be part of your thesis it's...probably out of scope

nikomatsakis (Jun 01 2019 at 11:36, on Zulip):

That is, I would probably try to formulate polonius more in terms of a static analysis than a type system

Aaron Weiss (Jun 03 2019 at 18:19, on Zulip):

Answering @nikomatsakis's call for a link: the latest version of the paper is available on my website, but there's a bunch of ongoing changes that have started in the (unreleased and not-very-human-readable) tech report first. So, if there's not an urgent need to look at this stuff, there ought to be a new version on my website (and arXiv) in a little over a month.

Albin Stjerna (Jun 03 2019 at 20:50, on Zulip):

@Aaron Weiss Thank you! I have skimmed the draft of your Oxide paper, in fact it's one of the first ones I read for my background section, but I haven't nearly digested it yet. The type theory is far above my head, but I think explaining that is one of the things my subject reviewer is good for :)

Aaron Weiss (Jun 03 2019 at 22:47, on Zulip):

@Albin Stjerna if you have any questions that they can’t answer (or that you’d rather ask me anyway), I’m happy to help!

Albin Stjerna (Jun 04 2019 at 09:44, on Zulip):

@Aaron Weiss Thank you! It's not urgent; the deadline for my thesis is roughly August or so, give or take 20 days, and nothing really depends on the theoretical parts, so it's not blocking anything. While you are here though, do you have any suggestions for introductory reading on type theory? They didn't do a great job of that in my programming language semantics course, and I don't really know where to start

centril (Jun 04 2019 at 09:59, on Zulip):

@Albin Stjerna https://www.cis.upenn.edu/~bcpierce/tapl/ is a good book if you haven't read it

Albin Stjerna (Jun 04 2019 at 10:05, on Zulip):

@centril Thank you! I haven't

centril (Jun 04 2019 at 10:06, on Zulip):

@Albin Stjerna http://www.cse.chalmers.se/edu/year/2018/course/DAT350/ might also be a good resource

Aaron Weiss (Jun 04 2019 at 13:42, on Zulip):

Yeah, TAPL is the classic recommendation. Though, you may also like Bob Harper’s Principles and Foundations of Programming Languages (PFPL) which I’ve read... some of.

Aaron Weiss (Jun 04 2019 at 13:43, on Zulip):

Lectures from OPLSS on YouTube could prove useful too. Every year, they have some intro stuff on type theory, logical relations, etc.

lqd (Jun 04 2019 at 13:45, on Zulip):

(the OPLSS lectures are here)

Albin Stjerna (Jun 04 2019 at 14:46, on Zulip):

I guess I have my work cut out for me now. Thanks everyone!

Albin Stjerna (Jun 06 2019 at 08:22, on Zulip):

My (new) working title for the talk

Albin Stjerna (Jun 07 2019 at 07:32, on Zulip):

I really appreciate the timing of the reformulated subset relations right before the talk. "Hi err we have this formulation sort of but it doesn't work really and we have a new thing which probably does and might be faster but I didn't work on those so it's not my fault!"

Albin Stjerna (Jun 07 2019 at 07:32, on Zulip):

(I honestly think this is funny and I'm not actually worried; it just shows the benefits of doing the reformulation in a declarative and easily iterated-on language to begin with)

Albin Stjerna (Jun 11 2019 at 08:41, on Zulip):

I personally think this joke is the high point of my presentation

Albin Stjerna (Jun 11 2019 at 10:05, on Zulip):

@WG-polonius Hi! This is a draft of my presentation for tomorrow (animated, no animations). I'd like some feedback if you have the time, in particular on terminology and technical details of my examples to make sure I'm not completely off.

Albin Stjerna (Jun 11 2019 at 10:05, on Zulip):

(I might have to axe some things)

Albin Stjerna (Jun 11 2019 at 10:46, on Zulip):

Update: nope, I'm within my time budget if I don't go overboard explaining the datalog

Albin Stjerna (Jun 11 2019 at 10:47, on Zulip):

Also I think I might have overdone a) the Hamlet references and b) the number of animations

nikomatsakis (Jun 11 2019 at 18:08, on Zulip):

looks good to me:)

lqd (Jun 11 2019 at 18:10, on Zulip):

(same for me :)

Albin Stjerna (Jun 12 2019 at 14:07, on Zulip):

Thanks! I survived the presentation and I think it went all right. Though the person after me introduced Soufflé, Doop, and Datalog itself so the ordering could perhaps have been better.

Albin Stjerna (Jun 12 2019 at 14:08, on Zulip):

I could really have used that in my presentation in stead of just saying “this is what Datalog looks like but I don’t have time to explain the details”

Albin Stjerna (Sep 04 2019 at 14:12, on Zulip):

@Aaron Weiss I have a question about Oxide, if you have the time. Reading the type inference rules, I don't understand how the unification of reference types upon assignment (T-Assign) works. As I understand the assignment rule, it starts given a statement x = y where x, y are references, require that x and y unify as a prerequisite, and some evaluation of the RHS (which I'm not concerned with due to MIR). That makes sense to me. However, as I understand it, the type unification would merge the LHS and RHS' provenance variables into the resulting type tau-n, which I never see used again anywhere in the rule, and then update the mapping to...what? I would have expected the substitution-yielding subtyping of T-Let in stead.

Albin Stjerna (Sep 04 2019 at 14:13, on Zulip):

(I am basically trying to find the part of Oxide I need to prod to find the subtyping rule of Niko's first blog entry, in order to get to Polonius)

Aaron Weiss (Sep 04 2019 at 22:27, on Zulip):

@Albin Stjerna you're asking pretty directly about the parts that were broken before, and that have (hopefully) been fixed in the latest version of Oxide. There's still unfortunately a bunch of work to be done though before we push out an updated draft. :\

Albin Stjerna (Sep 05 2019 at 06:32, on Zulip):

@Aaron Weiss but that should be the same as in Polonius right? What I don’t understand (I think) is how a) variable liveness (and therefore loan liveness) comes into play and b) how loans ever get removed from provenance variables (corresponding to Polonius’ killed fact).

Albin Stjerna (Sep 05 2019 at 06:33, on Zulip):

What I’m trying to do is to hand-wave a bit and say “look Oxide is roughly the same as Polonius”

Albin Stjerna (Sep 05 2019 at 06:36, on Zulip):

Perhaps I should be looking at the operational semantics in stead, but I wanted to understand the typing judgements first.

Albin Stjerna (Sep 05 2019 at 09:47, on Zulip):

I can make the question somewhat simpler though; in this rule, what happens to tau-n and where does tau come from?

Aaron Weiss (Sep 05 2019 at 14:15, on Zulip):

\tau_n is ignored because it's really using unification there as a sort of type compatibility, rather than actually caring about the result. Choosing to use \tau_u in places-typ is essentially how you end up killing the loans that were present in \tau_o (\tau_n would contain loans from both \tau_u and \tau_o). The \tau is actually \overline{\tau} and comes from computing places-typ there.

Aaron Weiss (Sep 05 2019 at 14:16, on Zulip):

In actuality though, this setup was confusing, and we've since changed it to more directly reflect this kind of "overwriting" behavior (as compared to the combining behavior that is present in the other uses of unification).

Aaron Weiss (Sep 05 2019 at 14:16, on Zulip):

So, in my latest (internal) draft, the rule doesn't quite look the same.

Aaron Weiss (Sep 05 2019 at 14:16, on Zulip):

It looks like this.

Aaron Weiss (Sep 05 2019 at 14:18, on Zulip):

the \mathcal{L}^\prime - \mathcal{p} there (and the fact that the unification judgment, now written <~, uses the overwrite (-) mode) is more-or-less how we've implemented the kill rules.

Aaron Weiss (Sep 05 2019 at 14:19, on Zulip):

But like I said, there's still a bunch more work that I have to do before we actually push a whole updated thing.

Albin Stjerna (Sep 06 2019 at 09:05, on Zulip):

@Aaron Weiss I see, thank you. So what happens is that you first use unification to verify that the types of the assignment are compatible (e.g., I'm not trying to assign a tuple to an integer), and it doesn't modify the provenance variables. That makes sense to me.

Ok, so then I just don't fully understand places-typ(pi, tau-u). I get that it expands to all paths rooted in pi and would return a set, which you use to update gamma with the new mapping and kill the old loans. I don't get where the new mapping is introduced, or what it is mapping to. I guess you want it to expand to pi's old mappings before assignment in order to catch the paths you want to overwrite?

Albin Stjerna (Sep 06 2019 at 09:05, on Zulip):

(Sorry for all the questions!)

Albin Stjerna (Sep 06 2019 at 13:45, on Zulip):

@WG-polonius I now have a draft version of my thesis that I'm not completely embarrassed by and would appreciate any and all feedback on! In particular, I would appreciate feedback on technical aspects. Also, I apologize in advance for mangling the very pretty Oxide typing judgements.

Albins-Thesis-draft-version.pdf

Albin Stjerna (Sep 06 2019 at 13:46, on Zulip):

(It doesn't have the final numbers of my benchmarking re-run, but they should look like the ones in the current version)

Aaron Weiss (Sep 06 2019 at 19:27, on Zulip):

@Albin Stjerna an in-text citation hygiene thing: you should probably just write four authors last names instead of writing three authors last names and "et. al."

Aaron Weiss (Sep 06 2019 at 19:28, on Zulip):

Also, if you do want to make the typing rules a bit cleaner: you can add \, \; and \ as spacing characters that will actually be rendered in math mode.

Albin Stjerna (Sep 06 2019 at 19:33, on Zulip):

Thank you; I'll look into both of them. All named citations are from \citeauthor, but you can probably tweak how many authors it lets through

Tshepang Lekhonkhobe (Sep 06 2019 at 19:48, on Zulip):

@WG-polonius I now have a draft version of my thesis that I'm not completely embarrassed by and would appreciate any and all feedback on! In particular, I would appreciate feedback on technical aspects. Also, I apologize in advance for mangling the very pretty Oxide typing judgements.

Albins-Thesis-draft-version.pdf

I see figure 2.1 and listing 2.1 are mixed up with each other

Tshepang Lekhonkhobe (Sep 06 2019 at 19:50, on Zulip):

same with figure 2.2 and listing 2.2

Aaron Weiss (Sep 06 2019 at 20:09, on Zulip):

@Albin Stjerna I'll try to make some time to read the whole thing soon and send you some feedback if I can.

Albin Stjerna (Sep 07 2019 at 06:33, on Zulip):

Albin Stjerna I'll try to make some time to read the whole thing soon and send you some feedback if I can.

Thank you! :)

Albin Stjerna (Sep 30 2019 at 20:10, on Zulip):

Update: I presented today. I don't think my opponent, an electrical engineer who hasn't taken a compilers course and who clearly didn't spend a lot of time preparing questions (the first one was "what real-world problem does this solve?"), ever understood most of it, but people ate the cookies I made, laughed at most of the jokes, and nobody threw tomatoes so overall I think it went all right.

Tshepang Lekhonkhobe (Oct 01 2019 at 05:36, on Zulip):

what is opponent @Albin Stjerna

Albin Stjerna (Oct 01 2019 at 06:22, on Zulip):

@Tshepang Lekhonkhobe: basically another student who is responsible for reading my thesis draft, give me feedback on it and prepare questions for my presentation.

lqd (Oct 01 2019 at 16:42, on Zulip):

I've finally read the thesis, pretty cool. The stats parts was interesting to get an idea of somewhat typical inputs :thumbs_up:

lqd (Oct 14 2019 at 15:56, on Zulip):

@Albin Stjerna pg 33 you say "we also excluded all functions that had no loans at all fromthe analysis, a surprisingly large portion; almost47%. ", is this data in your 3 078 110 lines (500MB) repo-stats.csv files you once shared with me ? (I ask because I'm seeing 63%)

Albin Stjerna (Oct 14 2019 at 16:06, on Zulip):

@lqd I think I re-ran the analysis since then, but I think so

Albin Stjerna (Oct 14 2019 at 16:06, on Zulip):

I remember that the numbers changed anyway

lqd (Oct 14 2019 at 16:07, on Zulip):

is this new csv available somewhere btw ?

Albin Stjerna (Oct 14 2019 at 16:07, on Zulip):

Err, possibly

lqd (Oct 14 2019 at 16:09, on Zulip):

(as I'm also planning some day to look at the slowest / biggest functions there, probably rust-aaplus or something)

Albin Stjerna (Oct 14 2019 at 16:09, on Zulip):

How did I send it last time?

Albin Stjerna (Oct 14 2019 at 16:09, on Zulip):
$ wc -l repo-stats.csv
 3939484 repo-stats.csv
$ du -h repo-stats.csv
663M    repo-stats.csv
lqd (Oct 14 2019 at 16:11, on Zulip):

here https://rust-lang.zulipchat.com/#narrow/stream/186049-t-compiler.2Fwg-polonius/topic/performance/near/171462851

lqd (Oct 14 2019 at 16:12, on Zulip):

yeah seems like a newer one, 800K more functions or so

Albin Stjerna (Oct 14 2019 at 16:12, on Zulip):

I'm uploading it now! :)

lqd (Oct 14 2019 at 16:13, on Zulip):

awesome, thank you

lqd (Oct 14 2019 at 16:14, on Zulip):

the biggest outliers probably haven't changed since then though ?

Albin Stjerna (Oct 14 2019 at 16:14, on Zulip):

Probably not, but let's find out!

lqd (Oct 14 2019 at 16:21, on Zulip):

I had

highest `borrow_region`: 4049 - imgui-ext parser-emmit_tag_tokens
highest `cfg_edge`: 308072 - fastcmp {{impl}}-feq
highest `invalidates`: 4518016 - capstone-rs {{impl}}[528]-fmt
highest `killed`: 1790 - imgui-ext parser-emmit_tag_tokens
highest `regions`: 41175 - gleam ffi-{{impl}}[1]-load_with
highest `cfg nodes`: 21678 - fastcmp {{impl}}-feq

highest `min(3) Naive runtime`: 779.1390095889801 - rust-aaplus aaelp2000-G_ELP10
highest `min(3) Hybrid runtime`: 777.7702879499411 - rust-aaplus aaelp2000-G_ELP10
highest `min(3) DatafrogOpt runtime`: 768.0480983849848 - rust-aaplus aaelp2000-G_ELP10
Albin Stjerna (Oct 14 2019 at 16:23, on Zulip):

https://we.tl/t-cdzolRegZi

Albin Stjerna (Oct 14 2019 at 16:23, on Zulip):

There, the new one

lqd (Oct 14 2019 at 16:23, on Zulip):

thanks a bunch

lqd (Oct 14 2019 at 16:27, on Zulip):

your loans and borrow_regions columns seem ... strange ? how are you computing loans ?

lqd (Oct 14 2019 at 16:29, on Zulip):

(or maybe the columns changed between the two, I'll check that haha)

lqd (Oct 14 2019 at 16:29, on Zulip):

ah yes ofc it was that, sneaky :)

lqd (Oct 14 2019 at 16:38, on Zulip):

(still looks like 60+% without loans, most likely const/statics/promoted)

lqd (Oct 14 2019 at 16:38, on Zulip):

slowest seem to be in rust-unic now though, only 245

lqd (Oct 14 2019 at 16:39, on Zulip):

aaplus must have OOM-ed :)

lqd (Oct 14 2019 at 16:43, on Zulip):

loving the highest universal_region: 125 - gll parse_grammar-{{impl}}[49]-all-{{closure}}[1] auto generated code

lqd (Oct 14 2019 at 16:49, on Zulip):

/me blinks at highest cfg_edge: 2152391 - zip_codes ZIP_CODES

Albin Stjerna (Oct 14 2019 at 19:10, on Zulip):

Yes there are a few...odd ones there

Albin Stjerna (Oct 18 2019 at 10:21, on Zulip):

Ok, so as I mentioned, Tobias wasn't happy with my thesis. The sunny side of this is that one of the things he came down on really hard was my use of Stupid Jargon (my words), and with some luck one of the things I will produce is very concrete examples for when facts are emitted in a (hopefully) human-readable format. Which is, I guess, great for Polonius documentation, and also makes it worth putting some effort into

Albin Stjerna (Oct 18 2019 at 10:25, on Zulip):

I also have a question about MIR that's been bugging me. So it says in the RFC that MIR is "static single assignment except for borrow expressions", but as far as I can tell it's not at all in SSA form. Assigning to the same variable multiple times does not introduce more temporaries?

Albin Stjerna (Oct 29 2019 at 16:25, on Zulip):

As part of clearing up my thesis, I have made a table of the facts I currently use, with examples. Does anyone see anything wrong with it? pasted image

Albin Stjerna (Oct 29 2019 at 18:35, on Zulip):

I am having some trouble keeping them apart, but you mean for pedagogical reasons?

Albin Stjerna (Oct 29 2019 at 18:35, on Zulip):

Honestly I realised I didn't fully understand killed, and had to try some different versions of the same code to get a feeling for it

Albin Stjerna (Oct 30 2019 at 14:13, on Zulip):

Ok, I think I understand. However, I don't really understand what loan would be killed in this case, the one of xor the one of y?:

let r = &mut x;
if random() {
    r = &mut y;
}
r = &mut z;
Matthew Jasper (Oct 30 2019 at 19:37, on Zulip):

The case where killed facts actually matter is something like

let mut x: T;
let r: &mut T;
let s = &mut *r;
r = &mut x;          // Kills the loan on the line above.
Matthew Jasper (Oct 30 2019 at 19:39, on Zulip):

They're generated in a lot of other cases as well, but usually if there's a live loan that's killed it's also invalidated, which results in an error.

Albin Stjerna (Nov 01 2019 at 09:44, on Zulip):

Ok, so it basically covers some cases but not all. I see!

Albin Stjerna (Nov 01 2019 at 09:44, on Zulip):

(I have precisely that example in my thesis)

Albin Stjerna (Nov 01 2019 at 09:44, on Zulip):

Thank you!

Albin Stjerna (Nov 01 2019 at 12:15, on Zulip):

Me: "Oh no I'm really behind on the work on my thesis"
Also me: pasted image

Last update: Nov 15 2019 at 20:00UTC