Stream: t-compiler

Topic: end-to-end queries

Igor Matuszewski (Jan 12 2019 at 22:35, on Zulip):

@nikomatsakis figured it'd be a good idea to slowly separate a topic for this work - is that fine?

Igor Matuszewski (Jan 12 2019 at 22:40, on Zulip):

I spent some time to understand the code and slowly tried to convert more stuff on the expansion/hir/analysis boundary only to just now understand recent's @Zoxc (e.g. allocating HIR on arena, pushing more checks to incremental infra) great work :grinning_face_with_smiling_eyes:

Igor Matuszewski (Jan 12 2019 at 22:41, on Zulip):

so I decided not to touch much stuff from those PRs and instead slowly chip at Session/pass-based code elsewhere

Igor Matuszewski (Jan 12 2019 at 22:42, on Zulip):

I have a question - I decided to at first querify Session::proc_macro_decls_static which worked out okay but I found out there's a proc_macro_decls_static query that is defined for the external crates via cstore, in the librustc_metadata crate

Igor Matuszewski (Jan 12 2019 at 22:43, on Zulip):

and so I created a separate proc_macro_decls_static_local crate, which only works for LOCAL_CRATE - is there a way to combine those into a single query, where for LOCAL_CRATE key it traverses the AST like before but for other crate nums it tries to pull them from the externally loaded crates?

Igor Matuszewski (Jan 12 2019 at 22:44, on Zulip):

or is such split still beneficial from data flow visualization/analysis point of view?

Igor Matuszewski (Jan 12 2019 at 22:45, on Zulip):

for the context, this is my current diff:

Zoxc (Jan 12 2019 at 23:56, on Zulip):

You should be able to use the proc_macro_decls_static query. It will dispatch to the metadata if the crate is external, otherwise it will run the regular provider

Igor Matuszewski (Jan 13 2019 at 00:12, on Zulip):

oooh, since this exists on Providers but we can specify functions there separately either locally or via extern providers

Igor Matuszewski (Jan 13 2019 at 00:13, on Zulip):

makes sense, thank you!

nikomatsakis (Jan 16 2019 at 14:13, on Zulip):

Sorry @Igor Matuszewski just saw this message, but yeah -- "what @Zoxc said", basically =)

Zoxc (Jan 22 2019 at 15:20, on Zulip):

I now have a concrete plan on how to bring the queries I defined in into the proper query system without causing performance regressions.
By using no_hash (added in and eval_always on the queries, we can avoid the overhead of incremental compilation. We also avoid the need to introduce HashStable impls.
Creating the HIR map will be a single query, which also sets up the Krate, HirBody and Hir dep nodes. These will no longer be treated as inputs and forcing them will result in the HIR map query being run.
We can think of this as the HIR map query producting multiple return value from the perspective of incremental compilation.
Calling tcx.hir() will invoke the HIR map in an ignore context and return a reference to the HIR map. Accessing the HIR map will add the required read edges to the dep graph. Since the HIR map is accessed frequently, it will be cached in GlobalCtxt to avoid the overhead of the query system.

This won't make anything more incremental, but when this is in place we can start splitting these larger queries into smaller incremental ones.

mw (Jan 22 2019 at 16:02, on Zulip):

So hir_map would be a no_hash query and then there would be additional, regular queries for specific lookups?

mw (Jan 22 2019 at 16:02, on Zulip):

(kind of like the codegen_units query?)

Zoxc (Jan 22 2019 at 16:07, on Zulip):

@mw No. We'd just use the HIR map as is (when returned by the query which creates it), as it does it's own dependency tracking.

mw (Jan 22 2019 at 16:12, on Zulip):

Yeah, that would work too. But it would be nice to get rid of the special casing in the medium term.

mw (Jan 22 2019 at 16:13, on Zulip):

hm, although, how would that work?

mw (Jan 22 2019 at 16:14, on Zulip):

a query would do something like let hir_map = tcx.hir_map(); and then hir_map.expect_item(...) or something like that.

mw (Jan 22 2019 at 16:15, on Zulip):

but tcx.hir_map() is a no_hash query, so every query calling it would always be re-evaluated.

mw (Jan 22 2019 at 16:16, on Zulip):

or maybe I am misunderstanding something?

Zoxc (Jan 22 2019 at 16:28, on Zulip):

No query will actually call hir_map though, they'll use tcx.hir() instead, which calls hir_map in an ignore context, so no dependency to the hir_map query will be added

nikomatsakis (Jan 22 2019 at 18:36, on Zulip):

wait what :)

nikomatsakis (Jan 22 2019 at 18:37, on Zulip):

why do we need an ignore context exactly?

nikomatsakis (Jan 22 2019 at 18:37, on Zulip):

you could also imagine that the hir-map is just treated as 'always dirty'

nikomatsakis (Jan 22 2019 at 18:37, on Zulip):


nikomatsakis (Jan 22 2019 at 18:37, on Zulip):

so that it is always constructed?

nikomatsakis (Jan 22 2019 at 18:38, on Zulip):

is that the same thing?

nikomatsakis (Jan 22 2019 at 18:38, on Zulip):

(in general, I'd definitely prefer to have fewer "exceptions" where possible)

Zoxc (Jan 22 2019 at 18:38, on Zulip):

If there's no change to the crate, the HIR map will be green, but still executed since we don't save the result

nikomatsakis (Jan 22 2019 at 18:38, on Zulip):

I'm not entirely convinced the HIR map makes a ton of sense, but it seems like if we move the HIR map methods to queries, we can avoid having any exceptions at all, no? (except maybe a "volatile" flag on the base map, purely for efficiency and/or to avoid adding impls / serialization?) maybe it results in some overhead, that is the fear?

Zoxc (Jan 22 2019 at 18:41, on Zulip):

We need an ignore context since any change to the crate will result in the HIR map query dep node being red. Any queries which has a dependency on it will need to be executed again.

nikomatsakis (Jan 22 2019 at 18:55, on Zulip):

that seems like it's sort of "by design"

nikomatsakis (Jan 22 2019 at 18:55, on Zulip):

that is, most things should not directly interact with the HIR map

nikomatsakis (Jan 22 2019 at 18:55, on Zulip):

but rather queries on the HIR map

nikomatsakis (Jan 22 2019 at 18:55, on Zulip):

but ok I see, you are aiming not to modify the code that is doing tcx.hir().foo()

nikomatsakis (Jan 22 2019 at 18:56, on Zulip):

I feel like I'd rather that the HIR map as an entity goes away, at least eventually, and we just move to

Zoxc (Jan 22 2019 at 18:57, on Zulip):

I am aiming to not modify existing code and also avoid the performance overhead of adding queries which just forward information from the HIR map.

nikomatsakis (Jan 22 2019 at 20:37, on Zulip):

it feels like those queries potentially play an important role, though, in terms of helping to propagate changes

nikomatsakis (Jan 22 2019 at 20:38, on Zulip):

right now, at least IIRC, we have special dep-nodes for sub-parts of the HIR map, basically, but we could sort of remove that whole concept and instead just be talking about the results of queries

nikomatsakis (Jan 22 2019 at 20:39, on Zulip):

(but cost is obviously a concern, we may indeed want some special treatment for optimization purposes -- though I feel like I'd sort of prefer if these still "feel like" queries in the end, just specially implemented ones)

mw (Jan 23 2019 at 09:47, on Zulip):

it might be workable temporary solution but I really want to avoid the hir map having to duplicate all the dep-tracking and hashing logic.

mw (Jan 23 2019 at 09:49, on Zulip):

regarding existing code: tcx.hir() could return small wrapper struct that provides the interface of the hir map but forwards to regular queries.

mw (Jan 23 2019 at 10:04, on Zulip):

I imagine it things to look kind of like this: there's an [no_hash] hir_map() query that does all the mapping for the crate. then there is a regular item_hir_map(DefIndex) that returns a map from LocalId to the actual HIR. and then there's the mentioned wrapper behind tcx.hir() that keeps providing the existing interface. Something like that. There might have to be a separate query for bodies?

Igor Matuszewski (Jan 25 2019 at 17:05, on Zulip):

@nikomatsakis I've been thinking about Lark's "fast" parse that was described in the video and was wondering how we might apply the laziness in our case and how expansion comes into play

Igor Matuszewski (Jan 25 2019 at 17:06, on Zulip):

I imagine we could do a fast parse without doing macro expansion to get a guaranteed, but incomplete set of 'definitions' per given file but when we get to a name resolution on a definition not on that list (meaning it must've been produced during expansion) we might then try to macro-expand a given file and cache any intermediate results?

Igor Matuszewski (Jan 25 2019 at 17:07, on Zulip):

(I hope this topic is good for general idea/discussion and not specifically the existing query system details :grinning: )

nikomatsakis (Jan 25 2019 at 19:06, on Zulip):

@Igor Matuszewski applying to Rust definitely seems harder in some ways. Something I hadn't fully appreciated before is how having had an "IDE-focused compiler" earlier on might have affected the language design in various ways -- closures, impls, and nested items are all examples of things I might have done differently.

nikomatsakis (Jan 25 2019 at 20:15, on Zulip):

@mw @Zoxc so I was thinking more about this. I agree that as a temporary measure, what @Zoxc proposed makes sense, but for a longer term plan I definitely think we should not be doing anything "fancy" but just leveraging the incremental system.

I am not convinced we want a hir_map query at all longer term. We should see if we can make HIR lowering more "targeted" -- e.g., I think you can get the HIR for a struct without having to also create the HIR for its sibling items.

mw (Jan 28 2019 at 14:28, on Zulip):

I guess that we always have to lower an entire "hir owner" at a time so that we can properly assign LocalIds?

mw (Jan 28 2019 at 15:30, on Zulip):

I would be OK with doing @Zoxc's proposal as an intermediate step, but I'd really want to get rid of the manual dep-tracking for the HIR soon thereafter.

mw (Jan 28 2019 at 15:31, on Zulip):

I'd prefer to even skip this intermediate step but that might mean maintaining a branch for the refactoring for quite a while.

Zoxc (Jan 28 2019 at 15:34, on Zulip):

I think we should remove NodeIds before trying to split the HIR map up. We also need some way to deal with non-item local things like impl Trait.

Igor Matuszewski (Jan 28 2019 at 16:43, on Zulip):

Agreed, it’d be good to move over the finish line and merge

nikomatsakis (Jan 28 2019 at 20:48, on Zulip):

removing NodeId and other such refactorings makes sense

Zoxc (Apr 11 2019 at 15:21, on Zulip):

I found yet another reason to want my end-to-end queries PR. I should make a to do list of things blocking on it

nikomatsakis (Apr 12 2019 at 17:40, on Zulip):

@Zoxc actually, that would be a good idea :) (a list of things blocked etc)

Last update: May 24 2020 at 21:20UTC