Stream: t-compiler/wg-learning

Topic: macros discussion


mark-i-m (Jul 09 2019 at 19:45, on Zulip):

@Vadim Petrochenkov Hi :wave:
I was wondering if you would have a chance sometime in the next month or so to just have a zulip discussion where you tell us (WG-learning) everything you know about macros/expansion/hygiene. We were thinking this could be less formal (and less work for you) than a compiler lecture series lecture... thoughts?

mark-i-m (Jul 09 2019 at 19:45, on Zulip):

The goal is to fill out that long-standing gap in the rustc-guide

Vadim Petrochenkov (Jul 09 2019 at 20:13, on Zulip):

Ok, I'm at UTC+03:00 and generally available in the evenings (or weekends).

mark-i-m (Jul 09 2019 at 21:43, on Zulip):

@Vadim Petrochenkov Either of those works for me (your evenings are about lunch time for me :) ) Is there a particular date that would work best for you?

mark-i-m (Jul 09 2019 at 21:44, on Zulip):

@WG-learning Does anyone else have a preferred date?

Vadim Petrochenkov (Jul 09 2019 at 22:06, on Zulip):

Is there a particular date that would work best for you?

Nah, not much difference.
(If something changes for a specific day, I'll notify.)

Santiago Pastorino (Jul 09 2019 at 23:28, on Zulip):

week days are better, but I'd say let's wait for @Vadim Petrochenkov to say when they are ready for it and we can set a date

Santiago Pastorino (Jul 09 2019 at 23:29, on Zulip):

also, we should record this so ... I guess it doesn't matter that much when :)

mark-i-m (Jul 10 2019 at 20:22, on Zulip):

also, we should record this so ... I guess it doesn't matter that much when :)

@Santiago Pastorino My thinking was to just use zulip, so we would have the log

mark-i-m (Jul 10 2019 at 20:25, on Zulip):

@Vadim Petrochenkov @WG-learning How about 2 weeks from now: July 24 at 5pm UTC time (if I did the math right, that should be evening for Vadim)

Amanjeev Sethi (Jul 10 2019 at 20:26, on Zulip):

i can try and do this but I am starting a new job that week so cannot promise.

Santiago Pastorino (Jul 10 2019 at 21:37, on Zulip):

Vadim Petrochenkov @WG-learning How about 2 weeks from now: July 24 at 5pm UTC time (if I did the math right, that should be evening for Vadim)

works perfect for me

Santiago Pastorino (Jul 10 2019 at 21:37, on Zulip):

@mark-i-m I have access to the compiler calendar so I can add something there

Santiago Pastorino (Jul 10 2019 at 21:38, on Zulip):

let me know if you want to add an event to the calendar, I can do that

Santiago Pastorino (Jul 10 2019 at 21:38, on Zulip):

how long it would be?

mark-i-m (Jul 11 2019 at 01:11, on Zulip):

let me know if you want to add an event to the calendar, I can do that

mark-i-m (Jul 11 2019 at 01:11, on Zulip):

That could be good :+1:

mark-i-m (Jul 11 2019 at 01:12, on Zulip):

how long it would be?

Let's start with 30 minutes, and if we need to schedule another we can

Vadim Petrochenkov (Jul 14 2019 at 23:16, on Zulip):

5pm UTC

1-2 hours later would be better, 5pm UTC is not evening enough.

Vadim Petrochenkov (Jul 14 2019 at 23:17, on Zulip):

How exactly do you plan the meeting to go (aka how much do I need to prepare)?

Santiago Pastorino (Jul 15 2019 at 13:52, on Zulip):

5pm UTC

1-2 hours later would be better, 5pm UTC is not evening enough.

Scheduled for 7pm UTC then

Santiago Pastorino (Jul 15 2019 at 13:53, on Zulip):

How exactly do you plan the meeting to go (aka how much do I need to prepare)?

/cc @mark-i-m

mark-i-m (Jul 15 2019 at 15:07, on Zulip):

@Vadim Petrochenkov

How exactly do you plan the meeting to go (aka how much do I need to prepare)?

My hope was that this could be less formal than for a compiler lecture series, but it would be nice if you could have in your mind a tour of the design and the code

That is, imagine that a new person was joining the compiler team and needed to get up to speed about macros/expansion/hygiene. What would you tell such a person?

mark-i-m (Jul 23 2019 at 16:33, on Zulip):

@Vadim Petrochenkov Are we still on for tomorrow at 7pm UTC?

Vadim Petrochenkov (Jul 23 2019 at 17:13, on Zulip):

Yes.

Santiago Pastorino (Jul 23 2019 at 18:09, on Zulip):

@Vadim Petrochenkov @mark-i-m I've added an event on rust compiler team calendar

mark-i-m (Jul 24 2019 at 18:53, on Zulip):

@WG-learning @Vadim Petrochenkov Hello!

mark-i-m (Jul 24 2019 at 18:53, on Zulip):

We will be starting in ~7 minutes

mark-i-m (Jul 24 2019 at 18:55, on Zulip):

:wave:

Vadim Petrochenkov (Jul 24 2019 at 18:58, on Zulip):

I'm here.

mark-i-m (Jul 24 2019 at 19:00, on Zulip):

Cool :)

Santiago Pastorino (Jul 24 2019 at 19:00, on Zulip):

hello @Vadim Petrochenkov

mark-i-m (Jul 24 2019 at 19:00, on Zulip):

Shall we start?

mark-i-m (Jul 24 2019 at 19:00, on Zulip):

First off, @Vadim Petrochenkov Thanks for doing this!

Vadim Petrochenkov (Jul 24 2019 at 19:01, on Zulip):

Here's some preliminary data I prepared.

Vadim Petrochenkov (Jul 24 2019 at 19:01, on Zulip):

Below I'll assume #62771 and #62086 have landed.

Vadim Petrochenkov (Jul 24 2019 at 19:01, on Zulip):

Where to find the code:
libsyntax_pos/hygiene.rs - structures related to hygiene and expansion that are kept in global data (can be accessed from any Ident without any context)
libsyntax_pos/lib.rs - some secondary methods like macro backtrace using primary methods from hygiene.rs
libsyntax_ext - implementations of built-in macros (including macro attributes and derives) and some other early code generation facilities like injection of standard library imports or generation of test harness.
libsyntax/config.rs - implementation of cfg/cfg_attr (they are treated specially compared to other macros), should probably be moved into libsyntax/ext.
libsyntax/tokenstream.rs + libsyntax/parse/token.rs - structures for compiler-side tokens, token trees, and token streams.
libsyntax/ext - various expansion-related stuff
libsyntax/ext/base.rs - basic structures used by expansion
libsyntax/ext/expand.rs - some expansion structures and the bulk of expansion infrastructure code - collecting macro invocations, calling into resolve for them, calling their expanding functions, and integrating the results back into AST
libsyntax/ext/placeholder.rs - the part of expand.rs responsible for "integrating the results back into AST"; basically, a "placeholder" is a temporary AST node that is replaced with macro expansion result nodes
libsyntax/ext/build.rs - helper functions for building AST for built-in macros in libsyntax_ext (and user-defined syntactic plugins previously), can probably be moved into libsyntax_ext these days
libsyntax/ext/proc_macro.rs + libsyntax/ext/proc_macro_server.rs - interfaces between the compiler and the stable proc_macro library, converting tokens and token streams between the two representations and sending them through C ABI
libsyntax/ext/tt - implementation of macro_rules, turns macro_rules DSL into something with signature Fn(TokenStream) -> TokenStream that can eat and produce tokens, @mark-i-m knows more about this
librustc_resolve/macros.rs - resolving macro paths, validating those resolutions, reporting various "not found"/"found, but it's unstable"/"expected x, found y" errors
librustc/hir/map/def_collector.rs + librustc_resolve/build_reduced_graph.rs - integrate an AST fragment freshly expanded from a macro into various parent/child structures like module hierarchy or "definition paths"

Primary structures:
HygieneData - global piece of data containing hygiene and expansion info that can be accessed from any Ident without any context
ExpnId - ID of a macro call or desugaring (and also expansion of that call/desugaring, depending on context)
ExpnInfo/InternalExpnData - a subset of properties from both macro definition and macro call available through global data
SyntaxContext - ID of a chain of nested macro definitions (identified by ExpnIds)
SyntaxContextData - data associated with the given SyntaxContext, mostly a cache for results of filtering that chain in different ways
Span - a code location + SyntaxContext
Ident - interned string (Symbol) + Span, i.e. a string with attached hygiene data
TokenStream - a collection of TokenTrees
TokenTree - a token (punctuation, identifier, or literal) or a delimited group (anything inside ()/[]/{})
SyntaxExtension - a lowered macro representation, contains its expander function transforming a tokenstream or AST into tokenstream or AST + some additional data like stability, or a list of unstable features allowed inside the macro.
SyntaxExtensionKind - expander functions may have several different signatures (take one token stream, or two, or a piece of AST, etc), this is an enum that lists them
ProcMacro/TTMacroExpander/AttrProcMacro/MultiItemModifier - traits representing the expander signatures (TODO: change and rename the signatures into something more consistent)
trait Resolver - a trait used to break crate dependencies (so resolver services can be used in libsyntax, despite librustc_resolve and pretty much everything else depending on libsyntax)
ExtCtxt/ExpansionData - various intermediate data kept and used by expansion infra in the process of its work
AstFragment - a piece of AST that can be produced by a macro (may include multiple homogeneous AST nodes, like e.g. a list of items)
Annotatable - a piece of AST that can be an attribute target, almost same thing as AstFragment except for types and patterns that can be produced by macros but cannot be annotated with attributes (TODO: Merge into AstFragment)
trait MacResult - a "polymorphic" AST fragment, something that can turn into a different AstFragment depending on its context (aka AstFragmentKind - item, or expression, or pattern etc.)
Invocation/InvocationKind - a structure describing a macro call, these structures are collected by the expansion infra (InvocationCollector), queued, resolved, expanded when resolved, etc.

Primary algorithms / actions:
TODO
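
To make the layering of the primary structures above easier to picture, here is a minimal sketch of how Span, Ident, TokenTree and TokenStream nest. It is not the compiler's code: all field names, the use of String instead of an interned Symbol, the TokenTree variant names, and the helper same_hygienic_name are simplifying assumptions made for illustration only.

```rust
// Simplified stand-ins for the structures listed above (assumed shapes,
// not the real definitions from libsyntax_pos / libsyntax).

/// ID of a chain of nested macro definitions; 0 stands for "not macro-expanded".
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SyntaxContext(pub u32);

/// A code location plus a SyntaxContext.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Span {
    pub lo: u32,
    pub hi: u32,
    pub ctxt: SyntaxContext,
}

/// A string with attached hygiene data (the real type uses an interned Symbol).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Ident {
    pub name: String,
    pub span: Span,
}

/// A token (punctuation, identifier, or literal) or a delimited group.
#[derive(Clone, Debug, PartialEq)]
pub enum TokenTree {
    Punct(char),
    Ident(Ident),
    Literal(String),
    Group(char, Vec<TokenTree>), // opening delimiter + contents
}

/// A collection of TokenTrees.
pub type TokenStream = Vec<TokenTree>;

/// Hypothetical helper: two identifiers only count as "the same name" here if
/// both the text and the syntax context agree. The real comparison is more
/// involved; this only shows why an Ident carries a Span.
pub fn same_hygienic_name(a: &Ident, b: &Ident) -> bool {
    a.name == b.name && a.span.ctxt == b.span.ctxt
}

fn main() {
    let written = Ident {
        name: "x".to_string(),
        span: Span { lo: 0, hi: 1, ctxt: SyntaxContext(0) },
    };
    let from_macro = Ident {
        name: "x".to_string(),
        span: Span { lo: 10, hi: 11, ctxt: SyntaxContext(1) },
    };
    // Same text, different contexts: treated as different names in this sketch.
    assert!(!same_hygienic_name(&written, &from_macro));

    // `(x + 1)` as a token stream containing one delimited group.
    let stream: TokenStream = vec![TokenTree::Group(
        '(',
        vec![
            TokenTree::Ident(written),
            TokenTree::Punct('+'),
            TokenTree::Literal("1".to_string()),
        ],
    )];
    assert_eq!(stream.len(), 1);
}
```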

mark-i-m (Jul 24 2019 at 19:04, on Zulip):

Very useful :+1:

mark-i-m (Jul 24 2019 at 19:07, on Zulip):

@Vadim Petrochenkov Zulip doesn't have an indication of typing, so I'm not sure if you are waiting for me or not

Vadim Petrochenkov (Jul 24 2019 at 19:07, on Zulip):

The TODO part should be about how a crate transitions from the state "macros exist as written in source" to "all macros are expanded", but I didn't write it yet.

Vadim Petrochenkov (Jul 24 2019 at 19:08, on Zulip):

(That should probably better happen off-line.)

Vadim Petrochenkov (Jul 24 2019 at 19:08, on Zulip):

Now, if you have any questions?

mark-i-m (Jul 24 2019 at 19:09, on Zulip):

Thanks :)

mark-i-m (Jul 24 2019 at 19:12, on Zulip):

/me is still reading :P

mark-i-m (Jul 24 2019 at 19:14, on Zulip):

Ok

mark-i-m (Jul 24 2019 at 19:15, on Zulip):

So I guess my first question is about hygiene, since that remains the most mysterious to me... My understanding is that the parser outputs AST nodes, where each node has a Span

mark-i-m (Jul 24 2019 at 19:16, on Zulip):

In the absence of macros and desugaring, what does the syntax context of an AST node look like?

mark-i-m (Jul 24 2019 at 19:16, on Zulip):

@Vadim Petrochenkov

Vadim Petrochenkov (Jul 24 2019 at 19:17, on Zulip):

Not each node, but many of them.
When a node is not macro-expanded, its context is 0.

Vadim Petrochenkov (Jul 24 2019 at 19:17, on Zulip):

aka SyntaxContext::empty()

Vadim Petrochenkov (Jul 24 2019 at 19:18, on Zulip):

it's a chain that consists of one expansion - expansion 0 aka ExpnId::root.

mark-i-m (Jul 24 2019 at 19:18, on Zulip):

Do all expansions start at root?

Vadim Petrochenkov (Jul 24 2019 at 19:18, on Zulip):

Also, SyntaxContext::empty() is its own father.

mark-i-m (Jul 24 2019 at 19:19, on Zulip):

Is this actually stored somewhere or is it a logical value?

Vadim Petrochenkov (Jul 24 2019 at 19:19, on Zulip):

All expansion hierarchies (there are several of them) start at ExpnId::root.

Vadim Petrochenkov (Jul 24 2019 at 19:19, on Zulip):

Vectors in HygieneData have entries for both ctxt == 0 and expn_id == 0.

Vadim Petrochenkov (Jul 24 2019 at 19:21, on Zulip):

I don't think anyone looks into them much though.

mark-i-m (Jul 24 2019 at 19:21, on Zulip):

Ok

Vadim Petrochenkov (Jul 24 2019 at 19:21, on Zulip):

Speaking of multiple hierarchies...

mark-i-m (Jul 24 2019 at 19:22, on Zulip):

Go ahead :)

Vadim Petrochenkov (Jul 24 2019 at 19:23, on Zulip):

One is parent (expn_id1) -> parent(expn_id2) -> ...

Vadim Petrochenkov (Jul 24 2019 at 19:23, on Zulip):

This is the order in which macros are expanded.

Vadim Petrochenkov (Jul 24 2019 at 19:24, on Zulip):

Well.

Vadim Petrochenkov (Jul 24 2019 at 19:24, on Zulip):

When we are expanding one macro another macro is revealed in its output.

Vadim Petrochenkov (Jul 24 2019 at 19:24, on Zulip):

That's the parent-child relation in this hierarchy.

Vadim Petrochenkov (Jul 24 2019 at 19:25, on Zulip):

InternalExpnData::parent is the child->parent link.

mark-i-m (Jul 24 2019 at 19:25, on Zulip):

So in the above chain expn_id1 is the child?

Vadim Petrochenkov (Jul 24 2019 at 19:26, on Zulip):

Yes.

Vadim Petrochenkov (Jul 24 2019 at 19:26, on Zulip):

The second one is parent (SyntaxContext1) -> parent(SyntaxContext2) -> ...

Vadim Petrochenkov (Jul 24 2019 at 19:26, on Zulip):

This is about nested macro definitions.
When we are expanding one macro another macro definition is revealed in its output.

Vadim Petrochenkov (Jul 24 2019 at 19:27, on Zulip):

SyntaxContextData::parent is the child->parent link here.

Vadim Petrochenkov (Jul 24 2019 at 19:28, on Zulip):

So, SyntaxContext is the whole chain in this hierarchy, and outer_expns are individual elements in the chain.

mark-i-m (Jul 24 2019 at 19:30, on Zulip):

So for example, suppose I have the following:

macro_rules! foo { () => { println!(); } }

fn main() { foo!(); }

Then AST nodes that are finally generated would have parent(expn_id_println) -> parent(expn_id_foo), right?

Vadim Petrochenkov (Jul 24 2019 at 19:30, on Zulip):

Pretty common construction (at least it was, before refactorings) is SyntaxContext::empty().apply_mark(expn_id), which means...

Vadim Petrochenkov (Jul 24 2019 at 19:30, on Zulip):

Then AST nodes that are finally generated would have parent(expn_id_println) -> parent(expn_id_foo), right?

Yes.
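
The first hierarchy (child -> parent links between expansions) can be pictured with a small table of parent pointers. The sketch below is only an illustration of the foo!/println! example above: the Expansions type, its methods, the name field, and the choice to make the root its own parent are assumptions, not the layout of InternalExpnData.

```rust
// Toy model of hierarchy 1: each expansion records the expansion in whose
// output it was revealed.

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ExpnId(usize);

const ROOT: ExpnId = ExpnId(0);

struct ExpnData {
    name: &'static str, // only for printing; an assumption of this sketch
    parent: ExpnId,     // the child -> parent link
}

struct Expansions {
    data: Vec<ExpnData>,
}

impl Expansions {
    fn new() -> Self {
        // Entry 0 is the root; here it is its own parent (assumed).
        Expansions { data: vec![ExpnData { name: "root", parent: ROOT }] }
    }

    /// IDs are handed out in increasing order as new macro calls are discovered.
    fn fresh(&mut self, name: &'static str, parent: ExpnId) -> ExpnId {
        self.data.push(ExpnData { name, parent });
        ExpnId(self.data.len() - 1)
    }

    /// Follow parent links from `id` up to the root.
    fn ancestry(&self, mut id: ExpnId) -> Vec<&'static str> {
        let mut chain = vec![self.data[id.0].name];
        while id != ROOT {
            id = self.data[id.0].parent;
            chain.push(self.data[id.0].name);
        }
        chain
    }
}

fn main() {
    let mut expansions = Expansions::new();
    // `foo!()` is written in the source, so its parent is the root.
    let foo_id = expansions.fresh("foo", ROOT);
    // `println!()` is revealed in the output of `foo!()`.
    let println_id = expansions.fresh("println", foo_id);
    assert_eq!(expansions.ancestry(println_id), vec!["println", "foo", "root"]);
}
```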

mark-i-m (Jul 24 2019 at 19:31, on Zulip):

and outer_expns are individual elements in the chain.

Sorry, what is outer_expns?

Vadim Petrochenkov (Jul 24 2019 at 19:31, on Zulip):

SyntaxContextData::outer_expn

mark-i-m (Jul 24 2019 at 19:31, on Zulip):

Thanks :) Please continue

Vadim Petrochenkov (Jul 24 2019 at 19:32, on Zulip):

...which means a token produced by a built-in macro (which is defined in the root effectively).

mark-i-m (Jul 24 2019 at 19:33, on Zulip):

Where does the expn_id come from?

Vadim Petrochenkov (Jul 24 2019 at 19:33, on Zulip):

Or a stable proc macro, which are always considered to be defined in the root because they are always cross-crate, and we don't have the cross-crate hygiene implemented, ha-ha.

Vadim Petrochenkov (Jul 24 2019 at 19:33, on Zulip):

Where does the expn_id come from?

Vadim Petrochenkov (Jul 24 2019 at 19:34, on Zulip):

ID of the built-in macro call like line!().

Vadim Petrochenkov (Jul 24 2019 at 19:34, on Zulip):

Assigned continuously from 0 to N as soon as we discover new macro calls.

mark-i-m (Jul 24 2019 at 19:36, on Zulip):

Sorry, I didn't quite understand. Do you mean that only built-in macros receive continuous IDs?

Vadim Petrochenkov (Jul 24 2019 at 19:36, on Zulip):

So, the second hierarchy has a catch - the context transplantation hack - https://github.com/rust-lang/rust/pull/51762#issuecomment-401400732.

Vadim Petrochenkov (Jul 24 2019 at 19:37, on Zulip):

Do you mean that only built-in macros receive continuous IDs?

Vadim Petrochenkov (Jul 24 2019 at 19:37, on Zulip):

No, all macro calls receive ID.

Vadim Petrochenkov (Jul 24 2019 at 19:37, on Zulip):

Built-ins have the typical pattern SyntaxContext::empty().apply_mark(expn_id) for syntax contexts produced by them.

mark-i-m (Jul 24 2019 at 19:41, on Zulip):

I see, but this pattern is only used for built-ins, right?

Vadim Petrochenkov (Jul 24 2019 at 19:43, on Zulip):

And also all stable proc macros, see the comments above.

mark-i-m (Jul 24 2019 at 19:43, on Zulip):

Got it

Vadim Petrochenkov (Jul 24 2019 at 19:44, on Zulip):

The third hierarchy is call-site hierarchy.

Vadim Petrochenkov (Jul 24 2019 at 19:45, on Zulip):

If foo!(bar!(ident)) expands into ident

Vadim Petrochenkov (Jul 24 2019 at 19:45, on Zulip):

then hierarchy 1 is root -> foo -> bar -> ident

Vadim Petrochenkov (Jul 24 2019 at 19:46, on Zulip):

but hierarchy 3 is root -> ident

Vadim Petrochenkov (Jul 24 2019 at 19:47, on Zulip):

ExpnInfo::call_site is the child-parent link in this case.

mark-i-m (Jul 24 2019 at 19:49, on Zulip):

When we expand, do we expand foo first or bar? Why is there a hierarchy 1 here? Is that foo expands first and it expands to something that contains bar!(ident)?

Vadim Petrochenkov (Jul 24 2019 at 19:50, on Zulip):

Ah, yes, let's assume both foo and bar are identity macros.

Vadim Petrochenkov (Jul 24 2019 at 19:50, on Zulip):

Then foo!(bar!(ident)) -> expand -> bar!(ident) -> expand -> ident
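
For reference, the identity-macro example can be written as ordinary user code. The macro bodies and the function name expand_demo below are illustrative choices; the point is only that foo! is expanded first and hands bar!(ident) through unchanged, and bar! is expanded afterwards.

```rust
// Two identity macros: each one returns exactly the tokens it was given.
macro_rules! foo {
    ($($t:tt)*) => { $($t)* };
}

macro_rules! bar {
    ($($t:tt)*) => { $($t)* };
}

fn expand_demo() -> i32 {
    let ident = 5;
    // Step 1: foo!(bar!(ident)) expands to bar!(ident).
    // Step 2: bar!(ident) expands to ident.
    foo!(bar!(ident))
}

fn main() {
    assert_eq!(expand_demo(), 5);
}
```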

Vadim Petrochenkov (Jul 24 2019 at 19:51, on Zulip):

If bar were expanded first, that would be eager expansion - https://github.com/rust-lang/rfcs/pull/2320.

mark-i-m (Jul 24 2019 at 19:52, on Zulip):

And after we expand only foo! presumably whatever intermediate state has hierarchy 1 of root->foo->(bar_ident), right?

Vadim Petrochenkov (Jul 24 2019 at 19:52, on Zulip):

(We have it hacked into some built-in macros, but not generally.)

Vadim Petrochenkov (Jul 24 2019 at 19:52, on Zulip):

And after we expand only foo! presumably whatever intermediate state has hierarchy 1 of root->foo->(bar_ident), right?

Vadim Petrochenkov (Jul 24 2019 at 19:53, on Zulip):

Yes.

mark-i-m (Jul 24 2019 at 19:53, on Zulip):

Got it :)

mark-i-m (Jul 24 2019 at 19:56, on Zulip):

It looks like we have ~5 minutes left. This has been very helpful already, but I also have more questions. Shall we try to schedule another meeting in the future?

Vadim Petrochenkov (Jul 24 2019 at 19:57, on Zulip):

Sure, why not.

Vadim Petrochenkov (Jul 24 2019 at 19:58, on Zulip):

A thread for offline questions-answers would be good too.

mark-i-m (Jul 24 2019 at 20:01, on Zulip):

A thread for offline questions-answers would be good too.

I don't mind using this thread, since it already has a lot of info in it. We also plan to summarize the info from this thread into the rustc-guide.

Sure, why not.

Unfortunately, I'm unavailable for a few weeks. Would August 21-ish work for you (and @WG-learning )?

mark-i-m (Jul 24 2019 at 20:01, on Zulip):

@Vadim Petrochenkov Thanks very much for your time and knowledge!

mark-i-m (Jul 24 2019 at 20:02, on Zulip):

One last question: are there more hierarchies?

Vadim Petrochenkov (Jul 24 2019 at 20:03, on Zulip):

Not that I know of.
Three + the context transplantation hack is already more complex than I'd like.

mark-i-m (Jul 24 2019 at 20:04, on Zulip):

Yes, one wonders what it would be like if one also had to think about eager expansion...

Santiago Pastorino (Jul 24 2019 at 20:09, on Zulip):

sorry but I couldn't follow that much today, will read it when I have some time later

Santiago Pastorino (Jul 24 2019 at 20:09, on Zulip):

btw https://github.com/rust-lang/rustc-guide/issues/398

mark-i-m (Aug 06 2019 at 19:52, on Zulip):

@Vadim Petrochenkov Would 7pm UTC on August 21 work for a followup?

Vadim Petrochenkov (Aug 06 2019 at 20:21, on Zulip):

Tentatively yes.

mark-i-m (Aug 16 2019 at 19:50, on Zulip):

@Vadim Petrochenkov @WG-learning Does this still work for everyone?

Vadim Petrochenkov (Aug 17 2019 at 08:25, on Zulip):

August 21 is still ok.

mark-i-m (Aug 21 2019 at 18:30, on Zulip):

@WG-learning @Vadim Petrochenkov We will start in ~30min

Vadim Petrochenkov (Aug 21 2019 at 18:40, on Zulip):

Oh.
Thanks for the reminder, I forgot about this entirely.

mark-i-m (Aug 21 2019 at 19:00, on Zulip):

Hello!

Vadim Petrochenkov (Aug 21 2019 at 19:00, on Zulip):

(I'll be here in a couple of minutes.)

Vadim Petrochenkov (Aug 21 2019 at 19:06, on Zulip):

Ok, I'm here.

mark-i-m (Aug 21 2019 at 19:06, on Zulip):

Hi :)

Vadim Petrochenkov (Aug 21 2019 at 19:06, on Zulip):

Hi.

mark-i-m (Aug 21 2019 at 19:06, on Zulip):

so last time, we talked about the 3 context hierarchies

Vadim Petrochenkov (Aug 21 2019 at 19:06, on Zulip):

Right.

mark-i-m (Aug 21 2019 at 19:08, on Zulip):

Was there anything you wanted to add to that? If not, I think it would be good to get a big-picture... Given some piece of rust code, how do we get to the point where things are expanded and hygiene context is computed?

mark-i-m (Aug 21 2019 at 19:09, on Zulip):

(I'm assuming that hygiene info is computed as we expand stuff, since I don't think you can discover it beforehand)

Vadim Petrochenkov (Aug 21 2019 at 19:09, on Zulip):

Ok, let's move from hygiene to expansion.

Vadim Petrochenkov (Aug 21 2019 at 19:10, on Zulip):

Especially given that I don't remember the specific hygiene algorithms like adjust in detail.

Vadim Petrochenkov (Aug 21 2019 at 19:11, on Zulip):

Given some piece of rust code, how do we get to the point where things are expanded

So, first of all, the "some piece of rust code" is the whole crate.

mark-i-m (Aug 21 2019 at 19:11, on Zulip):

Just to confirm, the algorithms are well-encapsulated, right? Like a function or a struct as opposed to a bunch of conventions distributed across the codebase?

Vadim Petrochenkov (Aug 21 2019 at 19:11, on Zulip):

We run fully_expand_fragment in it.

Vadim Petrochenkov (Aug 21 2019 at 19:12, on Zulip):

Just to confirm, the algorithms are well-encapsulated, right?

Yes, the algorithmic parts are entirely inside hygiene.rs.

Vadim Petrochenkov (Aug 21 2019 at 19:13, on Zulip):

Ok, some are in fn resolve_crate_root, but those are hacks.

Vadim Petrochenkov (Aug 21 2019 at 19:14, on Zulip):

(Continuing about expansion.)
If fully_expand_fragment is run not on a whole crate, it means that we are performing eager expansion.

Vadim Petrochenkov (Aug 21 2019 at 19:15, on Zulip):

Eager expansion is done for arguments of some built-in macros that expect literals.

Vadim Petrochenkov (Aug 21 2019 at 19:15, on Zulip):

It generally performs a subset of actions performed by the non-eager expansion.

Vadim Petrochenkov (Aug 21 2019 at 19:15, on Zulip):

So, I'll talk about non-eager expansion for now.

mark-i-m (Aug 21 2019 at 19:17, on Zulip):

Eager expansion is not exposed as a language feature, right? i.e. it is not possible for me to write an eager macro?

Vadim Petrochenkov (Aug 21 2019 at 19:17, on Zulip):

https://github.com/rust-lang/rust/pull/53778#issuecomment-419224049
(vvv The link is explained below vvv )

Vadim Petrochenkov (Aug 21 2019 at 19:17, on Zulip):

Eager expansion is not exposed as a language feature, right? i.e. it is not possible for me to write an eager macro?

Yes, it's entirely an ability of some built-in macros.

Vadim Petrochenkov (Aug 21 2019 at 19:18, on Zulip):

Not exposed for general use.

Vadim Petrochenkov (Aug 21 2019 at 19:18, on Zulip):

fully_expand_fragment works in iterations.

Vadim Petrochenkov (Aug 21 2019 at 19:21, on Zulip):

Iterations look roughly like this:
- Resolve imports in our partially built crate as much as possible.
- Collect as many macro invocations as possible (fn-like, attributes, derives) from our partially built crate and add them to the queue.

Vadim Petrochenkov (Aug 21 2019 at 19:21, on Zulip):
Vadim Petrochenkov (Aug 21 2019 at 19:22, on Zulip):
Vadim Petrochenkov (Aug 21 2019 at 19:23, on Zulip):
Vadim Petrochenkov (Aug 21 2019 at 19:23, on Zulip):

^^^ That's where we fill in the hygiene data associated with ExpnIds.

mark-i-m (Aug 21 2019 at 19:24, on Zulip):

When we put it back in the queue?

mark-i-m (Aug 21 2019 at 19:25, on Zulip):

or do you mean the collect step in general?

Vadim Petrochenkov (Aug 21 2019 at 19:25, on Zulip):

Once we resolved the macro call to the macro definition we know everything about the macro and can call set_expn_data to fill in its properties in the global data.

Vadim Petrochenkov (Aug 21 2019 at 19:25, on Zulip):

I mean, immediately after successful resolution.

Vadim Petrochenkov (Aug 21 2019 at 19:26, on Zulip):

That's the first part of hygiene data, the second one is associated with SyntaxContext rather than with ExpnId, it's filled in later during expansion.

Vadim Petrochenkov (Aug 21 2019 at 19:28, on Zulip):

So, after we run the macro's expander function and got a piece of AST (or got tokens and parsed them into a piece of AST) we need to integrate that piece of AST into the big existing partially built AST.

Vadim Petrochenkov (Aug 21 2019 at 19:29, on Zulip):

This integration is a really important step where the following things happen:
- NodeIds are assigned.

Vadim Petrochenkov (Aug 21 2019 at 19:30, on Zulip):
Vadim Petrochenkov (Aug 21 2019 at 19:31, on Zulip):
Vadim Petrochenkov (Aug 21 2019 at 19:32, on Zulip):

So, we are basically turning some vague token-like mass into a proper set-in-stone hierarchical AST and side tables.

Vadim Petrochenkov (Aug 21 2019 at 19:34, on Zulip):

Where exactly this happens - NodeIds are assigned by InvocationCollector (which also collects new macro calls from this new AST piece and adds them to the queue), DefIds are created by DefCollector, and modules are filled by BuildReducedGraphVisitor.

Vadim Petrochenkov (Aug 21 2019 at 19:35, on Zulip):

These three passes run one after another on every AST fragment freshly expanded from a macro.

Vadim Petrochenkov (Aug 21 2019 at 19:37, on Zulip):

After expanding a single macro and integrating its output we again try to resolve all imports in the crate, and then return to the big queue processing loop and pick up the next macro.

Vadim Petrochenkov (Aug 21 2019 at 19:38, on Zulip):

Repeat until there's no more macros.
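
The queue-driven loop described in this part of the discussion can be sketched as a toy worklist. This is a heavily simplified model and not fully_expand_fragment itself: the Node type, the integrate helper, and the idea that a macro's "expander" is just a fixed list of nodes are all assumptions. Import resolution, NodeId/DefId assignment, and output ordering are left out.

```rust
use std::collections::{HashMap, VecDeque};

// Toy AST: a macro definition with a fixed expansion, a macro call,
// or an already fully expanded leaf.
#[derive(Clone, Debug)]
enum Node {
    Def(String, Vec<Node>),
    Call(String),
    Leaf(String),
}

/// "Integrate" a freshly produced fragment: record new definitions,
/// queue newly revealed calls, and keep finished leaves.
fn integrate(
    nodes: Vec<Node>,
    defs: &mut HashMap<String, Vec<Node>>,
    queue: &mut VecDeque<String>,
    out: &mut Vec<String>,
) {
    for node in nodes {
        match node {
            Node::Def(name, body) => {
                defs.insert(name, body);
            }
            Node::Call(name) => queue.push_back(name),
            Node::Leaf(text) => out.push(text),
        }
    }
}

/// Returns the expanded leaves, or the names of calls that never resolved.
/// Note: a toy macro that expands to a call of itself would loop forever here.
fn fully_expand(krate: Vec<Node>) -> Result<Vec<String>, Vec<String>> {
    let mut defs = HashMap::new();
    let mut queue = VecDeque::new();
    let mut out = Vec::new();
    integrate(krate, &mut defs, &mut queue, &mut out);

    while !queue.is_empty() {
        let mut progress = false;
        // One iteration: try every call that was queued when the iteration began.
        for _ in 0..queue.len() {
            let name = queue.pop_front().unwrap();
            match defs.get(&name).cloned() {
                // Resolved: "run the expander" and integrate its output.
                Some(body) => {
                    integrate(body, &mut defs, &mut queue, &mut out);
                    progress = true;
                }
                // Not resolved yet: put it back into the queue.
                None => queue.push_back(name),
            }
        }
        // No progress during a whole iteration means we are stuck.
        if !progress {
            return Err(queue.into_iter().collect());
        }
    }
    Ok(out)
}

fn main() {
    // `late!` only becomes resolvable after `a!` has been expanded.
    let krate = vec![
        Node::Call("late".into()),
        Node::Call("a".into()),
        Node::Def("a".into(), vec![Node::Def("late".into(), vec![Node::Leaf("ok".into())])]),
    ];
    assert_eq!(fully_expand(krate), Ok(vec!["ok".to_string()]));
}
```

In this toy, a stuck queue is simply reported as an error; it makes no attempt at the kind of recovery a real compiler needs for good diagnostics.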

Vadim Petrochenkov (Aug 21 2019 at 19:38, on Zulip):

mark-i-m (Aug 21 2019 at 19:38, on Zulip):

The integration step is where we would get parser errors too right?

mark-i-m (Aug 21 2019 at 19:39, on Zulip):

Also, when do we know definitively that resolution has failed for particular ident?

Vadim Petrochenkov (Aug 21 2019 at 19:39, on Zulip):

The integration step is where we would get parser errors too right?

Yes, if the macro produced tokens (rather than AST directly) and we had to parse them.

Vadim Petrochenkov (Aug 21 2019 at 19:42, on Zulip):

when do we know definitively that resolution has failed for particular ident?

So, ident is looked up in a number of scopes during resolution.
From closest like the current block or module, to far away like preludes or built-in types.

Vadim Petrochenkov (Aug 21 2019 at 19:42, on Zulip):

If the lookup has certainly failed in all of the scopes, then it has certainly failed.

mark-i-m (Aug 21 2019 at 19:43, on Zulip):

This is after all expansions and integrations are done, right?

Vadim Petrochenkov (Aug 21 2019 at 19:43, on Zulip):

"Certainly" is determined differently for different scopes, e.g. for a module scope it means no unexpanded macros and no unresolved glob imports in that module.

Vadim Petrochenkov (Aug 21 2019 at 19:44, on Zulip):

This is after all expansions and integrations are done, right?

For macro and import names this happens during expansions and integrations.

mark-i-m (Aug 21 2019 at 19:45, on Zulip):

Makes sense

Vadim Petrochenkov (Aug 21 2019 at 19:45, on Zulip):

For all other names we certainly know whether a name is resolved successfully or not on the first attempt, because no new names can appear.

Vadim Petrochenkov (Aug 21 2019 at 19:45, on Zulip):

(They are resolved in a later pass, see librustc_resolve/late.rs.)
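
The idea of a lookup that has "certainly" failed can be illustrated with a three-valued result. The sketch below is an assumption-laden toy, not the resolver's algorithm: the Scope and Lookup types and the may_grow flag (standing in for "has unexpanded macros or unresolved glob imports") are invented for illustration, and it ignores the possibility that a closer, still-growing scope could later shadow a match found further out.

```rust
#[derive(Debug, PartialEq)]
enum Lookup {
    Found(&'static str), // label of the scope where the name was found
    Undetermined,        // not found yet, but some scope may still gain names
    NotFound,            // certainly failed in every scope
}

struct Scope {
    label: &'static str,
    names: Vec<&'static str>,
    // Stand-in for "unexpanded macros or unresolved glob imports in this scope".
    may_grow: bool,
}

/// Look the name up from the closest scope to the farthest one.
fn resolve(name: &str, scopes: &[Scope]) -> Lookup {
    let mut undetermined = false;
    for scope in scopes {
        if scope.names.iter().any(|n| *n == name) {
            return Lookup::Found(scope.label);
        }
        if scope.may_grow {
            // The failure in this scope is not certain yet.
            undetermined = true;
        }
    }
    if undetermined { Lookup::Undetermined } else { Lookup::NotFound }
}

fn main() {
    let scopes = [
        Scope { label: "block", names: vec![], may_grow: false },
        Scope { label: "module", names: vec!["helper"], may_grow: true },
        Scope { label: "prelude", names: vec!["vec"], may_grow: false },
    ];
    assert_eq!(resolve("vec", &scopes), Lookup::Found("prelude"));
    // The module still has unexpanded macros, so the failure is not certain.
    assert_eq!(resolve("nope", &scopes), Lookup::Undetermined);
}
```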

mark-i-m (Aug 21 2019 at 19:45, on Zulip):

And if at the end of the iteration, there are still things in the queue that can't be resolved, this represents an error, right?

mark-i-m (Aug 21 2019 at 19:46, on Zulip):

i.e. an undefined macro?

Vadim Petrochenkov (Aug 21 2019 at 19:46, on Zulip):

Yes, if we make no progress during an iteration, then we are stuck and that state represents an error.

Vadim Petrochenkov (Aug 21 2019 at 19:47, on Zulip):

We attempt to recover though, using dummies expanding into nothing or ExprKind::Err or something like that for unresolved macros.

mark-i-m (Aug 21 2019 at 19:48, on Zulip):

This is for the purposes of diagnostics, though, right?

Vadim Petrochenkov (Aug 21 2019 at 19:48, on Zulip):

But if we are going through recovery, then compilation must result in an error anyway.

Vadim Petrochenkov (Aug 21 2019 at 19:49, on Zulip):

Yes, that's for diagnostics, without recovery we would get stuck at the first unresolved macro or import.

Vadim Petrochenkov (Aug 21 2019 at 19:50, on Zulip):

So, about the SyntaxContext hygiene...

Vadim Petrochenkov (Aug 21 2019 at 19:51, on Zulip):

New syntax contexts are created during macro expansion.

Vadim Petrochenkov (Aug 21 2019 at 19:53, on Zulip):

If the token had context X before being produced by a macro, e.g. here ident has context SyntaxContext::root():

Vadim Petrochenkov (Aug 21 2019 at 19:53, on Zulip):

macro m() {
    ident
}

Vadim Petrochenkov (Aug 21 2019 at 19:54, on Zulip):

, then after being produced by the macro it has context X -> macro_id.

Vadim Petrochenkov (Aug 21 2019 at 19:55, on Zulip):

I.e. our ident has context ROOT -> id(m) after it's produced by m.

Vadim Petrochenkov (Aug 21 2019 at 19:56, on Zulip):

The "chaining operator" -> is apply_mark in compiler code.

Vadim Petrochenkov (Aug 21 2019 at 19:57, on Zulip):

macro m() {
    macro n() {
        ident
    }
}

Vadim Petrochenkov (Aug 21 2019 at 19:58, on Zulip):

In this example the ident has context ROOT originally, then ROOT -> id(m), then ROOT -> id(m) -> id(n).

Vadim Petrochenkov (Aug 21 2019 at 20:00, on Zulip):

Note that these chains are not entirely determined by their last element, in other words ExpnId is not isomorphic to SyntaxContext.

Vadim Petrochenkov (Aug 21 2019 at 20:00, on Zulip):

Counterexample:

Vadim Petrochenkov (Aug 21 2019 at 20:01, on Zulip):

macro m($i: ident) {
    macro n() {
        ($i, bar)
    }
}

m!(foo);

Vadim Petrochenkov (Aug 21 2019 at 20:02, on Zulip):

foo has context ROOT -> id(n) and bar has context ROOT -> id(m) -> id(n) after all the expansions.
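
The counterexample can be replayed with a toy version of the chaining operator. The sketch below is an assumption, not hygiene.rs: the Hygiene type, its interning map, and the chain helper are invented for illustration, and the real apply_mark handles more cases than this. It only shows that two contexts can end in the same expansion id(n) and still be different chains.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct ExpnId(u32);

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct SyntaxContext(u32);

const ROOT: SyntaxContext = SyntaxContext(0);

struct SyntaxContextData {
    parent: SyntaxContext, // the chain without its last element
    outer_expn: ExpnId,    // the last element of the chain
}

struct Hygiene {
    contexts: Vec<SyntaxContextData>,
    // Assumed interning table so that equal chains get equal IDs.
    interned: HashMap<(SyntaxContext, ExpnId), SyntaxContext>,
}

impl Hygiene {
    fn new() -> Self {
        // Context 0 is the root; it is its own parent.
        let root = SyntaxContextData { parent: ROOT, outer_expn: ExpnId(0) };
        Hygiene { contexts: vec![root], interned: HashMap::new() }
    }

    /// The "chaining operator": ctxt -> expn.
    fn apply_mark(&mut self, ctxt: SyntaxContext, expn: ExpnId) -> SyntaxContext {
        if let Some(&existing) = self.interned.get(&(ctxt, expn)) {
            return existing;
        }
        let new = SyntaxContext(self.contexts.len() as u32);
        self.contexts.push(SyntaxContextData { parent: ctxt, outer_expn: expn });
        self.interned.insert((ctxt, expn), new);
        new
    }

    /// List the expansions of a chain, outermost first (the root is omitted).
    fn chain(&self, mut ctxt: SyntaxContext) -> Vec<ExpnId> {
        let mut out = Vec::new();
        while ctxt != ROOT {
            let data = &self.contexts[ctxt.0 as usize];
            out.push(data.outer_expn);
            ctxt = data.parent;
        }
        out.reverse();
        out
    }
}

fn main() {
    let mut h = Hygiene::new();
    let (m, n) = (ExpnId(1), ExpnId(2));
    // `foo` was written at the call site of m, so m's definition is not in its chain.
    let foo = h.apply_mark(ROOT, n);
    // `bar` was written inside m's definition, so it is marked by m and then by n.
    let after_m = h.apply_mark(ROOT, m);
    let bar = h.apply_mark(after_m, n);
    assert_eq!(h.chain(foo), vec![n]);
    assert_eq!(h.chain(bar), vec![m, n]);
    // Same last element, different contexts.
    assert_ne!(foo, bar);
}
```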

mark-i-m (Aug 21 2019 at 20:05, on Zulip):

Cool :)

mark-i-m (Aug 21 2019 at 20:05, on Zulip):

It looks like we are out of time

mark-i-m (Aug 21 2019 at 20:05, on Zulip):

Is there anything you wanted to add?

mark-i-m (Aug 21 2019 at 20:06, on Zulip):

We can schedule another meeting if you would like

Vadim Petrochenkov (Aug 21 2019 at 20:07, on Zulip):

Yep, 23.06 already.
No, I think this is an ok point to stop.

mark-i-m (Aug 21 2019 at 20:07, on Zulip):

:+1:

mark-i-m (Aug 21 2019 at 20:07, on Zulip):

Thanks @Vadim Petrochenkov ! This was very helpful

Vadim Petrochenkov (Aug 21 2019 at 20:09, on Zulip):

Yeah, we can schedule another one.
So far it's been like 1 hour of meetings per month? Certainly not a big burden.

Last update: Nov 15 2019 at 20:40UTC