Stream: t-lang

Topic: Reserving a keyword for asm in the new edition


Amanieu (Apr 21 2020 at 18:26, on Zulip):

I missed the meeting, but the point about reserving a keyword for asm is based on this issue by @Vadim Petrochenkov: https://github.com/rust-lang/project-inline-asm/issues/8

Amanieu (Apr 21 2020 at 18:27, on Zulip):

Macros expanding directly into internal compiler structures (AST) are rather an exception in rustc than a rule.
The compiler frontend is in a process of migration to a token-based model (inspired by proc macros) where each macro expands to some actual non-macro syntax (token stream).

For inline assembly the surface syntax of that non-macro representation is not important right now, except that it should be unambiguous in the expression position, like extern break box { ... } or whatever.
It can probably be wordy or unwieldy to write manually, but not necessarily.

What's important in my opinion is ability to lower different surface asm syntaxes into this common non-macro representation.
Basically, if you are bikeshedding some asm!(syntax1) vs asm!(syntax2) alternative, make sure that both can be converted into a common non-macro form, so that the asm!(syntax2) alternative could always be implemented just as a different proc macro my_asm!(syntax2).
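
(A rough illustration of the point above, not taken from the issue itself. The names `asm_v1!`, `asm_v2!` and `LoweredAsm` are made up, and a plain struct stands in for whatever unstable non-macro syntax rustc would actually use. The sketch only shows two different surface syntaxes lowering to one common form.)

```rust
// Hypothetical stand-in for the common non-macro representation.
// In rustc this would be some unstable built-in syntax, not a struct.
#[derive(Debug, PartialEq)]
struct LoweredAsm {
    template: &'static str,
    inputs: Vec<u64>,
}

// Surface syntax 1: asm_v1!("template", in(a), in(b))
macro_rules! asm_v1 {
    ($template:literal $(, in($input:expr))*) => {
        LoweredAsm { template: $template, inputs: vec![$($input),*] }
    };
}

// Surface syntax 2: asm_v2!([a, b] => "template")
macro_rules! asm_v2 {
    ([$($input:expr),*] => $template:literal) => {
        LoweredAsm { template: $template, inputs: vec![$($input),*] }
    };
}

fn main() {
    let a = asm_v1!("add {0}, {1}", in(1), in(2));
    let b = asm_v2!([1, 2] => "add {0}, {1}");
    // Both surface syntaxes produce the same lowered value.
    assert_eq!(a, b);
    println!("{:?}", a);
}
```

Since both macros target the same lowered form, the second syntax could live in a third-party crate without any compiler support beyond that form.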

Josh Triplett (Apr 21 2020 at 18:36, on Zulip):

Interesting. What's the underlying reason for that?

Josh Triplett (Apr 21 2020 at 18:36, on Zulip):

Why does it matter that asm! be a macro that expands to tokens, rather than a first-class thing the compiler understands?

Amanieu (Apr 21 2020 at 18:43, on Zulip):

Basically this part:

The compiler frontend is in a process of migration to a token-based model (inspired by proc macros) where each macro expands to some actual non-macro syntax (token stream).

Josh Triplett (Apr 21 2020 at 18:46, on Zulip):

That's what I'm trying to understand. Why can the token not just be asm! ?

Josh Triplett (Apr 21 2020 at 18:46, on Zulip):

Or some other token that can't be written in the Rust language?

nikomatsakis (Apr 21 2020 at 18:47, on Zulip):

fwiw, I've been wanting to write a blog post

Josh Triplett (Apr 21 2020 at 18:47, on Zulip):

I get the idea of separating macro expansion from tokens, but why can't we just add an appropriate internal-use-only token for it to expand to?

nikomatsakis (Apr 21 2020 at 18:47, on Zulip):

that advocates for us to sometimes start with the approach of adding (stable) macros

nikomatsakis (Apr 21 2020 at 18:47, on Zulip):

probably with creepy names that aren't meant to be used by end users

Josh Triplett (Apr 21 2020 at 18:48, on Zulip):

@nikomatsakis s/aren't meant to be/can't be/ please? :)

nikomatsakis (Apr 21 2020 at 18:48, on Zulip):

with the intention that folks can use proc macros to experiment with syntaxes and so forth

nikomatsakis (Apr 21 2020 at 18:48, on Zulip):

no, I wrote what I meant

Josh Triplett (Apr 21 2020 at 18:48, on Zulip):

Ah.

Josh Triplett (Apr 21 2020 at 18:48, on Zulip):

I see.

nikomatsakis (Apr 21 2020 at 18:48, on Zulip):

well

pnkfelix (Apr 21 2020 at 18:48, on Zulip):

I vote for unsafe box { ... } to be the new replacement for asm!{ ... }

nikomatsakis (Apr 21 2020 at 18:48, on Zulip):

I think it's plausible for proc macros to generate those magic idents

nikomatsakis (Apr 21 2020 at 18:48, on Zulip):

I guess that would be ok too

nikomatsakis (Apr 21 2020 at 18:48, on Zulip):

I don't care too much about that detail, really

nikomatsakis (Apr 21 2020 at 18:48, on Zulip):

I forget where this came up recently

Josh Triplett (Apr 21 2020 at 18:49, on Zulip):

@nikomatsakis I understand the goal now. But I'm hoping that the internal-use token can be unstable even if the macro syntax is stable?

nikomatsakis (Apr 21 2020 at 18:49, on Zulip):

I guess maybe around yeet/throws? I was thinking about it as a plausible route for something

pnkfelix (Apr 21 2020 at 18:49, on Zulip):

(i.e., it seems very silly to me to reserve a whole new keyword for something that isn't meant to be exposed to end users in the first place.)

nikomatsakis (Apr 21 2020 at 18:49, on Zulip):

@Josh Triplett I guess? that's an impl detail

nikomatsakis (Apr 21 2020 at 18:49, on Zulip):

the main thing is that I would want you to be able to use those proc macros on stable

Josh Triplett (Apr 21 2020 at 18:49, on Zulip):

Right.

nikomatsakis (Apr 21 2020 at 18:49, on Zulip):

and then maybe the proc macros get deprecated once there is a real lang feature that replaces them

nikomatsakis (Apr 21 2020 at 18:49, on Zulip):

but the code keeps working

nikomatsakis (Apr 21 2020 at 18:49, on Zulip):

it's just "old"

Josh Triplett (Apr 21 2020 at 18:49, on Zulip):

So, for instance, the tokens could just as easily be something like #[rustc_internal_asm] {}.

nikomatsakis (Apr 21 2020 at 18:50, on Zulip):

right

Josh Triplett (Apr 21 2020 at 18:50, on Zulip):

:+1:

pnkfelix (Apr 21 2020 at 18:50, on Zulip):

oh that's good

nikomatsakis (Apr 21 2020 at 18:50, on Zulip):

that said, for assembly

nikomatsakis (Apr 21 2020 at 18:50, on Zulip):

it seems like we might be leap-frogging this state

nikomatsakis (Apr 21 2020 at 18:50, on Zulip):

and I'd probably be ok with just asm

nikomatsakis (Apr 21 2020 at 18:50, on Zulip):

but ...

Josh Triplett (Apr 21 2020 at 18:51, on Zulip):

But I can see the general case, and I'd be all for a #[rustc_internal_yeet] attribute.

nikomatsakis (Apr 21 2020 at 18:55, on Zulip):

it occurs to me that some kind of "namespace" like foo::xxx! (..) could be used for this

nikomatsakis (Apr 21 2020 at 18:55, on Zulip):

if we can just find some existing keyword that is suitable ;)

nikomatsakis (Apr 21 2020 at 18:55, on Zulip):

static::xxx! of course, per the C++ tradition

nikomatsakis (Apr 21 2020 at 18:55, on Zulip):

actually, that'd be kind of funny :)

Josh Triplett (Apr 21 2020 at 19:07, on Zulip):

::meta::rustc_internal:: ?

Josh Triplett (Apr 21 2020 at 19:07, on Zulip):

We did reserve ::meta for just such an occasion. :)

nikomatsakis (Apr 21 2020 at 19:48, on Zulip):

I still like ::static::asm!(), but yes :)

Josh Triplett (Apr 21 2020 at 19:59, on Zulip):

Does that mean proc macros would have to write ::r#static::?

nikomatsakis (Apr 22 2020 at 13:33, on Zulip):

well I imagined it as hack-y keyword in the grammar

Vadim Petrochenkov (Apr 22 2020 at 16:42, on Zulip):

@Josh Triplett

That's what I'm trying to understand. Why can the token not just be asm! ?

Well, because you get infinite recursion.
Macros produce tokens, and the tokens are then parsed; if the tokens are parsed back into the same thing that produced them, that's not good.
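
(A small analogy for the recursion problem, using ordinary `macro_rules!` rather than the built-in `asm!`. The names `loops_forever!`, `lowers_once!` and `core_form` are hypothetical. A macro whose output is another call to itself never reaches a non-macro form, whereas one that expands to ordinary syntax is done after one step.)

```rust
// Expands to another invocation of itself, so expansion never bottoms out.
// Defining it is fine; invoking it fails with a recursion-limit error.
#[allow(unused_macros)]
macro_rules! loops_forever {
    ($($t:tt)*) => { loops_forever!($($t)*) };
}

// Expands to plain non-macro syntax (a function call), so expansion stops.
macro_rules! lowers_once {
    ($e:expr) => { core_form($e) };
}

// Hypothetical non-macro "core form" that the macro lowers to.
fn core_form(x: u32) -> u32 {
    x + 1
}

fn main() {
    // let _ = loops_forever!(1); // uncommenting this line breaks the build
    let y = lowers_once!(41);
    assert_eq!(y, 42);
}
```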

Vadim Petrochenkov (Apr 22 2020 at 16:44, on Zulip):

With any kind of eager expansion, user-defined macros will need to be able to expand arbitrary macro calls to their non-macro forms as well, including asm.

Vadim Petrochenkov (Apr 22 2020 at 16:45, on Zulip):

With sufficient amount of hacks we could keep expanding asm! into asm!, but why would we want it, if we could expand it into some unstable but proper syntax instead.

Vadim Petrochenkov (Apr 22 2020 at 16:53, on Zulip):

I'm not sure what benefits we get from using the path syntax like ::static::asm for asm blocks when they are clearly not paths.
(Which would mean either exceptions in the parser to not treat ::static::asm as a path, or hacks in resolve to not resolve paths if they look like ::static::asm.)
You can compose an unambiguous proper syntax like static asm { ... } from the same identifiers instead.

Vadim Petrochenkov (Apr 22 2020 at 16:55, on Zulip):

static looks more like something related to global_asm! though :slight_smile:

Josh Triplett (Apr 22 2020 at 16:56, on Zulip):

Niko already explained the use case, so I understand why it needs to be a separate symbol. I do think it makes sense to use something like #[rustc_internal_asm] though, so we don't need separate reserved keywords.

Vadim Petrochenkov (Apr 22 2020 at 16:59, on Zulip):

If asm is preceded by any other keyword, then we don't need to make it reserved itself.

Josh Triplett (Apr 22 2020 at 17:00, on Zulip):

But that would then require prefixing it with some other keyword.

Vadim Petrochenkov (Apr 22 2020 at 17:01, on Zulip):

Which is not a problem because it's unstable and only generated by asm! in most cases.

Josh Triplett (Apr 22 2020 at 17:01, on Zulip):

Rather than artificially constructing a syntax just for the parser to re-digest, and telling people writing macros to generate that, we could use a straightforward, unique, clearly internal syntax.

Vadim Petrochenkov (Apr 22 2020 at 17:01, on Zulip):

The suggestions with paths and attributes are anything but straightforward though.

Vadim Petrochenkov (Apr 22 2020 at 17:03, on Zulip):

Even if the underlying syntax ends up getting stabilized somewhere in Rust 2024, double keyword syntaxes like unsafe asm { ... } and static unsafe asm { ... } for global asm still look pretty good.

Vadim Petrochenkov (Apr 22 2020 at 17:04, on Zulip):

Plus Rust 2024 could reserve a single new keyword, but that's too much speculation already.

Josh Triplett (Apr 22 2020 at 17:04, on Zulip):

I don't expect the underlying syntax to get stabilized; it's there for experimentation in nightly.

Josh Triplett (Apr 22 2020 at 17:04, on Zulip):

But it'd be nice if it weren't something artificial.

Josh Triplett (Apr 22 2020 at 17:05, on Zulip):

Also, a syntax like unsafe asm closes off future language parsing possibilities. For instance, we've had requests for brace-less unsafe before.

Vadim Petrochenkov (Apr 22 2020 at 17:05, on Zulip):

Summary of my opinion: some arbitrary unstable but proper syntax is the most straightforward and least hacky way to resolve the linked issue.

Josh Triplett (Apr 22 2020 at 17:05, on Zulip):

Or, for that matter, &unsafe arbitrary_name as a raw pointer syntax.

Josh Triplett (Apr 22 2020 at 17:06, on Zulip):

I'm curious: what in the parser makes #[some_arbitrary_attribute] painful to deal with? Is there some other extensible syntax that's more easily parsed later on?

Vadim Petrochenkov (Apr 22 2020 at 17:07, on Zulip):

#[some_arbitrary_attribute] is potentially a macro invocation itself, it's more a "semantic" than "syntactic" entity.

Josh Triplett (Apr 22 2020 at 17:14, on Zulip):

Fair.

Josh Triplett (Apr 22 2020 at 17:15, on Zulip):

Another possibility that seems like it might be more extensible: extern "rustc_asm" { ... arbitrary ... } ?

Sebastian Malton (Apr 22 2020 at 17:48, on Zulip):

That sort of makes sense, since the asm is external to rust

Vadim Petrochenkov (Apr 22 2020 at 18:16, on Zulip):

Yeah, something like keyword literal is ok too.

Vadim Petrochenkov (Apr 22 2020 at 18:18, on Zulip):

(extern "rustc_asm" { specifically needs some relatively large lookahead to discern it from an extern block item though.)

Josh Triplett (Apr 22 2020 at 18:19, on Zulip):

Fair. What would be easiest to parse and already has arbitrary extensibility without closing off future parsing avenues?

Amanieu (Apr 22 2020 at 18:27, on Zulip):

I suggested extern asm in the issue.

Amanieu (Apr 22 2020 at 18:28, on Zulip):

But as you said, it doesn't matter what syntax it expands to: since it's unstable, we can change it whenever we want.

nikomatsakis (Apr 22 2020 at 20:06, on Zulip):

@Vadim Petrochenkov to clarify, I don't particularly care what we do, but what I was saying is that I would like to have some way for us to introduce stable syntax that is not meant for end-users to use, but only to be targeted by procedural macros. This would correspond to us exposing base capabilities but where we don't know yet what the full user syntax should be.

I am somewhat inspired by Ember's model, where they explicitly introduce functions and core capabilities, and then encourage people to experiment "in user space" with different ways to expose those capabilities, and then later (in what they call an "Edition", obviously distinct from how we use the term) they survey those terms, synthesize one into the "final" result, and release it.

But the key point is that those "user space experiments" still work, because they were based on underlying capabilities.

Anyway, when I proposed e.g., _::foo as a syntax, it wasn't meant to be parsed as a path -- rather, when we start parsing paths, we would presumably look ahead for _ followed by :: and, if we see those two tokens, we would look at the next keyword and parse it directly into a foo AST node.
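
(A toy sketch of that lookahead, assuming a made-up `Token`/`Node` model rather than rustc's real parser types. When path parsing begins, two tokens of lookahead decide between the special `_::name` form and an ordinary path.)

```rust
// Hypothetical token and AST types, for illustration only.
#[derive(Debug, PartialEq)]
enum Token {
    Underscore,
    PathSep, // `::`
    Ident(String),
}

#[derive(Debug, PartialEq)]
enum Node {
    Builtin(String),   // e.g. `_::asm`, parsed directly into its own node
    Path(Vec<String>), // e.g. `foo::bar`
}

fn parse_path_or_builtin(tokens: &[Token]) -> Option<Node> {
    // Lookahead: `_` followed by `::` selects the builtin form.
    if let [Token::Underscore, Token::PathSep, Token::Ident(name), ..] = tokens {
        return Some(Node::Builtin(name.clone()));
    }
    // Otherwise parse `ident (:: ident)*` as an ordinary path.
    let mut rest = tokens;
    let mut segments = Vec::new();
    loop {
        match rest {
            [Token::Ident(seg), tail @ ..] => {
                segments.push(seg.clone());
                rest = tail;
            }
            _ => return None,
        }
        match rest {
            [Token::PathSep, tail @ ..] => rest = tail,
            _ => return Some(Node::Path(segments)),
        }
    }
}

fn main() {
    let toks = [Token::Underscore, Token::PathSep, Token::Ident("asm".to_string())];
    println!("{:?}", parse_path_or_builtin(&toks));
}
```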

nikomatsakis (Apr 22 2020 at 20:07, on Zulip):

So e.g. if we wanted to export asm! as such a "core capability" it might be _::asm. Anyway, it doesn't have to be _::, that's not the high-order bit, I mostly just wanted to find some bit of syntax that has no meaning and that we're happy committing to it having no meaning.

nikomatsakis (Apr 22 2020 at 20:07, on Zulip):

Though for all I know that is legal in some way :)

nikomatsakis (Apr 22 2020 at 20:08, on Zulip):

Part of my motivation here is that I think a good way to help defuse hot syntax debates is to have people gain actual experience using the various alternatives.

nikomatsakis (Apr 22 2020 at 20:08, on Zulip):

At least it always clarifies my opinions immensely.

Sebastian Malton (Apr 22 2020 at 20:12, on Zulip):

That makes a lot of sense, and I know that you said that _:: isn't the point. However, it is one of the main contenders listed in https://internals.rust-lang.org/t/bring-enum-variants-in-scope-for-patterns/12104.

I think that getting people to use various syntaxes would be useful, like with .await

Charles Lew (Apr 23 2020 at 02:10, on Zulip):

Maybe we can reserve a contextual keyword called "experimental", and allow Vis experimental {} indefinitely.

#[library_defined_asm]
pub experimental {
        asm! {
        }
}

And let the macro rewrite the whole inner token stream into anything it's happy with.

nikomatsakis (Apr 27 2020 at 13:19, on Zulip):

@Sebastian Malton good point re: _:: and the enum variants

nikomatsakis (Apr 27 2020 at 13:19, on Zulip):

another option would be to grab some sigil like %foo

Josh Triplett (Apr 27 2020 at 14:56, on Zulip):

That has a meaning already.

nikomatsakis (Apr 27 2020 at 14:57, on Zulip):

What meaning is that?

nikomatsakis (Apr 27 2020 at 14:57, on Zulip):

I think it has a meaning as a binary operator, but as a unary one?

Josh Triplett (Apr 27 2020 at 14:57, on Zulip):

Would that be unambiguous, though?

nikomatsakis (Apr 27 2020 at 14:57, on Zulip):

In any case, you can insert "favorite sigil here".. e.g. $

nikomatsakis (Apr 27 2020 at 14:57, on Zulip):

sure, same way that -22 and 22 - 44 is not ambiguous
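
(A minimal sketch of why position alone disambiguates, using a hypothetical `eval` function over integers and `-` only. A `-` seen where an operand is expected is unary; a `-` seen after an operand is binary.)

```rust
use std::iter::Peekable;
use std::str::Chars;

fn skip_ws(chars: &mut Peekable<Chars<'_>>) {
    while chars.peek().map_or(false, |c| c.is_whitespace()) {
        chars.next();
    }
}

// Operand position: a leading `-` can only be the unary operator.
fn parse_operand(chars: &mut Peekable<Chars<'_>>) -> Option<i64> {
    skip_ws(chars);
    if chars.peek() == Some(&'-') {
        chars.next();
        return parse_operand(chars).map(|v| -v);
    }
    let mut digits = String::new();
    while let Some(c) = chars.peek().copied().filter(|c| c.is_ascii_digit()) {
        digits.push(c);
        chars.next();
    }
    digits.parse().ok()
}

// Evaluates `operand (- operand)*`.
fn eval(src: &str) -> Option<i64> {
    let mut chars = src.chars().peekable();
    let mut value = parse_operand(&mut chars)?;
    loop {
        skip_ws(&mut chars);
        match chars.next() {
            // Operator position: `-` here can only be binary subtraction.
            Some('-') => value -= parse_operand(&mut chars)?,
            None => return Some(value),
            Some(_) => return None,
        }
    }
}

fn main() {
    println!("{:?} {:?}", eval("-22"), eval("22 - 44"));
}
```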

nikomatsakis (Apr 27 2020 at 14:58, on Zulip):

but I think picking a fully unused sigil is probably preferable

Josh Triplett (Apr 27 2020 at 14:59, on Zulip):

Right, but that means the parser goes from "I can tell that's an error from the first token" to "expected asm after %", which seems like it'd confuse people.

Josh Triplett (Apr 27 2020 at 14:59, on Zulip):

Agreed.

Josh Triplett (Apr 27 2020 at 15:00, on Zulip):

An unused sigil or a keyword that doesn't have a meaning in this context.

nikomatsakis (Apr 27 2020 at 15:00, on Zulip):

heck we could even actually use emoji :)

Josh Triplett (Apr 27 2020 at 15:12, on Zulip):

Ow

Sebastian Malton (Apr 27 2020 at 17:35, on Zulip):

Actually, using emoji for bikeshedding in general is probably not a terrible idea, since it makes it very clear that it is not a final design

nikomatsakis (Apr 27 2020 at 18:34, on Zulip):

Josh Triplett said:

An unused sigil or a keyword that doesn't have a meaning in this context.

\foo?

Amanieu (Apr 27 2020 at 20:23, on Zulip):

Anyways, I guess the core question is resolved: we don't need to reserve a new keyword and can just continue using the macro as the main entry point for inline assembly.

Amanieu (Apr 27 2020 at 20:23, on Zulip):

The exact syntax we choose doesn't really matter.

Last update: Jun 05 2020 at 22:10UTC