Stream: t-compiler/wg-rls-2.0

Topic: Lexer


std::Veetaha (Jan 23 2020 at 00:03, on Zulip):

Are we going to stick with using the rustc_lexer crate in the future, or is this temporary?

matklad (Jan 23 2020 at 00:06, on Zulip):

This is the opposite of temporary

matklad (Jan 23 2020 at 00:07, on Zulip):

We probably should tweak our integration with rustc_lexer

matklad (Jan 23 2020 at 00:08, on Zulip):

and we might tweak the interface and scope of rustc_lexer (things like lowering an integer literal to an integer value ideally should be in rustc_lexer), but the overall structure should be roughly as it is today

std::Veetaha (Jan 23 2020 at 00:10, on Zulip):

And what would you want to tweak in our integration with rustc_lexer today?

matklad (Jan 23 2020 at 00:14, on Zulip):

At least not dropping errors on the floor. Quite probably something else: the current contents of lexer.rs are basically the minimal diff that made it possible to throw away our previous bespoke lexer and replace it with rustc_lexer. For example,

            rustc_lexer::TokenKind::Ident => {
                let token_text = &text[..rustc_token.len];
                if token_text == "_" {
                    UNDERSCORE
                } else {
                    SyntaxKind::from_keyword(&text[..rustc_token.len]).unwrap_or(IDENT)
                }
            }

immediately jumps out at me as something we might want to change on either our side or the rustc_lexer side
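
One way the snippet above could be tightened on our side is sketched here: slice the token text once and match on it. This is only an illustration. The `SyntaxKind` enum below is a hypothetical stand-in with a handful of variants, `from_keyword` is assumed to behave as its use in the snippet suggests, and `classify_ident` is a made-up name.

```rust
// Hypothetical stand-in for the real SyntaxKind; only a few variants are shown.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SyntaxKind {
    Underscore,
    Ident,
    FnKw,
    LetKw,
}

impl SyntaxKind {
    // Assumed behaviour: maps keyword text to its kind, or None for non-keywords.
    fn from_keyword(text: &str) -> Option<SyntaxKind> {
        match text {
            "fn" => Some(SyntaxKind::FnKw),
            "let" => Some(SyntaxKind::LetKw),
            _ => None,
        }
    }
}

// Classifies the text of an identifier-like token.
// The caller slices the source text once and passes the slice in.
fn classify_ident(token_text: &str) -> SyntaxKind {
    match token_text {
        "_" => SyntaxKind::Underscore,
        _ => SyntaxKind::from_keyword(token_text).unwrap_or(SyntaxKind::Ident),
    }
}
```

With these definitions, `classify_ident("_")` yields `Underscore`, `classify_ident("fn")` yields `FnKw`, and any other identifier falls through to `Ident`.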

matklad (Jan 23 2020 at 00:15, on Zulip):

I don't have time right now to investigate what exactly needs to be changed and how :D

std::Veetaha (Jan 23 2020 at 00:23, on Zulip):

Yeah, I am currently working on that not-dropping-errors-on-the-floor issue. But I didn't know we had another lexer implementation in the past. Regarding that UNDERSCORE token, I'll also look at it, but after I am done with preserving errors.
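
A minimal sketch of what "not dropping errors on the floor" could look like: the conversion step returns the errors next to the tokens instead of discarding them. All types and names here (`RawKind`, `RawToken`, `LexError`, `collect_errors`) are hypothetical and only illustrate the shape of such an API, not the actual lexer.rs or rustc_lexer code.

```rust
// Hypothetical raw token kind; a malformed token is flagged rather than rejected.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RawKind {
    Ident,
    Whitespace,
    // `terminated` is false when the closing `*/` is missing.
    BlockComment { terminated: bool },
}

// Hypothetical raw token: a kind plus its length in bytes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct RawToken {
    kind: RawKind,
    len: usize,
}

// Hypothetical error type: a message plus the byte range it applies to.
#[derive(Debug, Clone, PartialEq, Eq)]
struct LexError {
    message: &'static str,
    start: usize,
    end: usize,
}

// Walks the raw tokens, keeps every token, and records an error
// for each unterminated block comment instead of silently ignoring it.
fn collect_errors(raw: &[RawToken]) -> (Vec<RawToken>, Vec<LexError>) {
    let mut tokens = Vec::with_capacity(raw.len());
    let mut errors = Vec::new();
    let mut offset = 0;
    for &token in raw {
        if let RawKind::BlockComment { terminated: false } = token.kind {
            errors.push(LexError {
                message: "unterminated block comment",
                start: offset,
                end: offset + token.len,
            });
        }
        tokens.push(token);
        offset += token.len;
    }
    (tokens, errors)
}
```

The design choice in this sketch is that a malformed token still appears in the token list, so the parser sees the whole input while the errors travel in a separate list.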

std::Veetaha (Jan 23 2020 at 00:36, on Zulip):

And by "lowering an integer literal to an integer value" do you mean parsing the integer token string representation like here? I also wonder whether this is legal at all (I mean there may be some rust-specific features of int literals that str::parse::<usize>() may not understand...)
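
To make the concern concrete: `str::parse::<usize>()` only understands plain decimal digits (with an optional leading `+`), so it rejects `1_000`, `0xff` and `1usize`. Below is a rough sketch of what lowering an integer literal could involve. The function `lower_int_literal` is hypothetical, it assumes the text was already lexed as an integer literal, and it does not check that the value fits the suffix type.

```rust
// Hypothetical helper, not part of rustc_lexer: lowers the text of an
// already-lexed integer literal to its value.
fn lower_int_literal(text: &str) -> Option<u128> {
    const SUFFIXES: [&str; 12] = [
        "i8", "i16", "i32", "i64", "i128", "isize",
        "u8", "u16", "u32", "u64", "u128", "usize",
    ];
    // Integer suffixes start with `i` or `u`; neither is a hex digit,
    // so the first such character marks the start of the suffix.
    let (body, suffix) = match text.find(|c: char| c == 'i' || c == 'u') {
        Some(idx) => text.split_at(idx),
        None => (text, ""),
    };
    if !suffix.is_empty() && !SUFFIXES.contains(&suffix) {
        return None;
    }
    // Pick the radix from the prefix, if any.
    let (radix, digits) = if let Some(rest) = body.strip_prefix("0x") {
        (16, rest)
    } else if let Some(rest) = body.strip_prefix("0o") {
        (8, rest)
    } else if let Some(rest) = body.strip_prefix("0b") {
        (2, rest)
    } else {
        (10, body)
    };
    // Underscores are digit separators and carry no value.
    let digits: String = digits.chars().filter(|&c| c != '_').collect();
    u128::from_str_radix(&digits, radix).ok()
}
```

For a bare decimal index such as the `0` in `pair.0`, plain `str::parse` is enough; the extra steps matter only for general literals.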

matklad (Jan 23 2020 at 00:44, on Zulip):

That code is specifically for tuple suffixes, so it is likely more or less ok. I mean something that makes sure that 4_2f32 is 42.0
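
As an illustration of the `4_2f32` case, here is a small sketch that strips the type suffix and the underscore separators before parsing. `lower_float_literal` is a hypothetical name, the input is assumed to be an already-lexed decimal literal, and the value is computed as `f64` regardless of the suffix.

```rust
// Hypothetical helper: lowers a decimal literal with an optional
// `f32`/`f64` suffix to a floating-point value.
fn lower_float_literal(text: &str) -> Option<f64> {
    // Drop the type suffix, if present.
    let body = text
        .strip_suffix("f32")
        .or_else(|| text.strip_suffix("f64"))
        .unwrap_or(text);
    // Underscores are digit separators and carry no value.
    let digits: String = body.chars().filter(|&c| c != '_').collect();
    digits.parse::<f64>().ok()
}
```

Here `lower_float_literal("4_2f32")` returns `Some(42.0)`.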

std::Veetaha (Jan 23 2020 at 00:55, on Zulip):

Hmm, it's hard to find the place where we parse the values of numeric literals... I don't see where it is implemented now.

matklad (Jan 23 2020 at 09:03, on Zulip):

we don't do that yet, but rustc does

std::Veetaha (Jan 23 2020 at 10:01, on Zulip):

Okay, I guess we would need to expose that as a public API on the rustc_lexer crate. Today I'll create an issue for that in their repo so as not to forget about it.

matklad (Jan 23 2020 at 10:02, on Zulip):

I think it would be much easier to forget about the fact that the issue exists than about the fact that we need to refactor this eventually

std::Veetaha (Jan 23 2020 at 10:06, on Zulip):

I add issues to my bookmarks, so I won't forget about it)

std::Veetaha (Jan 23 2020 at 20:37, on Zulip):

Okay, so I created an issue in the rustc repo; I hope someone will answer.

Last update: Jun 07 2020 at 08:50UTC