Stream: t-compiler

Topic: Implementing `confusable_idents`


Charles Lew (Jan 04 2020 at 12:49, on Zulip):

So I'll start looking into how to implement `confusable_idents` and `mixed_script_confusables`, starting with whichever is easier.

Charles Lew (Jan 04 2020 at 12:51, on Zulip):

One immediate question I came up with is that these two lints are kind of "accumulative": each identifier encountered is compared with everything seen before.

Charles Lew (Jan 04 2020 at 12:52, on Zulip):

Since there are ongoing "parallel rustc" efforts, this makes me wonder how I would store the previous state.

Charles Lew (Jan 04 2020 at 12:55, on Zulip):

I think I need some guidance here again... @centril @Manish Goregaokar @Zoxc
https://github.com/rust-lang/rust/issues/55467

Manish Goregaokar (Jan 04 2020 at 12:55, on Zulip):

Yeah no I wouldn't touch those yet

Manish Goregaokar (Jan 04 2020 at 12:56, on Zulip):

@Esteban Küber said he had a wip branch for some of this

Manish Goregaokar (Jan 04 2020 at 12:56, on Zulip):

Mixed script confusables is one you might want to try; the RFC suggests a way to make it super efficient

Manish Goregaokar (Jan 04 2020 at 12:56, on Zulip):

I'd just leave the confusables lint for now

Charles Lew (Jan 04 2020 at 12:58, on Zulip):

Ok. I think I'll just find something else to play with for now. :slight_smile:

Charles Lew (Jan 04 2020 at 12:59, on Zulip):

I have also read some code related to the "Adjustments to "bad style" lints" item

Charles Lew (Jan 04 2020 at 12:59, on Zulip):

I think there's nothing that needs to be changed, if I've read it correctly.

Charles Lew (Jan 04 2020 at 13:01, on Zulip):

Maybe we need to review the progress and tick some checkboxes in the tracking issue at some point.

Charles Lew (Jan 04 2020 at 13:01, on Zulip):

A fast way to implement this is to compute skeleton for each identifier once and place the result in a hashmap as a key. If one tries to insert a key that already exists check if the two identifiers differ from each other. If so report the two confusable identifiers.
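
A rough sketch of that hashmap approach, in case it helps the discussion. Everything here is an assumption for illustration: `toy_skeleton` is a hypothetical stand-in that only folds a handful of Cyrillic letters onto Latin look-alikes (the real skeleton function is defined by Unicode UTS #39 and needs the full confusables data), and `find_confusables` is a made-up name, not an existing rustc function.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Hypothetical stand-in for the UTS #39 skeleton function.
/// It only maps a few Cyrillic letters to their Latin look-alikes.
fn toy_skeleton(ident: &str) -> String {
    ident
        .chars()
        .map(|c| match c {
            '\u{0430}' => 'a', // CYRILLIC SMALL LETTER A
            '\u{0435}' => 'e', // CYRILLIC SMALL LETTER IE
            '\u{043E}' => 'o', // CYRILLIC SMALL LETTER O
            '\u{0440}' => 'p', // CYRILLIC SMALL LETTER ER
            '\u{0441}' => 'c', // CYRILLIC SMALL LETTER ES
            other => other,
        })
        .collect()
}

/// Computes the skeleton of each identifier once and uses it as a map key.
/// When a key already exists and the stored identifier differs from the
/// current one, the pair (first seen, current) is reported.
fn find_confusables<'a>(idents: &[&'a str]) -> Vec<(&'a str, &'a str)> {
    let mut seen: HashMap<String, &'a str> = HashMap::new();
    let mut reports = Vec::new();
    for &ident in idents {
        match seen.entry(toy_skeleton(ident)) {
            Entry::Occupied(entry) => {
                let first = *entry.get();
                if first != ident {
                    reports.push((first, ident));
                }
            }
            Entry::Vacant(entry) => {
                entry.insert(ident);
            }
        }
    }
    reports
}
```

Calling `find_confusables(&["scope", "s\u{0441}ope"])` would report the pair, since both spellings share the skeleton `scope`. Note that the result depends on the order in which identifiers are fed in, which is exactly the state problem mentioned above.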

Charles Lew (Jan 04 2020 at 13:03, on Zulip):

So it's based on a hashmap, and it still has parallelism issues. Maybe I'll wait for @Esteban Küber at this point.

Esteban Küber (Jan 04 2020 at 15:03, on Zulip):

I don't think the parser is on the critical path for compilation speed in any existing project that isn't heavily nested. I have a few with generated code that run to hundreds of thousands of lines, and the problem there comes after the AST is created

Zoxc (Jan 04 2020 at 15:04, on Zulip):

Can't this just walk the entire crate at a later point and look at all identifiers instead of carrying state around?

Esteban Küber (Jan 04 2020 at 15:04, on Zulip):

And for heavy nesting, the problem is blowing the stack on deep recursion

Esteban Küber (Jan 04 2020 at 15:05, on Zulip):

We can, but we'd also want to use this for "ident not found" suggestions

Esteban Küber (Jan 04 2020 at 15:08, on Zulip):

We do have some big rwlocks already, right?

Esteban Küber (Jan 04 2020 at 15:09, on Zulip):

That might not go away anytime soon

Zoxc (Jan 04 2020 at 15:10, on Zulip):

When do ident not found suggestions happen? I assume we can collect all idents after macro expansion.

Zoxc (Jan 04 2020 at 15:11, on Zulip):

You can't really use locks for this kind of state since it would be non-deterministic and also not tracked by incremental compilation.
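
One way to picture the "collect everything, then check in one pass" idea without shared locked state is sketched below. This is only an illustration under assumptions: identifiers are modeled as hypothetical `(position, name)` pairs, `toy_skeleton` is a toy stand-in for the real UTS #39 skeleton function, and `lint_confusables` is an invented name. It does not model incremental compilation tracking at all.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Hypothetical stand-in for the UTS #39 skeleton function.
fn toy_skeleton(ident: &str) -> String {
    ident
        .chars()
        .map(|c| match c {
            '\u{0430}' => 'a', // CYRILLIC SMALL LETTER A
            '\u{043E}' => 'o', // CYRILLIC SMALL LETTER O
            '\u{0441}' => 'c', // CYRILLIC SMALL LETTER ES
            other => other,
        })
        .collect()
}

/// Takes identifiers gathered in any order (for example, by several workers),
/// sorts them by source position, and only then runs the skeleton check.
/// Because the check runs over the sorted list, the reports do not depend on
/// the order in which the identifiers were collected.
fn lint_confusables<'a>(mut idents: Vec<(usize, &'a str)>) -> Vec<(&'a str, &'a str)> {
    // Sort by position first, then by name as a tie-breaker.
    idents.sort_unstable();
    let mut first_seen: HashMap<String, &'a str> = HashMap::new();
    let mut reports = Vec::new();
    for (_pos, name) in idents {
        match first_seen.entry(toy_skeleton(name)) {
            Entry::Occupied(entry) => {
                let first = *entry.get();
                if first != name {
                    reports.push((first, name));
                }
            }
            Entry::Vacant(entry) => {
                entry.insert(name);
            }
        }
    }
    reports
}
```

Feeding the same set of `(position, name)` pairs in a different order gives the same reports, which is the property a lock-protected map filled concurrently would not guarantee.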

bjorn3 (Jan 04 2020 at 19:08, on Zulip):

> When do ident not found suggestions happen? I assume we can collect all idents after macro expansion.

Then you will miss the identifiers of expanded macros.

Charles Lew (Jan 05 2020 at 16:19, on Zulip):

Feel free to ping me if there's any actionable item on my side. I'm happy to help push this implementation work forward.

Last update: May 26 2020 at 09:55UTC