Stream: t-compiler/rust-analyzer

Topic: autoimport weirdness


Florian Diebold (Mar 16 2021 at 20:36, on Zulip):

autoimport seems to want to import methods from trait impls? e.g. in current master, in autoderef.rs, go inside a function and type from_chalk. It proposes two autoimports, hir_def::ImplId and hir_def::TraitId. those aren't even traits :/

Florian Diebold (Mar 16 2021 at 20:37, on Zulip):

also, if I remove the Interner use in the same module, it can't autoimport it again

Florian Diebold (Mar 16 2021 at 20:39, on Zulip):

OTOH, if I type Interner in diagnostics.rs (still in hir_ty), it proposes crate::Interner _twice_?

Jonas Schievink [he/him] (Mar 16 2021 at 21:05, on Zulip):

I also saw some missing items today

Kirill Bulatov (Mar 16 2021 at 21:07, on Zulip):

For me, Interner appears if I type it fully, so this is still https://github.com/rust-analyzer/rust-analyzer/issues/7902 but just for a different crate.

It is surprisingly many fuzzy-matching non-qualified names for such words as interner, and our current limit (40) gets too many associated items before the proper results.

Kirill Bulatov (Mar 16 2021 at 21:13, on Zulip):

Weird and duplicated appearances are most probably bugs in import_assets.rs, but there's always a slight chance that trait resolution is somehow involved :smile:

Great findings though, I can dig the later closer to the end of the week.

Florian Diebold (Mar 17 2021 at 06:41, on Zulip):

hmm interesting, Interner appears in the autoderef function, but not in autoderef_by_trait (except for the chalk Interner)

Laurențiu (Mar 17 2021 at 07:36, on Zulip):

Can we make it "smart-cased" or case-sensitive to get fewer results?

Florian Diebold (Mar 17 2021 at 07:40, on Zulip):

I find it a bit suspicious that there would be so many results

Laurențiu (Mar 17 2021 at 07:41, on Zulip):

I was looking at the screenshot in https://github.com/rust-analyzer/rust-analyzer/issues/7902 where we suggest atomic_xor_acq for Arc. But that seems too fuzzy, even for arc.

Florian Diebold (Mar 17 2021 at 08:01, on Zulip):

yeah... personally I would actually be fine with just prefix matching, but another option would also be splitting names (on _ and capital letters) and matching on starting letters of words (e.g. IntelliJ's class search for java lets you match ThingFactory with tf), or just making sure that these kinds of matches always come first

Kirill Bulatov (Mar 17 2021 at 09:56, on Zulip):

Currently, the search is based on fst lookup which stores the strings with lowercased fully qualified names + an integer key to get the associated data.
fst requres the sorted input strings, so the semantic ordering is messed up greatly, but we get a fast fuzzy lookup in return.
Then we apply some semantics aware post-filtering on top of all data found, so it's a bit tricky to make something to always come first.

We can work on making some match scoring of the results: in fst, there's Levenstein distance available, but for "starting letters of words" it should be something more complex?
After we derive the scoring, we keep the score-sorted interemediate collection to hold the data and decide when the collection is filled enough for the search.
We can also split off the assoc items from the rest or add whatever ¢æ special logic bytes in fst to make the original queries better.
Not sure if that's all feasible to do before some benchmarks appear first.

I like the idea of a more clever search though (i.e. more sophisticated post-filtering), that can be relatively simple and safe to do.
If we take the capital letters case, I have no idea what a "smart-cased" and "prefix matching" means alas.
But imagined we could require the fst results to contain all capital letters from the input, or simply make all searches case-sensitive here and see how it works out for you.

Search limits are also simple to bump in the same items_locator.rs.
I've experienced bad timings when set it up to usize::MAX, so we cannot just turn it off, but maybe we should move it to settings after all?

Laurențiu (Mar 17 2021 at 10:07, on Zulip):

Smart case is doing a case-sensitive match (only) if there are upper-case letters in the input. Doing a better post-filtering sounds reasonable, I guess.

Florian Diebold (Mar 17 2021 at 10:22, on Zulip):

IMO we should first look into those suspicious issues I mentioned at the beginning (if we're indexing methods from trait impls, that's a huge amount of results we can just get rid of)

Florian Diebold (Mar 17 2021 at 10:25, on Zulip):

regarding 'first letters of words' matching, I'm pretty sure it should be possible to represent that as an FST, but we'd have to implement it ourselves I guess

matklad (Mar 17 2021 at 10:26, on Zulip):

it should be possible to represent that as an FST, but we'd have to implement it ourselves I guess

This probably wants to be a separate layer on top of FST. we want to apply the same logic for usual completion ranking.

Kirill Bulatov (Mar 17 2021 at 10:26, on Zulip):

Yes, the weird results + the smart casing look very interesting, I'll check it out later, as planned.

Kirill Bulatov (Mar 17 2021 at 10:27, on Zulip):

it should be possible to represent that as an FST, but we'd have to implement it ourselves I guess

In that case, I'd love to drag into that new FST the "current crates" of the project and remove the symbol lookup we have instead.

Florian Diebold (Mar 17 2021 at 10:31, on Zulip):

matklad said:

it should be possible to represent that as an FST, but we'd have to implement it ourselves I guess

This probably wants to be a separate layer on top of FST. we want to apply the same logic for usual completion ranking.

if we can construct an FST, we can probably use that to match completions as well. I think implementing it as an FST would be more performant and get rid of those "too many results, so the interesting ones were outside the search limit" problems

Florian Diebold (Mar 17 2021 at 10:33, on Zulip):

btw, I just saw that the fst crate says

The construction of Levenshtein automatons should be consider "proof of concept" quality. Namely, they do just enough to be correct. But they haven't had any effort put into them to be memory conscious.

:grimacing:

Florian Diebold (Mar 17 2021 at 17:17, on Zulip):

btw, as a positive side-note, autoimport completions now work nicely in Emacs (they needed some fixes in lsp-mode)

Kirill Bulatov (Mar 19 2021 at 16:25, on Zulip):

in current master, in autoderef.rs, go inside a function and type from_chalk.

Come to finally think of it: what would you expect to be completed in this case?
Afaik, even for associated functions, you have to specify some non-trait type to call the actual method?
So just completing from_chalk with a trait auto import will produce the code that does not compile, won't it?

What if we disallow such completions entirely?

Florian Diebold (Mar 19 2021 at 16:37, on Zulip):

there's a pub(crate) fn from_chalk in chalk.rs

Florian Diebold (Mar 19 2021 at 16:37, on Zulip):

all the trait methods are a red herring here ;)

Florian Diebold (Mar 19 2021 at 16:38, on Zulip):

so yeah, I'd expect only that one

Kirill Bulatov (Mar 19 2021 at 16:59, on Zulip):

I now understand what's happening with the non-trait suggestions: import_map is indeed fine, indexing only trait-related assoc items.
But from_chalk is the part of the current project, it's not held in the import_map: instead, it's held in the text index that's searched separately (i.e. project's imports and external crates' imports are searched separately).
There, for from_chalk, plenty of assoc items returned, including the non-trait ones.

Then, after fiddling with flyimport completion with path prefixes, I've forgotten about this and let such assoc items slip into and allowed such completions in general.

Kirill Bulatov (Mar 19 2021 at 23:09, on Zulip):

On the other hand, there can be associated items and imports with zero traits involved, so the way we do it in import_map isn't also 100% correct?

mod foo {
    pub mod bar {
        pub struct Item;

        impl Item {
            pub const TEST_ASSOC: usize = 3;
        }
    }
}

use crate::foo::bar;

fn main() {
    let _thee = bar::Item::TEST_ASSOC;
}
Jonas Schievink [he/him] (Mar 19 2021 at 23:21, on Zulip):

these are always accessed through the type, I believe

Kirill Bulatov (Mar 20 2021 at 23:36, on Zulip):

Just to sum things up with https://github.com/rust-analyzer/rust-analyzer/pull/8123

Last update: Jul 24 2021 at 20:00UTC