Stream: t-compiler/rust-analyzer

Topic: VFS eagerly loads all .rs files in the project directory


Joshua Nelson (Jan 18 2021 at 18:40, on Zulip):

is there a way to tell RA not to look in hidden folders?

Joshua Nelson (Jan 18 2021 at 18:40, on Zulip):

docs.rs really does need those folders to run

Joshua Nelson (Jan 18 2021 at 18:40, on Zulip):

they aren't in cargo.toml, rust-analyzer is picking them up on its own somehow

Jonas Schievink [he/him] (Jan 18 2021 at 18:42, on Zulip):

I see it's in workspace.exclude already, perhaps we should respect that

Joshua Nelson (Jan 18 2021 at 18:44, on Zulip):

we had to patch tera to work around this, it was taking multiple minutes to startup because globwalk would recurse into ignored directories

Joshua Nelson (Jan 18 2021 at 18:46, on Zulip):

https://github.com/rust-lang/docs.rs/pull/861/files#r446669961

matklad (Jan 18 2021 at 18:46, on Zulip):

We have support for that internallyDirectories::exclude , but the plumbing is missing

Joshua Nelson (Jan 18 2021 at 18:52, on Zulip):

I'm interested in fixing this if you can give me some instructions on getting started :)

Jonas Schievink [he/him] (Jan 18 2021 at 18:56, on Zulip):

(looking)

Joshua Nelson (Jan 18 2021 at 18:57, on Zulip):

/me has all today free because of the holiday

Jonas Schievink [he/him] (Jan 18 2021 at 18:59, on Zulip):

I think you'll have to add a workspace-wide exclude field to CargoWorkspace.

Then this code needs to be changed to append the workspace's exclude list to dirs.exclude.

Jonas Schievink [he/him] (Jan 18 2021 at 18:59, on Zulip):

Some conversion of path types might be needed

Joshua Nelson (Jan 18 2021 at 22:17, on Zulip):

Jonas Schievink said:

I think you'll have to add a workspace-wide exclude field to CargoWorkspace.

Then this code needs to be changed to append the workspace's exclude list to dirs.exclude.

@Jonas Schievink I'm confused - isn't that already happening here? https://github.com/rust-analyzer/rust-analyzer/blob/08efb8a94360735f505699021eaa901464c76f4a/crates/rust-analyzer/src/reload.rs#L335
Or is root.exclude different from the workspace?

Jonas Schievink [he/him] (Jan 18 2021 at 22:18, on Zulip):

root is from .to_roots()

Jonas Schievink [he/him] (Jan 18 2021 at 22:18, on Zulip):

It's per-package I think

Joshua Nelson (Jan 18 2021 at 22:18, on Zulip):

oh I see, it doesn't include the workspace itself

Joshua Nelson (Jan 18 2021 at 22:18, on Zulip):

ok, makes sense

Joshua Nelson (Jan 18 2021 at 22:19, on Zulip):

should I handle this for JSON projects too? Or only ProjectWorkspace::Cargo?

Jonas Schievink [he/him] (Jan 18 2021 at 22:20, on Zulip):

just Cargo should be fine for now

Jonas Schievink [he/him] (Jan 18 2021 at 22:20, on Zulip):

rust-project.json can already specify per-package include and exclude lists

Jonas Schievink [he/him] (Jan 18 2021 at 22:20, on Zulip):

so external build systems could just use that

Joshua Nelson (Jan 18 2021 at 22:22, on Zulip):

should I use PathBuf or AbsPathBuf for the exclude?

Jonas Schievink [he/him] (Jan 18 2021 at 22:25, on Zulip):

AbsPathBuf seems to make sense, the other code paths also use that

Joshua Nelson (Jan 18 2021 at 22:26, on Zulip):

hmm, I'm confused how I actually get this info from cargo_metadata - I don't see exclude on either Metadata or Package

Joshua Nelson (Jan 18 2021 at 22:26, on Zulip):

I do see workspace_members, maybe I could convince RA to use that directly?

Jonas Schievink [he/him] (Jan 18 2021 at 22:27, on Zulip):

hmm, perhaps Cargo doesn't output the relevant data?

Joshua Nelson (Jan 18 2021 at 22:27, on Zulip):

let me start by seeing what packages cargo-metadata is actually picking up

Jonas Schievink [he/him] (Jan 18 2021 at 22:27, on Zulip):

I suppose it is only meant to help Cargo with workspace discovery

Jonas Schievink [he/him] (Jan 18 2021 at 22:28, on Zulip):

cargo metadata should respect those settings already

Joshua Nelson (Jan 18 2021 at 22:29, on Zulip):

ok something is very fishy because cargo metadata reports 412 packages whether .rustwide exists or not

Joshua Nelson (Jan 18 2021 at 22:29, on Zulip):

so somehow RA is getting packages from a source other than cargo metadata

Joshua Nelson (Jan 18 2021 at 22:30, on Zulip):

I wish I could diff the memory usage between two RA processes somehow :/

Jonas Schievink [he/him] (Jan 18 2021 at 22:31, on Zulip):

diff /proc/<pid1>/mem /proc/pid2/mem

Jonas Schievink [he/him] (Jan 18 2021 at 22:32, on Zulip):

I think the problem here is that r-a subscribes to every directory containing a Cargo project in the workspace

Jonas Schievink [he/him] (Jan 18 2021 at 22:32, on Zulip):

Because we want to listen to file changes, and load all .rs files into the VFS

Jonas Schievink [he/him] (Jan 18 2021 at 22:33, on Zulip):

cargo metadata doesn't tell us which files are part of the projects (it can't)

Jonas Schievink [he/him] (Jan 18 2021 at 22:33, on Zulip):

so we have to assume that all .rs files in the project dir are part of it

Jonas Schievink [he/him] (Jan 18 2021 at 22:33, on Zulip):

and even that isn't enough, since it fails to account for complex #[path] usage, which makes std::arch fail to resolve

Joshua Nelson (Jan 18 2021 at 22:34, on Zulip):

can you get it by walking the mod tree maybe?

Joshua Nelson (Jan 18 2021 at 22:34, on Zulip):

maybe I should read docs/dev

Jonas Schievink [he/him] (Jan 18 2021 at 22:35, on Zulip):

yeah, I'd love to be able to just start with lib.rs and gradually discover files, but I think that conflicts with other goals

Jonas Schievink [he/him] (Jan 18 2021 at 22:36, on Zulip):

huh this might also mean that include_str!/include_bytes! with any non-.rs file won't ever work

Joshua Nelson (Jan 18 2021 at 22:36, on Zulip):

Jonas Schievink said:

huh this might also mean that include_str!/include_bytes! with any non-.rs file won't ever work

without walking the mod tree, you mean?

Jonas Schievink [he/him] (Jan 18 2021 at 22:36, on Zulip):

with the current setup, yes

Joshua Nelson (Jan 18 2021 at 22:37, on Zulip):

let me try this :D

Joshua Nelson (Jan 18 2021 at 22:38, on Zulip):

hmm actually I don't know how to test - RA knows that include_str is always a string even if it doesn't know the contents

Joshua Nelson (Jan 18 2021 at 22:39, on Zulip):

image.png

Joshua Nelson (Jan 18 2021 at 22:41, on Zulip):

so what I'm hearing is this requires major architectural changes

Joshua Nelson (Jan 18 2021 at 22:41, on Zulip):

hey @matklad :D

Jonas Schievink [he/him] (Jan 18 2021 at 22:41, on Zulip):

the actual contents don't matter much in practice, which is good for us

Jonas Schievink [he/him] (Jan 18 2021 at 22:41, on Zulip):

many built-in derives are just dummys

Joshua Nelson (Jan 18 2021 at 22:41, on Zulip):

Joshua Nelson said:

hey matklad :D

how do you feel about walking about the mod tree instead of picking up every .rs file in the project directory? https://rust-lang.zulipchat.com/#narrow/stream/185405-t-compiler.2Fwg-rls-2.2E0/topic/running.20RA.20on.20docs.2Ers/near/223169548

Joshua Nelson (Jan 18 2021 at 22:42, on Zulip):

hmm I wonder if rustc could tell us which files are part of the project

Joshua Nelson (Jan 18 2021 at 22:42, on Zulip):

isn't there --emit=dep-info or something?

Joshua Nelson (Jan 18 2021 at 22:43, on Zulip):

oh well look at that:

> rustc --emit=dep-info src/main.rs
> cat main.d
main.d: src/main.rs src/x.txt

src/main.rs:
src/x.txt:
Joshua Nelson (Jan 18 2021 at 22:44, on Zulip):

seems annoying to parse though

Jonas Schievink [he/him] (Jan 18 2021 at 22:44, on Zulip):

how would you load files from within compiler crates though?

Jonas Schievink [he/him] (Jan 18 2021 at 22:44, on Zulip):

you can't just load from disk, you might have to ask the client

Joshua Nelson (Jan 18 2021 at 22:45, on Zulip):

I don't follow

Joshua Nelson (Jan 18 2021 at 22:45, on Zulip):

when would a file not be on disk?

Jonas Schievink [he/him] (Jan 18 2021 at 22:45, on Zulip):

when it's being edited

Jonas Schievink [he/him] (Jan 18 2021 at 22:45, on Zulip):

rust-analyzer has to look at the in-editor file, not what's on disk

Joshua Nelson (Jan 18 2021 at 22:45, on Zulip):

oh oof

Joshua Nelson (Jan 18 2021 at 22:45, on Zulip):

can rust-analyzer associate the in-memory file with the filename somehow?

Jonas Schievink [he/him] (Jan 18 2021 at 22:46, on Zulip):

I was thinking it'd be possible to basically "register interest" in a path from the compiler crates, and fetch those files from outside salsa

Jonas Schievink [he/him] (Jan 18 2021 at 22:46, on Zulip):

And feed their contents back in via FileTextQuery, like we do now

Jonas Schievink [he/him] (Jan 18 2021 at 22:46, on Zulip):

However, this is cursed

Joshua Nelson (Jan 18 2021 at 22:47, on Zulip):

what makes that cursed?

Jonas Schievink [he/him] (Jan 18 2021 at 22:47, on Zulip):

At least it doesn't feel right to me to use an external fixed-point loop to drive the whole compiler

Joshua Nelson (Jan 18 2021 at 22:47, on Zulip):

:shrug: that's how mod trees work though

Jonas Schievink [he/him] (Jan 18 2021 at 22:47, on Zulip):

it'd probably also be slow

Joshua Nelson (Jan 18 2021 at 22:47, on Zulip):

theoretically I think you could have them go on forever if you have

// x.rs
#[path = "x.rs"]
mod x;

or something

Joshua Nelson (Jan 18 2021 at 22:48, on Zulip):

yup:

error: circular modules: recurse.rs -> recurse.rs
 --> recurse.rs:3:1
  |
3 | mod x;
  | ^^^^^^
Joshua Nelson (Jan 18 2021 at 22:50, on Zulip):

here's another idea: rather than reading every .rs file into memory, could you just register that it exists? and load the contents lazily?

Jonas Schievink [he/him] (Jan 18 2021 at 22:51, on Zulip):

Jonas Schievink said:

I was thinking it'd be possible to basically "register interest" in a path from the compiler crates, and fetch those files from outside salsa

that's what I meant to say with this

Joshua Nelson (Jan 18 2021 at 22:52, on Zulip):

neat, that sounds easier than redesigning the whole abstraction

Joshua Nelson (Jan 18 2021 at 22:52, on Zulip):

where does the actual file loading happen?

Joshua Nelson (Jan 18 2021 at 22:52, on Zulip):

vfs?

Joshua Nelson (Jan 18 2021 at 22:54, on Zulip):

oh no the contents is already an Option, that means we'll have Option<Option> :laughing:

Jonas Schievink [he/him] (Jan 18 2021 at 22:58, on Zulip):

vfs::loader::Handle is the file loading interface

Joshua Nelson (Jan 18 2021 at 22:59, on Zulip):

ok, load_sync is the thing actually reading files off disk

Jonas Schievink [he/him] (Jan 18 2021 at 22:59, on Zulip):

well, the one for on-disk files at least

Joshua Nelson (Jan 18 2021 at 22:59, on Zulip):

which is implemented in vfs-notify

Joshua Nelson (Jan 18 2021 at 23:00, on Zulip):
                if !mem_docs.contains_key(&vfs_path) {
                    let contents = loader.handle.load_sync(path);
                    vfs.set_file_contents(vfs_path.clone(), contents);
                }

does this need to call set_file_contents immediately?

Joshua Nelson (Jan 18 2021 at 23:01, on Zulip):

hmm that's actually called in not many places

crates/vfs/src/lib.rs
5://! [`set_file_contents`]. All changes to VFS are logged, and can be retrieved via
35://! [`set_file_contents`]: Vfs::set_file_contents
155:    pub fn set_file_contents(&mut self, path: VfsPath, contents: Option<Vec<u8>>) -> bool {

crates/rust-analyzer/src/reload.rs
250:                    vfs.set_file_contents(vfs_path.clone(), contents);

crates/rust-analyzer/src/main_loop.rs
275:                                    vfs.set_file_contents(path, contents);
536:                        .set_file_contents(path, Some(params.text_document.text.into_bytes()));
568:                    vfs.set_file_contents(path.clone(), Some(text.into_bytes()));

crates/rust-analyzer/src/cli/load_cargo.rs
45:        vfs.set_file_contents(path.clone(), contents);
77:                    vfs.set_file_contents(path.into(), contents);
Joshua Nelson (Jan 18 2021 at 23:02, on Zulip):

so my idea is that instead of calling set_file_contents immediately, I add a new set_file_state which only stores whether the file exists or not

Joshua Nelson (Jan 18 2021 at 23:02, on Zulip):

and then get and get_mut load it lazily

Joshua Nelson (Jan 18 2021 at 23:02, on Zulip):

that does require get to take &mut self though

Joshua Nelson (Jan 18 2021 at 23:02, on Zulip):

or to do RefCell dances

Joshua Nelson (Jan 18 2021 at 23:17, on Zulip):

hmm, running vfs.iter() requires all the files to be loaded eagerly so RA can tell whether they exist or not

Joshua Nelson (Jan 18 2021 at 23:17, on Zulip):

why does .iter() need to skip deleted files? I see it's only used by FileSetConfig::partition

Joshua Nelson (Jan 18 2021 at 23:21, on Zulip):

wow your test suite runs fast

Joshua Nelson (Jan 18 2021 at 23:21, on Zulip):

this is super impressive

Joshua Nelson (Jan 18 2021 at 23:23, on Zulip):

wish me luck :laughing:

diff --git a/crates/vfs/src/lib.rs b/crates/vfs/src/lib.rs
index f84b9a7b5..d85288780 100644
--- a/crates/vfs/src/lib.rs
+++ b/crates/vfs/src/lib.rs
@@ -153,7 +153,7 @@ impl Vfs {
     pub fn iter(&self) -> impl Iterator<Item = (FileId, &VfsPath)> + '_ {
         (0..self.data.len())
             .map(|it| FileId(it as u32))
-            .filter(move |&file_id| self.get(file_id).exists())
+            //.filter(move |&file_id| self.get(file_id).exists())
             .map(move |file_id| {
                 let path = self.interner.lookup(file_id);
                 (file_id, path)
Joshua Nelson (Jan 18 2021 at 23:25, on Zulip):

oh hmm it actually passes the test suite with that change ??

Joshua Nelson (Jan 18 2021 at 23:25, on Zulip):

now I'm even more confused

Joshua Nelson (Jan 18 2021 at 23:30, on Zulip):

ok so I have another question: file_id also filters for deleted files. But files can be deleted at any time, right? So doesn't the caller still need to handle the case where the file was deleted after calling file_id?

Joshua Nelson (Jan 18 2021 at 23:30, on Zulip):

/me goes to delete that call too

Joshua Nelson (Jan 18 2021 at 23:32, on Zulip):

/me is worried that there are just no tests for deleted files

Joshua Nelson (Jan 18 2021 at 23:32, on Zulip):

> fd test | xargs rg delete seems to support this conclusion :(

Lukas Wirth (Jan 18 2021 at 23:38, on Zulip):

I'm not too sure on this but I believe that file-related testing isn't too prevalent in the codebase

Lukas Wirth (Jan 18 2021 at 23:38, on Zulip):

At least the VFS crate doesn't seem to have a lot of tests :sweat_smile:

Joshua Nelson (Jan 18 2021 at 23:39, on Zulip):

does RA have integration tests? or just unit tests?

Lukas Wirth (Jan 18 2021 at 23:44, on Zulip):

I'm not sure tbh, not that I can immediately think of. But I also haven't peeked at all the crates properly yet so far.

Lukas Wirth (Jan 18 2021 at 23:44, on Zulip):

Ah there seem to be a few in the main rust-analyzer crate(in crates\rust-analyzer\tests\rust-analyzer)

Joshua Nelson (Jan 18 2021 at 23:57, on Zulip):

ok and the last obstacle is somehow I need to get a loader into Vfs

Joshua Nelson (Jan 18 2021 at 23:58, on Zulip):

my first idea was to move it from GlobalState to the Vfs struct, and make it public so it's still accessible, but that breaks because Handle is only in the top-level rust-analyzer crate:

error[E0412]: cannot find type `Handle` in this scope
  --> crates/vfs/src/lib.rs:69:17
   |
69 |     pub loader: Handle<Box<dyn loader::Handle>, Receiver<loader::Message>>,
   |                 ^^^^^^ not found in this scope

(loader::Handle is something different)

Joshua Nelson (Jan 18 2021 at 23:59, on Zulip):

hmm I guess I don't need the whole loader, just the handle

Joshua Nelson (Jan 19 2021 at 00:06, on Zulip):

oh boooo the borrow checker doesn't understand the borrow is released if I return early

error[E0499]: cannot borrow `*self` as mutable more than once at a time
   --> crates/vfs/src/lib.rs:240:13
    |
233 |     fn get_or_load(&mut self, file_id: FileId) -> &mut FileContents {
    |                    - let's call the lifetime of this reference `'1`
234 |         if let Some(contents) = &mut self.data[file_id.0 as usize] {
    |                                      --------- first mutable borrow occurs here
235 |             return contents;
    |                    -------- returning this value requires that `self.data` is borrowed for `'1`
...
240 |             self.set_id_contents(file_id, contents);
    |             ^^^^ second mutable borrow occurs here
Lukas Wirth (Jan 19 2021 at 00:08, on Zulip):

polonius when will you save us

Lukas Wirth (Jan 19 2021 at 00:09, on Zulip):

Iirc there is a comment somewhere in RA that annotates a double hashmap index cause of the same borrowck problem :big_smile:

Joshua Nelson (Jan 19 2021 at 00:10, on Zulip):

well the difference is this has to do computation in the middle :/ so it can't hold the same borrow the whole time

Joshua Nelson (Jan 19 2021 at 00:10, on Zulip):

(the computation being set_id_contents)

Joshua Nelson (Jan 19 2021 at 00:10, on Zulip):

wtf why does this work in playground https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=08b05ad9f9d41d93d75b57b1285f4309

Joshua Nelson (Jan 19 2021 at 00:12, on Zulip):

oh apparently the borrow checker only runs if you don't have unreachable code

Joshua Nelson (Jan 19 2021 at 00:12, on Zulip):

that seems silly

Joshua Nelson (Jan 19 2021 at 00:12, on Zulip):

anyway, mcve: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=4dafaf60879fd4ad9886cf56010b51f2

Lukas Wirth (Jan 19 2021 at 00:17, on Zulip):

I don't think you can get around doing a check first whether its none and then having to index a second time :sad:

Joshua Nelson (Jan 19 2021 at 00:17, on Zulip):

oh well that's not bad

Joshua Nelson (Jan 19 2021 at 00:54, on Zulip):

well this turned into an absolute mess but uhhh it does compile

Joshua Nelson (Jan 19 2021 at 02:02, on Zulip):

https://github.com/rust-analyzer/rust-analyzer/pull/7344

matklad (Jan 19 2021 at 08:41, on Zulip):

I don't think we should do lazy loading, that'd be XY problem

matklad (Jan 19 2021 at 08:43, on Zulip):

(we need to add lazy loading some day, but not today, as this won't help this issue --- we'll just lazy-load files when indexing)

matklad (Jan 19 2021 at 08:45, on Zulip):

Instead, we should add files_excludeDirs: Vec<PathBuf> = [], in config.rs

Joshua Nelson (Jan 19 2021 at 14:36, on Zulip):

@matklad right but the exclude info is not available from cargo metadata. It only gives the packages themselves, not the info it used to decide on the packages

matklad (Jan 19 2021 at 14:37, on Zulip):

We need to support exclude in settings.json anyway

matklad (Jan 19 2021 at 14:38, on Zulip):

so we might as well do this now as the simplest thing for the problem at hand

Joshua Nelson (Jan 19 2021 at 14:38, on Zulip):

Hmm, ok. I'd rather configure it twice than not be able to use rust-analyzer

Jonas Schievink [he/him] (Jan 26 2021 at 19:41, on Zulip):

This should help :tada: https://github.com/rust-analyzer/rust-analyzer/pull/7451

Chris B (May 24 2021 at 18:42, on Zulip):

Has there been any movement on getting excludeDirs to work? The github issue seems to be open still, and I haven't been able to figure it out. I tried changing this to explicitly add global_excludes to dirs.exclude, but then I just get a wall of file with a cargo diagnostic was not in vfs errors which takes just as long to process, so I guess flycheck is still looking for the directory I'm trying to ignore.

matklad (May 24 2021 at 18:57, on Zulip):

No, no progerss here. I think it should be relatively striaght forward to fix for someone unfamiliar with the code. Let me add some pointers.

matklad (May 24 2021 at 19:02, on Zulip):

https://github.com/rust-analyzer/rust-analyzer/issues/7755#issuecomment-847266325

Chris B (May 24 2021 at 20:51, on Zulip):

Thanks. I got the first half working, but the FileSystemWatcher thing is specified by glob patterns that apparently won't ever support negation. In the cases where the glob pattern encompasses the directories the user wants to ignore, are you okay with walking the package root and sending glob patterns that work around the one to be ignored?

matklad (May 25 2021 at 09:12, on Zulip):

We first need to figure out if FIleSystemWatcher is relevant at all -- I think that it should obey VS Code's exclude out of the box, so we don't need any changes there

matklad (May 25 2021 at 09:13, on Zulip):

Basically, we need to test, if we set-up files.exclude and rust-analyzer.exclude, why are we not ignoring files? Is this because VFS does traversal? Or is it because watcher sends notifications to us?

Chris B (May 25 2021 at 13:31, on Zulip):

In my case the main issue was here; if you want to exclude a subdirectory of an included directory, the current logic won't catch it. It should be something like:

    if dirs.include.iter().any(|incl| incl.starts_with(excl) || excl.starts_with(incl)) {

The other thing was from Flycheck; vscode makes it look like the default checkOnSave command is just check, but it actually includes all-targets. I couldn't understand why RA was picking up a mod annotated with #[cfg(test)] just using cargo check, but that explains it. Users might find it helpful if the default in package.json was something that indicated "your command here" since the actual one invoked includes the manfiest and is project-specific.

matklad (May 25 2021 at 14:36, on Zulip):

Hm, yeah, the first ones sounds like a bug! Could you send a PR?

Chris B (May 25 2021 at 15:17, on Zulip):

Sure thing. https://github.com/rust-analyzer/rust-analyzer/pull/8994

Last update: Jul 24 2021 at 20:15UTC