Stream: t-compiler/wg-rls-2.0

Topic: rust-analyzer#734 test_vfs_works on OS X


pnkfelix (Feb 12 2019 at 11:25, on Zulip):

Hey @matklad , do you know offhand if rust-analyzer has known issues on OS X?

pnkfelix (Feb 12 2019 at 11:26, on Zulip):

I'm doing cargo test and am hitting Timeout errors during test_vfs_works, which may or may not be related to rust-analyzer#734

matklad (Feb 12 2019 at 11:26, on Zulip):

I don't know, but this seems unlikely. That is, I'd say its "90% works out of the box", but some weired shenanigans are not excluded.

matklad (Feb 12 2019 at 11:26, on Zulip):

aaaa

matklad (Feb 12 2019 at 11:26, on Zulip):

that is a known issue

pnkfelix (Feb 12 2019 at 11:27, on Zulip):

So I shouldn't bother with cargo test for now?

matklad (Feb 12 2019 at 11:27, on Zulip):

https://github.com/rust-analyzer/rust-analyzer/issues/734

matklad (Feb 12 2019 at 11:27, on Zulip):

Well, it's best to fix the issue, or to disable these tests on macos

pnkfelix (Feb 12 2019 at 11:27, on Zulip):

I will note that my stack trace points to a different line number, I think...

pnkfelix (Feb 12 2019 at 11:27, on Zulip):

Yeah but I'm not even 100% sure what the issue is right now

pnkfelix (Feb 12 2019 at 11:28, on Zulip):

they say that they are getting to vfs.rs:82

pnkfelix (Feb 12 2019 at 11:28, on Zulip):

I'm getting it on vfs.rs:71

pnkfelix (Feb 12 2019 at 11:28, on Zulip):

(but maybe that is an artifact of changes to the code since that bug was filed.)

matklad (Feb 12 2019 at 11:29, on Zulip):

this test is about watching file system for changes. We hit many curious issues in this subsystem and upstreamed a couple of fixed to the notify library we use

pnkfelix (Feb 12 2019 at 11:29, on Zulip):

in particular I am hitting it on https://github.com/rust-analyzer/rust-analyzer/blob/master/crates/ra_vfs/tests/vfs.rs#L71

davidtwco (Feb 12 2019 at 11:30, on Zulip):

Reported line in repo at time of issue filing: https://github.com/rust-analyzer/rust-analyzer/blob/55408081b05f79947f7ce4692b31be584aef4657/crates/ra_vfs/tests/vfs.rs#L82

pnkfelix (Feb 12 2019 at 11:31, on Zulip):

ah okay so it is the same line then

matklad (Feb 12 2019 at 11:31, on Zulip):

Two thoughs:

pnkfelix (Feb 12 2019 at 11:32, on Zulip):

I increased timeout to 10 seconds earlier. Didn't seem to help, so I didn't try larger values. But I can give it a shot

matklad (Feb 12 2019 at 11:32, on Zulip):

My hypothesis is (from a similar bug in Cargo) that MacOS misses modifications to a file if they withing the same seconds

pnkfelix (Feb 12 2019 at 11:32, on Zulip):

though maybe I'll try adding the sleep call first

matklad (Feb 12 2019 at 11:32, on Zulip):

the sleep call before write could help with that I think

pnkfelix (Feb 12 2019 at 11:33, on Zulip):

My hypothesis is (from a similar bug in Cargo) that MacOS misses modifications to a file if they withing the same seconds

Hmm interesting. I recall various timestamp issues when I worked on tango; I guess I had hoped that a notification system wouldn't fall victim to similar issues...

pnkfelix (Feb 12 2019 at 11:35, on Zulip):

grr sleep did not fix things either. Hmm.

matklad (Feb 12 2019 at 11:35, on Zulip):

yeah, I wouldn't call our file watching service reliable, it has issues.... Though luckily its not that important: most change notifications go via LSP, which works great. VFS watching is for cases like git checkokut other-branch.

matklad (Feb 12 2019 at 11:36, on Zulip):

Given ^, I am fine with #[cfg_attr(macos, ignore)] // #734

pnkfelix (Feb 12 2019 at 11:37, on Zulip):

(This is of course a schrodinger's-cat style bug where if I add instrumentation to try to learn more about the actual timestamps I might disrupt the replication of the bug itself...)

matklad (Feb 12 2019 at 11:37, on Zulip):

well, if adding timestamps fixes the test....

pnkfelix (Feb 12 2019 at 11:38, on Zulip):

I'll poke at this a tiny bit more before resorting to cfg-ignoring it

matklad (Feb 12 2019 at 11:39, on Zulip):

It might be useful to stick dbg!(event) here: https://github.com/rust-analyzer/rust-analyzer/blob/master/crates/ra_vfs/src/io.rs#L151

matklad (Feb 12 2019 at 11:39, on Zulip):

that would be even we get from the library.

pnkfelix (Feb 12 2019 at 11:39, on Zulip):

Isn't that going to emit output to stdout/stderr? That's the sort of thing I was assuming would disrupt the bug

pnkfelix (Feb 12 2019 at 11:40, on Zulip):

I was thinking i might emit data to a separate (in-memory) log that could be printed after the bug was observed

pnkfelix (Feb 12 2019 at 11:40, on Zulip):

but I guess it doesn't hurt to try adding the dbg! call and see what happens, if the bug continues to reproduce then no harm done

matklad (Feb 12 2019 at 11:41, on Zulip):

If sleeps don't disrupt the bug, there's a change that dbg won't as well, yeah

pnkfelix (Feb 12 2019 at 11:42, on Zulip):

(look at that, it does indeed continue to reproduce. great!)

pnkfelix (Feb 12 2019 at 11:42, on Zulip):

/me is going to run and get lunch before its all gone. I'll be back in a 10 or 15 minuts

matklad (Feb 12 2019 at 11:46, on Zulip):

https://github.com/passcod/notify/issues/150 seems vaguely related

matklad (Feb 12 2019 at 11:57, on Zulip):

This means that if you open a file, write to it, close the file, and repeat, you will receive a single event _unless_ there is some delay between the two opens; in very simple testing this delay seems to be about 500ms.

From https://github.com/passcod/notify/issues/147

pnkfelix (Feb 12 2019 at 12:18, on Zulip):

From that same issue:

"What's your high-level goal here? If you're just interested in a single file, James's suggestion would work. If you're trying to monitor an entire hierarchy of files, you're going to have to rethink you're assumptions. FSEvents is designed to notify you about events in the file system so that you can then go look in the file system to find the current state of things and sync based on that. It's not designed to feed you a stream of events with sufficient fidelity to reconstruct how the file system go into its current state."

https://lists.apple.com/archives/filesystem-dev/2016/Mar/msg00004.html

matklad (Feb 12 2019 at 12:21, on Zulip):

yeah, that's roughly what we are doing: if we receive notification, a separate thread goes and reads the file from disk

pnkfelix (Feb 12 2019 at 12:21, on Zulip):

Hmm, interesting: Adding the sleep_ms call does seem to cause two events to be observed, rather than just one (for the Create(.../a/b/baz.rs), that is). But regardless of whether the two events have been folded into one or not, I still get the timeout.

pnkfelix (Feb 12 2019 at 12:22, on Zulip):

Oh wait, I should be seeing the Write event, and I'm not. (or let me double-check that)

pnkfelix (Feb 12 2019 at 12:23, on Zulip):

hmm. Is the fs::write call going to cause a Create event to be emitted here?

matklad (Feb 12 2019 at 12:24, on Zulip):

it shouldn't cause a create

matklad (Feb 12 2019 at 12:24, on Zulip):

file already exists

pnkfelix (Feb 12 2019 at 12:25, on Zulip):

Yeah that's what I thought too, but I'm definitely seeing a Create (and only a Create) after the long pause injected by the sleep_ms

matklad (Feb 12 2019 at 12:25, on Zulip):

and this is a write(2), not an atomic editors create temp file & rename

matklad (Feb 12 2019 at 12:26, on Zulip):

aha! That explains the bug

pnkfelix (Feb 12 2019 at 12:26, on Zulip):

what? Is the file being auto-deleted or something in the interim?

matklad (Feb 12 2019 at 12:27, on Zulip):

if we get Create instead of Change, we don't emit a change because file already exists in the vfs

pnkfelix (Feb 12 2019 at 12:27, on Zulip):

ah. and so we don't see any VfsChange and so the task times out

pnkfelix (Feb 12 2019 at 12:27, on Zulip):

So the next Q might be: WHy are we seeing a Change here

pnkfelix (Feb 12 2019 at 12:28, on Zulip):

I'm going to rename this topic

matklad (Feb 12 2019 at 12:28, on Zulip):

why we get Create in the first place is beyond me, but I guess that is just "operating systems are diffrent"

matklad (Feb 12 2019 at 12:28, on Zulip):

I think "why" should not matter: I think the high-level bug is that we differentiate between Crate/Change at all

matklad (Feb 12 2019 at 12:29, on Zulip):

Given that this all vfs thing is supposed to be eventually (or quiescently) consistent, we should rely on telling if the file is created or changed.

matklad (Feb 12 2019 at 12:29, on Zulip):

s/rely/not rely/

pnkfelix (Feb 12 2019 at 12:30, on Zulip):

(I'm going to see if I can adjust Zulip's hyperlinking for our stream.)

matklad (Feb 12 2019 at 12:30, on Zulip):

will try to write a fix shortly!

pnkfelix (Feb 12 2019 at 12:32, on Zulip):

(I'm going to see if I can adjust Zulip's hyperlinking for our stream.)

((did someone fix it in parallel with me, or did I just not notice two hyperlinks before...?))

matklad (Feb 12 2019 at 12:32, on Zulip):

/me touched nothing

pnkfelix (Feb 12 2019 at 12:33, on Zulip):

Ah okay I think @davidtwco assisted

pnkfelix (Feb 12 2019 at 12:33, on Zulip):

You'll see now that if you write rust-analyzer#<number> in a topic or in a chat message, the linkifer will operate on it.

pnkfelix (Feb 12 2019 at 12:33, on Zulip):

e.g. rust-analyzer#734

pnkfelix (Feb 12 2019 at 12:34, on Zulip):

we have found this to be pretty useful way to structure topics

pnkfelix (Feb 12 2019 at 12:35, on Zulip):

though it is a bit funny that there are now two arrows (one pointing to rust-analyzer/rust-analyzer#734 and another to rust-lang/rust#734); unfortunately the Zulip's OS X app API does not give any tooltip or other preview of which arrow is which...

matklad (Feb 12 2019 at 13:02, on Zulip):

filed https://github.com/rust-analyzer/rust-analyzer/pull/805 We might need the delay as well though

davidtwco (Feb 12 2019 at 13:02, on Zulip):

(linkifier will work for PRs too: rust-analyzer#805)

matklad (Feb 12 2019 at 13:03, on Zulip):

Can it work in the reverse direction? It's easier to copy URL from the address bar than to type in the pr number :)

davidtwco (Feb 12 2019 at 13:05, on Zulip):

Not that I'm aware of.

pnkfelix (Feb 12 2019 at 13:07, on Zulip):

I don't think its a breach of ettiquette to post urls if that better fits your workflow. (Well, of course its nicer for the topic for it to use the shorter linkified version)

pnkfelix (Feb 12 2019 at 13:07, on Zulip):

The main benefit of linkification is when one is typing on the fly, IMO.

davidtwco (Feb 12 2019 at 13:07, on Zulip):

You can use standard Markdown linking syntax, so #805.

davidtwco (Feb 12 2019 at 13:08, on Zulip):

You can use standard Markdown linking syntax, so [#805](https://github.com/rust-analyzer/rust-analyzer/pull/805).

pnkfelix (Feb 12 2019 at 13:08, on Zulip):

Oh yeah, that might be the easiest way, when writing comments in the channel. But does that syntax work in topics?

pnkfelix (Feb 12 2019 at 13:08, on Zulip):

(for some reason I thought it didn't)

davidtwco (Feb 12 2019 at 13:08, on Zulip):

Not sure. I'd just use a linkifier in topics.

pnkfelix (Feb 12 2019 at 13:09, on Zulip):

yeah it doesn't work (I just tested; it renders as the raw markdown)

matklad (Feb 12 2019 at 13:24, on Zulip):

circling back to the original topic, looks like rust-analyzer#805 didn't actually fix the issue, even with added timeouts (the ad-hoc mac build in rust-analyzer#803 fails) :(

pnkfelix (Feb 12 2019 at 13:25, on Zulip):

yeah I'm looking into it

pnkfelix (Feb 12 2019 at 15:50, on Zulip):

okay it looks to me like io::handle_change is falling into the None branch here a lot.

pnkfelix (Feb 12 2019 at 15:50, on Zulip):

ir

pnkfelix (Feb 12 2019 at 15:51, on Zulip):

or rather, its falling into that branch during the lookup of .../a/b/baz.rs (where ... is the tempdir)

pnkfelix (Feb 12 2019 at 15:51, on Zulip):

oh, that makes sense for a Create call.

pnkfelix (Feb 12 2019 at 15:52, on Zulip):

(or maybe it doesn't)

matklad (Feb 12 2019 at 15:52, on Zulip):

find should return Some even if file does not exist

pnkfelix (Feb 12 2019 at 15:53, on Zulip):

yeah it seems like the code there is assuming that find will give you the root even if the file itself hasn't been created

matklad (Feb 12 2019 at 15:53, on Zulip):

It doesn't do syscalls and works just on path strings

pnkfelix (Feb 12 2019 at 15:53, on Zulip):

but does that match the implementation of find ?

matklad (Feb 12 2019 at 15:53, on Zulip):

Could this be someting about case-insensitive file system?

matklad (Feb 12 2019 at 15:54, on Zulip):

contains is this

matklad (Feb 12 2019 at 15:54, on Zulip):

it checks for prefix

pnkfelix (Feb 12 2019 at 15:54, on Zulip):

hmm okay I'll look deeper

matklad (Feb 12 2019 at 15:55, on Zulip):

There's a single place in the test where we expect None:

    // should be ignored
    fs::create_dir_all(dir.path().join("a/target")).unwrap();
    fs::write(&dir.path().join("a/target/new.rs"), "ignore me").unwrap();
matklad (Feb 12 2019 at 15:56, on Zulip):

everything else should return Some

pnkfelix (Feb 12 2019 at 16:01, on Zulip):

This strip_prefix call is returning Err(StripPrefixError) for "/private/var/folders/68/bxt927zn2tnbpj7qf4_4vpk80000gn/T/.tmphSepLW/a/b/baz.rs"

matklad (Feb 12 2019 at 16:03, on Zulip):

SepLW <- I bet that's case sensitivity

pnkfelix (Feb 12 2019 at 16:04, on Zulip):

more detail:
path.strip_prefix(&self.root) is returning that error for

path:  "/private/var/folders/68/bxt927zn2tnbpj7qf4_4vpk80000gn/T/.tmpWuoGmP/a/b/baz.rs",
self.root:  "/var/folders/68/bxt927zn2tnbpj7qf4_4vpk80000gn/T/.tmpWuoGmP/a"
matklad (Feb 12 2019 at 16:04, on Zulip):

so, the first component is different...

pnkfelix (Feb 12 2019 at 16:04, on Zulip):

I think its the /private there, yeah

pnkfelix (Feb 12 2019 at 16:07, on Zulip):

On "OS X", /var is symlink to /private/var

pnkfelix (Feb 12 2019 at 16:08, on Zulip):

Does VFS attempt to normalize symbolic links in its internal paths? If not, should it?

matklad (Feb 12 2019 at 16:08, on Zulip):

No, it doesn't

matklad (Feb 12 2019 at 16:08, on Zulip):

I think it should, though I am not 100% sure

pnkfelix (Feb 12 2019 at 16:09, on Zulip):

because under the assumption that this notification is constructed by the notification library, I'm not sure how we otherwise get around ths

pnkfelix (Feb 12 2019 at 16:09, on Zulip):

other than by special casing

matklad (Feb 12 2019 at 16:09, on Zulip):

Ie, the notify library definitely returns canonical paths, but I am not sure if language client returns canonical paths...

pnkfelix (Feb 12 2019 at 16:10, on Zulip):

(i.e. instead of trying to normalize symlinks everywhere, we could just recognize that this behavior can arise for the tempdir results and just normalize their path components)

Jeremy Kolb (Feb 12 2019 at 16:10, on Zulip):

Maybe but windows FAT32/NTFS is also case-insensitive and windows doesn't seem to share the same problem.

matklad (Feb 12 2019 at 16:10, on Zulip):

yeah, I was wrong, it's not case sensitity after all

matklad (Feb 12 2019 at 16:12, on Zulip):

The safest thing to do would be to change RootConfig to

RootConfig {
    root: PathBuf,
    // result of root.canonicalize() if it is different from root
    canonical_root: Option<PathBuf>
}

and check both roots in contains

pnkfelix (Feb 12 2019 at 16:13, on Zulip):

Okay I'll look into that.

matklad (Feb 12 2019 at 16:22, on Zulip):

I've looked at the LSP speck, and how paths should be handled is basically unspecified. The relevant issue is #669.

matklad (Feb 12 2019 at 16:22, on Zulip):

@pnkfelix given that we've only hit this in tests, perhaps we can use a less general solution? Let's call .canonicalize() in the test itself?

pnkfelix (Feb 12 2019 at 16:23, on Zulip):

urrr

pnkfelix (Feb 12 2019 at 16:23, on Zulip):

I'm skeptical of that

pnkfelix (Feb 12 2019 at 16:23, on Zulip):

well, I suppose it depends

matklad (Feb 12 2019 at 16:23, on Zulip):

I think that the setup with two roots is probably right long term, but I don't feel too easy about makeing VFS code more complcated.

pnkfelix (Feb 12 2019 at 16:23, on Zulip):

your thinking is that the use of tempdir is limited to tests alone?

matklad (Feb 12 2019 at 16:24, on Zulip):

I was thinking about that yeah, but now I understand that I am wrong ;-(

matklad (Feb 12 2019 at 16:24, on Zulip):

User totally can have a project in the symlinked directory

pnkfelix (Feb 12 2019 at 16:24, on Zulip):

OMG i am late to pick up myso

pnkfelix (Feb 12 2019 at 16:24, on Zulip):

my son

pnkfelix (Feb 12 2019 at 16:24, on Zulip):

bye!

matklad (Feb 12 2019 at 16:24, on Zulip):

bye!

pnkfelix (Feb 13 2019 at 21:16, on Zulip):

I don't understand. My recollection of my day today is that I did many runs and each time I would search the output log for the text new.rs and always came up empty

pnkfelix (Feb 13 2019 at 21:17, on Zulip):

and yet somehow I posted a gist that clearly had a rename of new.rs in the output

pnkfelix (Feb 13 2019 at 21:17, on Zulip):

oh wait, no, I added that line

pnkfelix (Feb 13 2019 at 21:17, on Zulip):

(instrumented my own fn fs_rename function that i added)

matklad (Feb 13 2019 at 21:18, on Zulip):

I actually wonder if we should spend more time chasing this bug at this time? There's an upstream issue about adding an easier and cross platform API:https://github.com/passcod/notify/issues/175

matklad (Feb 13 2019 at 21:19, on Zulip):

If it is implemented (and https://github.com/vemoo/ is working on it) , than presumably this will become the problem of the library.

matklad (Feb 13 2019 at 21:19, on Zulip):

That is, I somewhat blame the API here, which looks like it exposes platform differences :)

pnkfelix (Feb 13 2019 at 21:20, on Zulip):

well even vemoo said they were surprised by the results here on this bug

pnkfelix (Feb 13 2019 at 21:22, on Zulip):

but your point is that you think we might manage to actually build upon something that exhibits consistent behavior on all platforms? It would be nice.

matklad (Feb 13 2019 at 21:22, on Zulip):

Yeah, that's the problem: we are observing platform differences which arguably should be handled by the upstream library.

pnkfelix (Feb 13 2019 at 21:23, on Zulip):

I can agree that it may not be a great idea to try to fix this problem 100% right now

pnkfelix (Feb 13 2019 at 21:23, on Zulip):

and maybe better to just #[cfg]-out this (or all) vfs tests on OS X for now

matklad (Feb 13 2019 at 21:24, on Zulip):

Like, the vfs/io module is waaay to complicated, and the best long-term strategy would be to move a bulk of it to a completely general purpose library. So I suggest watchicng how notify#175 develops for the time being. There's also a planned rewrite of the notify library...

matklad (Feb 13 2019 at 21:24, on Zulip):

But it's good that we've fixed that canonicalize bug! That's defenitelly was a problem

pnkfelix (Feb 13 2019 at 21:24, on Zulip):

still I want to at least get far enough here to see if there is a notify bug lurking here

matklad (Feb 13 2019 at 21:26, on Zulip):

Good! But don't feel bad about cfg-ing out this part of the test on MacOs if you get to far in the weeds!

matklad (Feb 13 2019 at 21:26, on Zulip):

also, could you rustfmthttps://github.com/rust-analyzer/rust-analyzer/pull/818?

pnkfelix (Feb 13 2019 at 21:28, on Zulip):

oh yeah let me take care of that

pnkfelix (Feb 13 2019 at 21:33, on Zulip):

What do you think about trying to add explicit .sync_data() or .sync_all() calls to the VFS tests, though?

pnkfelix (Feb 13 2019 at 21:33, on Zulip):

Do you think that shouldn't be necessary? It seems more robust to me than just adding delays (i.e. calls to sleep())...

pnkfelix (Feb 13 2019 at 21:34, on Zulip):

(assuming one either puts them into the right places, or consistently uses a helper that always syncs, like the one I added in the gist I posted)

matklad (Feb 13 2019 at 21:35, on Zulip):

I would think that neither delays nor syncs should ideally be necessary.

I think the test currently actually is written in a pretty good way, in that we always try to pump all of the events after each modification.

That is, the test should work because of causality, not because of the good timing.

pnkfelix (Feb 13 2019 at 21:36, on Zulip):

so an fs::write(...) invocation should cause the notify library to fire, regardless of whether the write has been pushed out to the disk?

pnkfelix (Feb 13 2019 at 21:37, on Zulip):

or your point is that the process_tasks call should be waiting long enough for the event to come through?

matklad (Feb 13 2019 at 21:38, on Zulip):

More precisely, fs::write should trigger notify withing three seconds after the file descriptor is closed

matklad (Feb 13 2019 at 21:38, on Zulip):

Like, it's not unreasonable to assume that file systems work this way, but I might be proven wrong here

matklad (Feb 13 2019 at 21:40, on Zulip):

or your point is that the process_tasks call should be waiting long enough for the event to come through?

basically, yes. The test guarantees that we can't loose events (modulo bugs), and I hope that eventually we see all of the events, where "eventually" is less than 3 seconds

pnkfelix (Feb 13 2019 at 21:40, on Zulip):

I can understand in principle. I guess I just wish I wasn't seeing behavior vary depending on whether I include the sync calls or not. But I should be taking more precise notes on which of the changes I am making here actually matter, and in what way they matter.

pnkfelix (Feb 13 2019 at 21:42, on Zulip):

Hmm this also seems potentially relevant: http://wingolog.org/archives/2018/05/21/correct-or-inotify-pick-one

matklad (Feb 13 2019 at 21:42, on Zulip):

yeah... I think you had some variability in exact raw events you have been observing? That should be OK I think: it is allowed to miss some of raw events as long as the last one is registered

pnkfelix (Feb 13 2019 at 21:44, on Zulip):

Hmm this also seems potentially relevant: http://wingolog.org/archives/2018/05/21/correct-or-inotify-pick-one

(or maybe there's nothing there you don't already know)

matklad (Feb 13 2019 at 21:46, on Zulip):

interesting read! Together with that MacOS docs I think it implies that getting precise list of events is impossible, and useless, due to TOCTTOU.

We however don't need a precise list: we need to know if things "could have changed" to be correct (that whole comment about quiescent consistency on TastResult). I think that should be achieavable.

matklad (Feb 13 2019 at 21:47, on Zulip):

well, better to say I hope :-)

pnkfelix (Feb 13 2019 at 21:51, on Zulip):

this question may be pointless, given your earlier statement (that we'd be better off switching to a different/better cross-platform API), but I figure I'll ask it anyway: The vfs test architecture, if I understand correctly, seems to be built on the assumption that you can predict the exact number of events you should always receive in response to the operations that are performed in between each call to process_tasks, is that right?

pnkfelix (Feb 13 2019 at 21:52, on Zulip):

this seems fundamentally at odds with the idea that sometimes events can be collapsed or missed

pnkfelix (Feb 13 2019 at 21:53, on Zulip):

I don't have any immediate suggestions for a better test architecture, other than maybe allowing for a subset subsequence of the events to be observed and still consider it a passing test, depending on which subsequence you observe, that is.

Laurențiu Nicola (Feb 13 2019 at 22:00, on Zulip):

I think it implies that getting precise list of events is impossible, and useless, due to TOCTTOU

on windows too, if the OS buffer overflows; you're supposed to re-read your directory and start over, but after a cursory glance I'm not sure that the notify crate handles that

matklad (Feb 13 2019 at 22:03, on Zulip):

There's a RESCAN event for that I think. As long as we know that we should rescan, I think we are fine.

Laurențiu Nicola (Feb 13 2019 at 22:06, on Zulip):

/// At the moment RESCAN events aren't emitted on Windows.

Laurențiu Nicola (Feb 13 2019 at 22:07, on Zulip):

Explorer has a "Refresh" right-click menu option, maybe we should have one

pnkfelix (Feb 14 2019 at 12:09, on Zulip):

Okay I did some digging and I think that one way out of this in the short term could be to use the raw events rather than the debounced ones

pnkfelix (Feb 14 2019 at 12:10, on Zulip):

I put up an issue at https://github.com/passcod/notify/issues/181

pnkfelix (Feb 14 2019 at 12:11, on Zulip):

the most bizarre thing is that in the common case, a rename of "foo" to "bar" on Mac OS X yields raw events that say that foo op was CREATE|RENAME|WRITE and bar op was REMOVE|RENAME. (Which strikes me as 100% backwards.) I don't know whether this is an OS X issue or some logic gone backwards in notify itself.

matklad (Feb 14 2019 at 12:17, on Zulip):

Interesting!

Conceptually, I think like we should use debounced events

pnkfelix (Feb 14 2019 at 12:18, on Zulip):

at this point I'm also wondering if part of the issue here is that notify is, AFAICT, still using the FSEvent backend

pnkfelix (Feb 14 2019 at 12:18, on Zulip):

despite saying that it would switch to kqueue?

pnkfelix (Feb 14 2019 at 12:18, on Zulip):

see here, namely: https://github.com/passcod/notify/pull/144#issuecomment-427560683

pnkfelix (Feb 14 2019 at 12:19, on Zulip):

maybe they mean in v5 it will use kqueue

matklad (Feb 14 2019 at 12:19, on Zulip):

Yeah, I think this is about v5

matklad (Feb 14 2019 at 12:19, on Zulip):

which now seems more remote than a year ago

pnkfelix (Feb 14 2019 at 12:20, on Zulip):

in any case at this point I am super curious if the problem is just someone swapped to/from somewhere in the notify/fsevent.rs code

matklad (Feb 14 2019 at 12:20, on Zulip):

yeah, I did a quick look at it, and it seems like a kind of code where it's easy to make subtle logic errors

pnkfelix (Feb 14 2019 at 12:21, on Zulip):

I'm going to do a quick check there, I'm already this deep into this

matklad (Feb 14 2019 at 12:24, on Zulip):

:+1: long term, we will need a reliable file watching service anyway, so it is a well-spent time

pnkfelix (Feb 14 2019 at 12:30, on Zulip):

hmm all of notify's fsevents.rs test suite is cfg'ed-out under a feature = "timing_tests". But several of them look like tests of basic functionality...

pnkfelix (Feb 14 2019 at 12:31, on Zulip):

(and of course they've all bitrotted)

pnkfelix (Feb 14 2019 at 12:32, on Zulip):

This comment in the notify.rs test suite may be relevant though:

    // macOS FsEvent needs some time to discard old events from its log.
    sleep_macos(10);
pnkfelix (Feb 14 2019 at 12:33, on Zulip):

(but that's only 10ms; I thought it might represent seconds but it doesnt.)

nagisa (Feb 14 2019 at 12:44, on Zulip):

the most bizarre thing is that in the common case, a rename of "foo" to "bar" on Mac OS X yields raw events that say that foo op was CREATE|RENAME|WRITE and bar op was REMOVE|RENAME. (Which strikes me as 100% backwards.) I don't know whether this is an OS X issue or some logic gone backwards in notify itself.

This is something that is prevalent and can happen on any OS. This is because the APIs report actual syscalls if not information from block level drivers. Then, applications may implement their moves however they want.

nagisa (Feb 14 2019 at 12:45, on Zulip):

i.e. "create -> copy -> delete old" is a very commonplace approach to implememnt file movement.

nagisa (Feb 14 2019 at 12:46, on Zulip):

in order to derive any sort of semantic information and to figure out what has happened, you must do some serious interpretation and guesswork.

pnkfelix (Feb 14 2019 at 12:46, on Zulip):

okay, but wouldn't you expect "delete old" to map to a delete of foo, not bar?

nagisa (Feb 14 2019 at 12:47, on Zulip):

oh that... well, all I can suggest is that most likely this is an artifact of the filesystem implementation in macos.

pnkfelix (Feb 14 2019 at 12:48, on Zulip):

there's definitely some magic happening in the layers here to reconstruct the operation from the events received from the FSEvents channel, I don't argue that point

matklad (Feb 14 2019 at 12:49, on Zulip):

and, as a reminder, we luckily don't actually care about the precise operation, we care only about the path that could have been affected.

pnkfelix (Feb 14 2019 at 12:50, on Zulip):

right. I would be entirely satisfied if I can coax notify into just telling us that something happened to any of {foo, bar}

pnkfelix (Feb 14 2019 at 12:50, on Zulip):

(as in, get it to report both names back to us and then we look up what happened.)

pnkfelix (Feb 14 2019 at 12:51, on Zulip):

but in the majority of the cases I'm looking at, the oddities in the raw event sequences seems to cause the debouncing to drop any mention of foo.

nagisa (Feb 14 2019 at 12:51, on Zulip):

and, as a reminder, we luckily don't actually care about the precise operation, we care only about the path that could have been affected.

Except it is entirely possible for the OS to not report any events for paths as well :slight_smile:

matklad (Feb 14 2019 at 12:51, on Zulip):

Yeah, that's what that "easy walkdir like API" issue is about

matklad (Feb 14 2019 at 12:52, on Zulip):

@nagisa hm, you mean I can write to a file and never be notified at all?

nagisa (Feb 14 2019 at 12:52, on Zulip):

Yes, that is very possible.

nagisa (Feb 14 2019 at 12:53, on Zulip):

A common example of that would be anything related to storage not directly attached to the same motherboard to which the host CPU is attached to. Think: network storage.

pnkfelix (Feb 14 2019 at 12:54, on Zulip):

...okay, but surely extreme cases like that should not lead us to lower our expectations for (reliable) behavior in the very common case of the local file system?

matklad (Feb 14 2019 at 12:54, on Zulip):

Sure, yeah

matklad (Feb 14 2019 at 12:55, on Zulip):

In general, we already ignore errors from the watcher. That is, if a watcher fails to start for path for "reasons", we just ignore it and proceede without waching

matklad (Feb 14 2019 at 12:55, on Zulip):

That works well because, for most of the changes, it's the editor who notifies us

matklad (Feb 14 2019 at 12:56, on Zulip):

Like, vfs watcher is a relatively new addition, we haven't watched a file system at all for a long time, and that actually worked pretty good. As is, you'd have to restart rust-analyzer when you switch git branches to force a rescan, but normal changes via VS Code worked ok

nagisa (Feb 14 2019 at 12:57, on Zulip):

I would prefer something that works every time. The practice shows that some people do use rust with at least networked storage, and network storage is definitely not the only example of this problem.

Whatever is done here, please just keep in mind that this is possible and provide options to your end users.

nagisa (Feb 14 2019 at 12:57, on Zulip):

(not that rustc works with networked storage right now, IIRC there are some ICEs here and there)

pnkfelix (Feb 14 2019 at 12:58, on Zulip):

I certainly wouldn't object to full refresh (or "rescan", whatever terminology is), either periodic and/or via-ui-command

matklad (Feb 14 2019 at 12:59, on Zulip):

Yeah, UI command for "refresh" is generally required.

pnkfelix (Feb 14 2019 at 13:03, on Zulip):

very odd. Some of the notify debouncing tests wait only 10ms after setting up the initial state. Others wait 35,000ms. (Specifically for Mac OS X, that is)

pnkfelix (Feb 14 2019 at 13:21, on Zulip):

(well, maybe not so odd, if people just add the delays in an ad hoc manner to make the tests pass. I just would have expected uniform treatment.)

matklad (Feb 14 2019 at 20:09, on Zulip):

So, the fundamental problem with vfs is that we don't test it on mac and windows on the CI...

At the same time, I don't want to add Windows and Mac to CI in general, to keep CI fast....

What about moving ra_vfs to a separate repo with a more extensive CI setup? There's nothing specific to Rust language in the vfs, so presumably we could just publish it to crates.io? It has a few local deps, but I think we can get rid of them.

pnkfelix (Feb 14 2019 at 20:24, on Zulip):

This sounds promising

matklad (Feb 15 2019 at 11:41, on Zulip):

I'll try moving ra_vfs outside then (not today, but soonish). Might be a good opportunity to check azure pipelines as a universal CI for all platforms as well.

One thing I am not sure about is if "memory overlay" bit should be in the separate repo or in the main repo. Currently it is a part of VFS, but I can image a factoring where VFS deals only with files on disk, and in-memory stuff is composed on top. I am inclined to move vfs as is.

pnkfelix (Feb 15 2019 at 12:01, on Zulip):

Okay cool

pnkfelix (Feb 15 2019 at 12:02, on Zulip):

I'm just happy to put a pin in this task. It was educational but also a bit frustrating at times.

pnkfelix (Feb 15 2019 at 12:04, on Zulip):

Regarding PR #837, I briefly considering trying to also put in all the 35 second delays that seemed to help, guarded under something like #[cfg(feature="slow_and_steady_because_fsevents")]. But I think in practice no one would ever use it.

Last update: Nov 12 2019 at 16:50UTC