Stream: t-compiler/wg-incr-comp

Topic: Handling of 'devirtualized' paths


view this post on Zulip Aaron Hill (Apr 06 2021 at 14:53):

When we load SourceFiles from an external crate, we try to 'devirtualized' paths that point into the standard library: https://github.com/rust-lang/rust/blob/e1d49aaad4fde33469fdb786c29838a95a5d8a11/compiler/rustc_metadata/src/rmeta/decoder.rs#L1633-L1682
The result of the devirtualiztion depends on global untracked state (whether or not the rust-src exists during the current compilation session), represented by sess.real_rust_source_base_dir. As a result, adding/removing the rust-src compoonent can lead to an ICE now that we verify query result hashes.

Currently, all external queries depend on the eval_always query crate_hash, which ensures that they get invalidated if an upstream crate is changed. I think the simplest way to fix this issue would be to make crate_hash depend on real_rust_source_base_dir. Unfortunately, this would mean that the crate computed seen during compilation of a crate is not necessarily the same as the crate hash that gets seen by a downstream crate.

Another alternative would be to make a new query crate_dependecy(CrateNum) -> (Svh, String) that explcitily includes real_rust_source_base_dir in its result. This would then cause external queries to be re-computed as necessary, without the need to modify the crate hash.

I don't really like either of these solutions - I'm open to any other suggestions. The root issue is that data loaded from foreign crates depends on the state of our current compilation session,, but there's nothing we can do about that.

view this post on Zulip bjorn3 (Apr 06 2021 at 15:04):

I think we should keep using the virtualized path everywhere and only use the devirtualized path to actually read the source files. Currently builds are not reproducible with rust-src installed or not installed as it leaks into the debuginfo as far as I know.

view this post on Zulip Aaron Hill (Apr 06 2021 at 15:05):

I thought the leaking in the debuginfo was intentional, so that you get nice-looking paths in panic messages

view this post on Zulip Aaron Hill (Apr 06 2021 at 15:06):

is there another use-case for devirtualized paths?

view this post on Zulip bjorn3 (Apr 06 2021 at 15:06):

Devirtualized paths are used to load the source for sysroot crates to show in error messages.

view this post on Zulip bjorn3 (Apr 06 2021 at 15:07):

Maybe the debuginfo was also intentional, but I don't like it.

view this post on Zulip Aaron Hill (Apr 06 2021 at 15:07):

unfortunately, I think we cache diagnostics as part of a result

view this post on Zulip Aaron Hill (Apr 06 2021 at 15:08):

and you could potentially emit a warning with a span that points into a devirtualized path

view this post on Zulip Aaron Hill (Apr 06 2021 at 15:08):

so just using devirtualized paths for reading source files could still leak untracked global state into the query system

view this post on Zulip Aaron Hill (Apr 06 2021 at 15:10):

not to mention the fact that that Cargo is unaware of the existence of rustc-src, so you could have a mixture of devirtualized and virtualized paths in the same crate graph

view this post on Zulip Aaron Hill (Apr 06 2021 at 15:11):

I wonder if we could deprecate and remove the automatic detection, and require explictly passing in the path via a command-line argument

view this post on Zulip bjorn3 (Apr 06 2021 at 15:11):

Aaron Hill said:

so just using devirtualized paths for reading source files could still leak untracked global state into the query system

I mean delaying the devirtualization until the fs::read_to_string, so the SourceMap would still contain the virtualized paths.

view this post on Zulip Aaron Hill (Apr 06 2021 at 15:12):

hmm

view this post on Zulip Aaron Hill (Apr 06 2021 at 15:13):

that could work - we'd want some kind of wrapper around the RealFileName fields to ensure that a query can't use the actual path bytes


Last updated: Oct 21 2021 at 20:03 UTC