Stream: rustdoc

Topic: crate loading during link resolution


view this post on Zulip Aaron Hill (May 29 2021 at 21:22):

@Joshua Nelson I was talking with @GuillaumeGomez about the rustdoc ICE in https://github.com/mautamu/spirv-std-3

view this post on Zulip Aaron Hill (May 29 2021 at 21:22):

which is related to trying to resolve a link in a doc comment on a proc-macro crate

view this post on Zulip Aaron Hill (May 29 2021 at 21:22):

while we're documenting some other crate that depends on the proc-macro

view this post on Zulip Aaron Hill (May 29 2021 at 21:23):

looking at the ICE, the issue appears to be caused by rustdoc loading some crate (and therefore allocating a new CrateNum) that the tcx doesn't know about

view this post on Zulip Aaron Hill (May 29 2021 at 21:23):

which is why we get an ICE in tcx.crate_num(bad_def_id)

view this post on Zulip Joshua Nelson (May 29 2021 at 21:25):

@Aaron Hill the root cause is https://github.com/rust-lang/rust/issues/83761. The proximate cause is the hack to load crates early is missing a crate for some reason. I don't know why it would be missing

view this post on Zulip Aaron Hill (May 29 2021 at 21:25):

from what I can see, rustdoc clones the resolver, uses it to resolve things (including potentially loading new crate),and then uses the original TyCtxt (with its original resolver)

view this post on Zulip Aaron Hill (May 29 2021 at 21:25):

oh, so we try to load those crates before the TyCtxt is created?

view this post on Zulip Joshua Nelson (May 29 2021 at 21:25):

Yes

view this post on Zulip Joshua Nelson (May 29 2021 at 21:26):

Fixing the root cause is hard and will take a solid 2 months IMO, I don't think we should try to tackle that right away

view this post on Zulip Aaron Hill (May 29 2021 at 21:26):

ok, that makes sense

view this post on Zulip Joshua Nelson (May 29 2021 at 21:26):

I put in about 2 weeks on it a while back and had to stop for my mental health, I expected it to be a ton of work and it was even more than I expected

view this post on Zulip Joshua Nelson (May 29 2021 at 21:26):

Fixing the proximate issue should be a lot easier

view this post on Zulip Joshua Nelson (May 29 2021 at 21:27):

I don't really understand the MCVE they made

view this post on Zulip Joshua Nelson (May 29 2021 at 21:27):

I tried writing my own without a proc macro but it didn't reproduce the crash

view this post on Zulip GuillaumeGomez (May 29 2021 at 21:27):

I think the problem is around the proc-macro there

view this post on Zulip Aaron Hill (May 29 2021 at 21:27):

ok, here's my current guess

view this post on Zulip GuillaumeGomez (May 29 2021 at 21:27):

They are handled a bit differently and it impacts dependencies from what @Aaron Hill told me

view this post on Zulip Aaron Hill (May 29 2021 at 21:28):

the crate we're trying to link to (spirv_types) is referenced in two palces

view this post on Zulip Aaron Hill (May 29 2021 at 21:28):

one place is a normal dependency of the crate we're tryng to document (not a proc-macro crate)

view this post on Zulip Aaron Hill (May 29 2021 at 21:28):

the other place is in a doc-comment of a proc-macro crate - that proc-macro crate is a dependency of the crate we're trying to load

view this post on Zulip Aaron Hill (May 29 2021 at 21:29):

the first time through (when we try to load the crates), we fail for some reason (possibbly because of the fact that we don't load proc-macro dependencies)

view this post on Zulip Aaron Hill (May 29 2021 at 21:30):

during the second pass, we succeed, which means that our cloned resolver has a crate loaded that thr tcx doesn't know about

view this post on Zulip Aaron Hill (May 29 2021 at 21:30):

the crate spirv_types is only ever referenced from the proc-macro

view this post on Zulip Aaron Hill (May 29 2021 at 21:30):

which I think is important

view this post on Zulip Aaron Hill (May 29 2021 at 21:31):

yup

view this post on Zulip Aaron Hill (May 29 2021 at 21:31):

adding use spirv_types to the top of lib.rs fixes the ICE

view this post on Zulip Joshua Nelson (May 29 2021 at 21:32):

@Aaron Hill that tells the resolver about it, but there's actually an even narrower workaround, you can add an intra doc link to it in lib.rs

view this post on Zulip Joshua Nelson (May 29 2021 at 21:33):

And that will be picked up by rustdoc's early loader

view this post on Zulip Aaron Hill (May 29 2021 at 21:33):

oh, I see the problem

view this post on Zulip Aaron Hill (May 29 2021 at 21:33):

I'm pretty sure that importing the proc-macro does not affect the AST in the normal way

view this post on Zulip GuillaumeGomez (May 29 2021 at 21:34):

Can a crate be loaded twice and mess up IDs somehow?

view this post on Zulip Aaron Hill (May 29 2021 at 21:34):

as it does for other imports

view this post on Zulip Aaron Hill (May 29 2021 at 21:34):

so the early collector doesn't see it

view this post on Zulip Aaron Hill (May 29 2021 at 21:34):

let me test this

view this post on Zulip Joshua Nelson (May 29 2021 at 21:34):

@GuillaumeGomez you can test if that's the issue or not, just add a debug statement to the early loader that prints the link

view this post on Zulip Joshua Nelson (May 29 2021 at 21:35):

If you don't see spirv_types that means duplicate CrateIds aren't related (whether or not they exist; they still won't be relevant)

view this post on Zulip GuillaumeGomez (May 29 2021 at 21:35):

gonna check it then

view this post on Zulip Aaron Hill (May 29 2021 at 21:37):

@Joshua Nelson How does IntraLinkCrateLoader process attributes on itms imported from other crates?

view this post on Zulip Aaron Hill (May 29 2021 at 21:37):

e.g. if I do use some_other_crate::SomeStruct;

view this post on Zulip Aaron Hill (May 29 2021 at 21:38):

and SomeStruct has a doc comment link referencing third_party_crate

view this post on Zulip Aaron Hill (May 29 2021 at 21:38):

how does third_party_crate get loaded?

view this post on Zulip Joshua Nelson (May 29 2021 at 21:38):

I don't know. However the code is written; I think I assumed that would work because the attribute still gets loaded bit I haven't tested

view this post on Zulip GuillaumeGomez (May 29 2021 at 21:39):

reexported-items in rustdoc is kinda tape over tape currently...

view this post on Zulip Aaron Hill (May 29 2021 at 21:39):

that appears to be the problem here

view this post on Zulip Aaron Hill (May 29 2021 at 21:40):

I'm not sure how that even works for normal items

view this post on Zulip Aaron Hill (May 29 2021 at 21:40):

IntraLinkCrateLoader only implements visit_attribute and visit_item

view this post on Zulip Aaron Hill (May 29 2021 at 21:40):

which isn't going to try to 'walk through' any use statements

view this post on Zulip GuillaumeGomez (May 29 2021 at 21:40):

I only tried to add visit_foreign and same for macros

view this post on Zulip GuillaumeGomez (May 29 2021 at 21:41):

I wonder if I didn't try visit_use_tree unsuccessfully too...

view this post on Zulip Aaron Hill (May 29 2021 at 21:42):

the answer is that it doesn't work :)

view this post on Zulip Aaron Hill (May 29 2021 at 21:42):

I changed the macros crate to a normal crate (exporting a struct), and got the same ICE

view this post on Zulip GuillaumeGomez (May 29 2021 at 21:42):

Ah right I did! The problem was that I can't get a DefId from a NodeId but ohterwise it didn't change anything

view this post on Zulip Aaron Hill (May 29 2021 at 21:42):

I'll push up a fork

view this post on Zulip GuillaumeGomez (May 29 2021 at 21:43):

oh, that's interesting

view this post on Zulip GuillaumeGomez (May 29 2021 at 21:43):

so the problem is reexported items once again

view this post on Zulip Aaron Hill (May 29 2021 at 21:43):

https://github.com/Aaron1011/spirv-std-3/tree/no-macro

view this post on Zulip Aaron Hill (May 29 2021 at 21:43):

so, the simplest solution would be to skip processing of intra-doc links on items imported from another crate

view this post on Zulip Aaron Hill (May 29 2021 at 21:44):

which should hopefully be a small enough diff for a beta backport

view this post on Zulip Joshua Nelson (May 29 2021 at 21:44):

That is a breaking change

view this post on Zulip Joshua Nelson (May 29 2021 at 21:44):

It means all the links will be broken in the docs

view this post on Zulip Aaron Hill (May 29 2021 at 21:44):

oh

view this post on Zulip Aaron Hill (May 29 2021 at 21:44):

because re-exported links work if the crate is already imported?

view this post on Zulip Joshua Nelson (May 29 2021 at 21:44):

Because docs on the original are still displayed whenever docs are inlined

view this post on Zulip Aaron Hill (May 29 2021 at 21:45):

I see

view this post on Zulip Aaron Hill (May 29 2021 at 21:45):

a more complicated solution would be to disable crate loading in the TyCtxt resolver

view this post on Zulip Aaron Hill (May 29 2021 at 21:45):

and return an error whenever it would occur

view this post on Zulip Aaron Hill (May 29 2021 at 21:46):

which will cause us to fail to resolve the problematic links, instead of producing an ICE

view this post on Zulip Joshua Nelson (May 29 2021 at 21:46):

That is also not very satisfying; the links will still be broken - yeah

view this post on Zulip Aaron Hill (May 29 2021 at 21:46):

did they ever work?

view this post on Zulip Joshua Nelson (May 29 2021 at 21:46):

Is it not possible to load the attributes of the original item before the tcx is built?

view this post on Zulip Joshua Nelson (May 29 2021 at 21:46):

@Aaron Hill yes, back when rustdoc loaded all crates unconditionally

view this post on Zulip Aaron Hill (May 29 2021 at 21:46):

the specific kind of case in that repository, I mean - not the general re-exported links where the crate is already loaded

view this post on Zulip Aaron Hill (May 29 2021 at 21:47):

well, an alternate solution would be to try to 'fully explore' all of the items in the imported visitor

view this post on Zulip Aaron Hill (May 29 2021 at 21:47):

which I think would mean trying to 'follow' use statements in the AST visitor

view this post on Zulip Aaron Hill (May 29 2021 at 21:47):

not sure how hard that would be

view this post on Zulip GuillaumeGomez (May 29 2021 at 21:48):

I tried but got stuck quicky trying to get something out of the NodeId...

view this post on Zulip GuillaumeGomez (May 29 2021 at 21:50):

I was actually able to get a DefId from the NodeId, but I think it was referring to the use and not to the item imported (but maybe I'm wrong there)

view this post on Zulip Joshua Nelson (May 29 2021 at 21:52):

Aaron Hill said:

well, an alternate solution would be to try to 'fully explore' all of the items in the imported visitor

I think it's worth trying, at least for an hour or so.

@Aaron Hill interested to hear your opinions on the root cause btw - I suggested making the resolver still available in the TyCtxt, which @Vadim Petrochenkov was hesitant to do because it's a giant change. But the rustdoc change he suggested is at least as big and doesn't help with long-term goals the way making the resolver always available does. https://github.com/rust-lang/rust/issues/83761#issuecomment-812723093

Probably the next step forward for that is to make an MCP and then dedicate lots of time to getting it working.

view this post on Zulip Aaron Hill (May 29 2021 at 21:53):

I don't have a strong opinion on that

view this post on Zulip Aaron Hill (May 29 2021 at 21:53):

I know very little about how how path resolution works

view this post on Zulip Joshua Nelson (May 29 2021 at 21:53):

np, I will try to do some reading on proposals for making path resolution incremental so I can summarize them in the MCP

view this post on Zulip GuillaumeGomez (May 29 2021 at 21:55):

Thanks to both of you. In the meantime, if we can't make a fix, there is still https://github.com/rust-lang/rust/pull/85749 (which revert to loading ALL crates)

view this post on Zulip GuillaumeGomez (May 29 2021 at 21:56):

We still have a few weeks before next release though, so let's just keep it as backup

view this post on Zulip Vadim Petrochenkov (May 30 2021 at 09:47):

I see there's some progress on minimizing #84738, could someone post a minimized self-contained reproduction to https://github.com/rust-lang/rust/issues/84738?

view this post on Zulip Joshua Nelson (May 30 2021 at 12:05):

(deleted)

view this post on Zulip Joshua Nelson (May 30 2021 at 12:09):

Done

view this post on Zulip Vadim Petrochenkov (Aug 07 2021 at 23:10):

I'm going to revert https://github.com/rust-lang/rust/pull/85749 so I've been investigating https://github.com/rust-lang/rust/issues/84738.

view this post on Zulip Vadim Petrochenkov (Aug 07 2021 at 23:13):

It looks like the issue reproduces only when doc links are used in a "full form" like this:

/// [`SomeStruct`]
///
/// [`SomeStruct`]: proc_macro_dep::SomeStruct

but not

/// [`proc_macro_dep::SomeStruct`]

I suspect that IntraLinkCrateLoader simply didn't visit paths in full links like this.


Last updated: Oct 11 2021 at 22:34 UTC