Stream: rustdoc

Topic: New lint: unnecessary explicit link (#87799)


view this post on Zulip marian (Sep 04 2021 at 13:33):

Hi, I claimed this issue because it seemed like a good first one, and spent a few days looking at collect_intra_doc_links and around, starting from the point that was suggested in there (resolve_link).

I have one initial problem that is slowing down trying things and in general debugging/figuring out what's what. I figured I could execute these paths by creating a separate tiny cargo project, with one or two lines of docs on top of main, and use my stage 1 rustdoc on it. It's a pretty slow process: change a line in rustdoc, build, run it on another project. I doubt anyone works like that, what am I missing? Debug output is so thick I've had to change some lines to info! and I'm looking at that instead because it's more manageable. I feel there's some basic workflow thing I missed.

On that line, I also tried to find some unit tests that maybe would place me in that context, so that instead maybe i could just run some specific tests and get quicker feedback, but I'm afraid the unit tests cover rustdoc more generally, not the intra-doc links pass in particular, and the UI tests are probably going to be around the last things I'll be able to test :(

Finally, for something more concrete: i intend to add this check within resolve_link. In order to know whether [Text](path::to::Text) is equivalent to [Text], I believe I would need to run the actual resolution (resolve_with_disambiguator_cached) a second time on a "made up" MarkdownLink and check if the returned Res's DefId is equal to the one for the actual link. Does this make sense?

I have some more smaller questions but I think that's enough to start the topic. Thank you

view this post on Zulip Joshua Nelson (Sep 04 2021 at 14:06):

It's a pretty slow process: change a line in rustdoc, build, run it on another project. I doubt anyone works like that, what am I missing?

Nope, this is pretty much how I work :sweat_smile: how long are the builds taking you? For me they're around 20 seconds for an incremental change.

Debug output is so thick I've had to change some lines to info! and I'm looking at that instead because it's more manageable.

What log variable are you setting? I usually use RUSTDOC_LOG=rustdoc::passes::collect_intra_doc_links=debug, which filters out both trace! logs and anything from other modules.

On that line, I also tried to find some unit tests that maybe would place me in that context, so that instead maybe i could just run some specific tests and get quicker feedback, but I'm afraid the unit tests cover rustdoc more generally, not the intra-doc links pass in particular, and the UI tests are probably going to be around the last things I'll be able to test :(

I don't think they would speed things up much anyway - you'd still have to build rustdoc to run the tests. Have you found the intra-doc link tests? You can run them with x.py test src/test/rustdoc/intra-doc src/test/rustdoc-ui/intra-doc, and they run reasonably quickly (~7 seconds for me). You can add your own UI test which means you don't have to switch between directories to test your changes.

I believe I would need to run the actual resolution (resolve_with_disambiguator_cached) a second time on a "made up" MarkdownLink and check if the returned Res's DefId is equal to the one for the actual link.

Hmm, I think you're right that you don't need all of resolve_link, just resolve_with_disambiguator_cached. You shouldn't need a full MarkdownLink though, just the module, disambiguator, and path_str. You should already have the module as a parameter, and you can get the disambiguator and path_str by calling preprocess_link.

view this post on Zulip marian (Sep 04 2021 at 14:41):

how long are the builds taking you? For me they're around 20 seconds for an incremental change.

Just checked, it took a little under a minute. It's not terribly bad, I just wanted to make sure there wasn't a better way i wasn't aware of. :smile:

What log variable are you setting?

Here's my janky debug one-liner: rm -r -Force .\target\doc; $env:RUSTDOC_LOG = 'rustdoc::passes::collect_intra_doc_links=info'; $env:RUSTDOC = 'C:\src\rust\build\x86_64-pc-windows-msvc\stage1\bin\rustdoc.exe'; cargo doc (i started doing this on windows/powershell and now i don't know how to set the variable for the entire session so I'm just setting it before the command). This is not so bad because in any case I can just turn debug into info. My problem with debug is that there is so much more activity besides the processing of my own test docs that it gets lost. Now that i'm writing this down, maybe i just need to pipe the output into a file

Have you found the intra-doc link tests?

I saw them, they run fairly quickly (30s for rustdoc tests, 6s for rustdoc-ui), but since I would've needed to (i think) create the new lint and be able to output the warning in order to make use of those tests, i kind of discarded that option for now.

I'll spend some more time on this and have some more questions soon. Thanks a lot for the help! it's much appreciated

view this post on Zulip Joshua Nelson (Sep 04 2021 at 14:44):

since I would've needed to (i think) create the new lint and be able to output the warning in order to make use of those tests

Hmm, I think you should be able to add a failing test even before you add the lint.

Have you been using x.py check src/tools/rustdoc btw? It's a lot faster than a full build.

view this post on Zulip marian (Sep 04 2021 at 14:46):

good points, both. I'll see if adding a test is an easier way to get my changes to run than running rustdoc on a separate project

view this post on Zulip Noah Lev (Sep 04 2021 at 17:41):

There's also a hybrid solution that avoids switching directories, which is what I often use: create a single-file crate (foo.rs, say) and then run rustdoc +stage1 foo.rs. I often do this if I want to quickly test something.

view this post on Zulip Noah Lev (Sep 04 2021 at 17:42):

So that single-file crate could just be this:

//! [Iterator](std::iter::Iterator)

view this post on Zulip Joshua Nelson (Sep 04 2021 at 17:42):

that's assuming you have two terminals open though, right? at some point you need to run x.py build

view this post on Zulip Noah Lev (Sep 04 2021 at 17:42):

Yes, you do need to run x build.

view this post on Zulip Noah Lev (Sep 04 2021 at 17:42):

But it avoids having to switch directories

view this post on Zulip Noah Lev (Sep 04 2021 at 17:42):

Why would you need two terminals open?

view this post on Zulip Joshua Nelson (Sep 04 2021 at 17:42):

oh oh, you mean to put the single file in the checkout of rust-lang/rust

view this post on Zulip Noah Lev (Sep 04 2021 at 17:43):

Yes, that's what I usually do.

view this post on Zulip Joshua Nelson (Sep 04 2021 at 17:43):

I don't like doing that because the generated docs end up in doc/ which adds untracked files. But I'm realizing now that the docs for rust-lang/rust live in src/doc, not doc/, so it's easy enough to run rm -r doc.

view this post on Zulip marian (Sep 05 2021 at 23:28):

Thanks again for the suggestions and the help. I've managed to get a more manageable test case with Noah's suggestion, and got a basic proof of concept.

Turns out that even resolve_with_disambiguator_cached is too much for this. Since this is just an attempt at resolving a hypothetical link, we don't need the resolution failure to be reported so maybe resolve will be enough. It seems there is useful resolving logic in resolve_with_disambiguator, trying with the given disambiguator first and otherwise trying all namespaces, that should also run when resolving the shortcut link, but unfortunately this method also reports diagnostics. Maybe it could just return Result without side effects, and its callers could handle the diagnostics output.

view this post on Zulip Joshua Nelson (Sep 06 2021 at 00:20):

:+1: seems reasonable to change, just a little tricky

view this post on Zulip Joshua Nelson (Sep 06 2021 at 20:12):

wait hold on I misread:

It seems there is useful resolving logic in resolve_with_disambiguator, trying with the given disambiguator first and otherwise trying all namespaces, that should also run when resolving the shortcut link, but unfortunately this method also reports diagnostics.

I don't think we need to retry with a different disambiguator; it seems really weird for anyone to write [type@Iterator](type@std::iter::Iterator). Just trying the original link text seems fine.

view this post on Zulip Joshua Nelson (Sep 06 2021 at 20:13):

in particular, I don't think we should suggest changing [Iterator](type@Iterator) to [type@Iterator].

view this post on Zulip marian (Sep 06 2021 at 21:44):

that makes sense. So it would be reasonable to skip this whole check whenever a disambiguator has been specified

view this post on Zulip Joshua Nelson (Sep 06 2021 at 21:53):

I think we should skip the check whenever we're doing it for the lint. We should never suggest adding a disambiguator to the link text, even if it would resolve to a valid link.

view this post on Zulip marian (Sep 06 2021 at 21:57):

sorry, I meant to say: the entire new lint check (namely, the second resolution) should be skipped whenever a disambiguator has been specified. Even if it's on a link like (i don't know if this is valid) [Iterator](type@std::iter::Iterator) -- in this case we would let that be and will not warn or suggest anything.

view this post on Zulip Joshua Nelson (Sep 06 2021 at 22:01):

I don't see why your example is relevant - [Iterator] would be fine to suggest.

view this post on Zulip Joshua Nelson (Sep 06 2021 at 22:01):

we don't want to suggest a disambiguator in the link text: [type@Iterator]. But whether the current link target uses a disambiguator or not doesn't matter.


Last updated: Oct 11 2021 at 22:34 UTC