Stream: t-compiler/rust-analyzer

Topic: Diagnostics working group?


matklad (Mar 29 2021 at 11:33, on Zulip):

Hey, with proc-macros enabled by default, I think we no longer have excuses for not producing diagnostics for everything we can diagnose. It seems like it's high time to make our diagnostics infra "production ready". This includes three things:

This sounds like a great task for the next sprint. Anyone is interested in this topic?

Florian Diebold (Mar 29 2021 at 12:08, on Zulip):

also a strategy for merging our diagnostics with rustc's, IMO

Florian Diebold (Mar 29 2021 at 12:09, on Zulip):

starting to emit typechecker diagnostics is my plan after finishing the chalk_ir move as well

Florian Diebold (Mar 29 2021 at 12:16, on Zulip):

so, I guess I am interested, but currently still mostly focused on Chalk

matklad (Mar 29 2021 at 12:19, on Zulip):

@Florian Diebold by "merging" you mean deduplication in the UI, or sharing some common library?

Florian Diebold (Mar 29 2021 at 12:20, on Zulip):

yeah, deduplication

matklad (Mar 29 2021 at 12:25, on Zulip):

That falls under "polishing UX". Shouldn't be too hard -- we need to match diagnostic codes and see if the ranges intersect.

Florian Diebold (Mar 29 2021 at 12:25, on Zulip):

yeah

matklad (Mar 29 2021 at 12:26, on Zulip):

Well, there's this infinitelly annoying issue that VS Code won't shift ranges of diagnostics on editing, but I don't feel we can do anything on our side to fix it, short of using Emacs instead.

Florian Diebold (Mar 29 2021 at 12:27, on Zulip):

hmm. I think we've previously discussed this, but isn't this a problem with the rustc diagnostics specifically because we only refresh them on save, but (have to) resend them on every change?

Jonas Schievink [he/him] (Mar 29 2021 at 12:27, on Zulip):

yeah, I think that's on us

bjorn3 (Mar 29 2021 at 12:28, on Zulip):

When we get the diagnostics form rustc could we try to match the spans to tokens and store some relatively stable id for the tokens? And then re-compute the spans every time we push the diagnostics to the editor.

matklad (Mar 29 2021 at 12:29, on Zulip):

That works for check on save diags, but not from the ones that are produced by a problem matcher (I think?)

bjorn3 (Mar 29 2021 at 12:29, on Zulip):

Probably. But at least it may solve half of the problem.

Florian Diebold (Mar 29 2021 at 12:30, on Zulip):

the problem matcher is a pure VSCode issue anyway, the problem with the check on save diagnostics exists in every client

matklad (Mar 29 2021 at 12:33, on Zulip):

Also, as a general reminder, there's this grand diagnostics dilemma for IDEs:

At the moment, rust-analyzer works in the first paradigm. It might be the case that, with explicit crate graph, good incremental and fast implementation, the second would be feasible even for arbitraty big projects.

IntelliJ does 1 as well. Roslyn I believe dos 2 by default, but they have a checkbox to switch to 1 for large projects.

This also intersects with the search dilemma -- should we use tri-gram text search + semantic prunning (today's approach) or sould we search semantic info directly?

bjorn3 (Mar 29 2021 at 12:34, on Zulip):

An option to switch between the two methods would be nice.

bjorn3 (Mar 29 2021 at 12:36, on Zulip):

This also intersects with the search dilemma -- should we use tri-gram text search + semantic prunning (today's approach) or sould we search semantic info directly?

The current way doesn't result in reduced accuracy, right?

Jonas Schievink [he/him] (Mar 29 2021 at 12:36, on Zulip):

For macros it does

Jonas Schievink [he/him] (Mar 29 2021 at 12:37, on Zulip):

We could make it expand every single macro and search the outputs too

Florian Diebold (Mar 29 2021 at 12:37, on Zulip):

personally, I would prefer 1) with a button that lists all diagnostics everywhere and lets you jump to them

Jonas Schievink [he/him] (Mar 29 2021 at 12:39, on Zulip):

I think we need GC in order to make option 2 work in practice

Jeremy Kolb (Mar 29 2021 at 13:50, on Zulip):

Diagnostic pull support is being played with right now for LSP 3.17

Jonas Schievink [he/him] (Apr 09 2021 at 21:05, on Zulip):

This working group would probably be interested in building more general infra for diagnostics, so that we won't have to check #[allow] attributes for every single diagnostic

Jonas Schievink [he/him] (Apr 10 2021 at 19:05, on Zulip):

I also think we should, at some point:

...because they do tend to annoy users more than I'd like

Florian Diebold (Apr 10 2021 at 19:17, on Zulip):

we do get a lot of useful bug reports that way though ;) ... maybe we could deactivate them in weekly builds, but keep them in nightlies and self-compiled

Florian Diebold (Apr 10 2021 at 19:18, on Zulip):

OTOH there's a certain degree of experimentalness I would keep turned off by default all the time (e.g. if we were to add type mismatch errors right now)

Jonas Schievink [he/him] (Apr 10 2021 at 19:21, on Zulip):

yeah the bug reports are very useful

Jonas Schievink [he/him] (Apr 10 2021 at 19:22, on Zulip):

it'd be nice if we could offer a quick fix that turns off a diagnostic

Jonas Schievink [he/him] (Apr 10 2021 at 19:22, on Zulip):

don't think it's possible to make that persistent though

Jonas Schievink [he/him] (Apr 10 2021 at 19:22, on Zulip):

well, only in an ad-hoc way at least

Laurențiu (Apr 11 2021 at 11:00, on Zulip):

In Code we can, I guess

Florian Diebold (Apr 11 2021 at 11:05, on Zulip):

we could by making it possible to disable the diagnostics either in Cargo.toml or with attributes

Florian Diebold (Apr 11 2021 at 11:06, on Zulip):

of course people might not want to spam their code with rust_analyzer::allow attributes

Florian Diebold (Apr 11 2021 at 11:07, on Zulip):

then again, maybe the diagnostics where that would happen a lot would be better off disabled by default

Florian Diebold (Apr 11 2021 at 11:08, on Zulip):

(obviously, the more false positives a diagnostic has, the less we need feedback from users for it...)

Florian Diebold (Apr 11 2021 at 11:09, on Zulip):

maybe we could have a policy of "diagnostics that trigger on the RA codebase (and maybe also codebases X and Y) can't be on by default"

Jonas Schievink [he/him] (May 19 2021 at 14:51, on Zulip):

One thing this working group might be interested in is rustc's approach at making diagnostics compatible with incremental computation. It works by storing them all in a Vec while a query executes, and the query system effectively makes them part of the query result.

This sounds like a solution that could make our lives a lot easier, since we wouldn't have to manually buffer and emit them from every place that could possibly want to produce a diagnostic.

Jonas Schievink [he/him] (May 19 2021 at 14:51, on Zulip):

This would require salsa support however

Florian Diebold (May 19 2021 at 15:34, on Zulip):

I have doubts that approach would work for us? We still have to know which queries to trigger to get 'all' diagnostics anyway

Florian Diebold (May 19 2021 at 15:36, on Zulip):

I mean, it might be nice to not have to collect the diagnostics explicitly, but I don't know if it's worth it

Florian Diebold (May 22 2021 at 16:07, on Zulip):

I'm wondering a bit on what level diagnostics should actually exist. Are they an IDE thing, or a HIR (Semantics) thing? obviously things like type errors or missing match arms need to ultimately come from the compiler infrastructure, but should things like "replace filter_map(...).next() with find_map(..)"? At least we'd want this kind of thing to be implementable quite separately from the compiler parts (maybe even with plugins). OTOH we still need it to be integrated in Salsa. And quick fixes should almost certainly be calculated by the IDE parts and probably not live in the DB.

Florian Diebold (May 22 2021 at 16:10, on Zulip):

maybe the answer is that we shouldn't / don't need to have a unified concept of diagnostics in the HIR crates, just collect various kinds of errors that could be simple, disparate enums (e.g. one TypeError enum for various errors emitted during inference). And then those could be collected and turned into proper diagnostics by the same IDE-level mechanism that would also calculate purely IDE-level lints like replace-filter-map-next-with-find-map

Florian Diebold (May 22 2021 at 16:12, on Zulip):

and that one would probably not need to keep structured information about the error, which would make the whole dyn Diagnostic stuff unnecessary since it could just be one struct with code, location, description and fix

Florian Diebold (May 22 2021 at 16:19, on Zulip):

there's still the problem of needing to duplicate all these hir_*-level error enums for the hir facade though

Florian Diebold (May 22 2021 at 16:21, on Zulip):

also, it does leave the whole problem of "what's full set of possible diagnostics, and which queries do we need to run for it" to the IDE crate, but maybe that's not actually a problem, since we need to implement the code to describe the error and possibly provide a quickfix there anyway

Florian Diebold (May 22 2021 at 16:27, on Zulip):

with rust-analyzer/rust-analyzer#8713, maybe the interface for diagnostics in the IDE crate could just be basically a function taking some kind of HIR node, and then the driver could even automatically handle doing the tree walk, deciding for which files to calculate diagnostics, and #[allow] etc.

matklad (May 22 2021 at 18:41, on Zulip):

For production of diagnostics, I think there are clearly two very dissimilar paths.

Some diagnostics are emitted during analysis. For example, you are inferring types, and you fail to resolve a method call, so you should emit a diagnostic here.

Some diagnostics exist completely orthogonal to the standard code analysis. For example , doing unsafe op outside of unsafe block has no bearing on the actual analysis. This diagnostics should be computed by a separate walk over the code, after the analysis is done.

matklad (May 22 2021 at 18:43, on Zulip):

maybe the answer is that we shouldn't / don't need to have a unified concept of diagnostics in the HIR crates,

Yeah, the dyn Diagnostic is a wholly unnecessary boondoggle, we should eliminate that. I think that maybe we should do that first, and then maybe a better design emerges.

Florian Diebold (May 22 2021 at 18:55, on Zulip):

For production of diagnostics, I think there are clearly two very dissimilar paths.

Yeah, that's what I mean -- and I think the first path doesn't necessarily need to be viewed as part of the diagnostics infrastructure, it's just some additional outputs of some queries. Then there's only the second path, which can fully live in the IDE crate.

Florian Diebold (May 23 2021 at 10:00, on Zulip):

or to be a bit more concrete, do you think that whatever hir_ty produces should already have diagnostic codes or even a 1:1 relationship to the diagnostics shown to the user in the end?

matklad (May 23 2021 at 10:01, on Zulip):

No, I think ty should just emit enough data to reconstruct the diags

matklad (May 23 2021 at 10:02, on Zulip):

IIRC, we kinda already emit such data, we just then convert it to the dyn Diagnostic. If we remove this conversion, and just expose the data as is, the architecture becomes less messy

matklad (May 23 2021 at 10:04, on Zulip):

Then there's only the second path, which can fully live in the IDE crate.

I am not sure about this though. Feels like the second path is still subdivided in two.

From the implementation point of view, "don't call unsafe fn outside of unsafe block" and "don't call filter_map().next()" are exactly equivalent. But they are different in semantics. The first should be a compilation error, the second is a lint

Florian Diebold (May 23 2021 at 10:08, on Zulip):

so you'd expect the first one to still happen in some hir_* crate? I guess if we were implementing a full compiler, it would need to, so that it knows to not generate code, but that seems like the only meaningful difference?

matklad (May 23 2021 at 10:15, on Zulip):

I don't know. I guess, I'd expect it to happen in the hir crate, without the suffux (or in a special hir_diagnostics crate).

Argh, this is reeealy hard. One side of me thinks "let's move this to ide and just implement that on top of hir". But then another part thinks that the most natural way to express this diagnostics is for our desugared expressions.

matklad (May 23 2021 at 10:17, on Zulip):

Hm, I guess this is a real design question here:

Some IDE features could be implemented on top of such IRs more conveniently then on top of HIR. What to do?

matklad (May 23 2021 at 10:56, on Zulip):

Hm, mulling over this some more, it seems this is exactly the layering trade off we want to make.

The simple API make easy things simple, but it might make complex things impossible or overly awkward. Down the line, I expect us to find at least some situations where, when writing an assist, we wish we had a direct access to the database. But I am not a fan of exposing this -- it's useful that things like assits don't know that we are using chalk inside.

I wonder, how roslyn exposes thinkgs like control flow graph?

matklad (May 23 2021 at 11:02, on Zulip):

https://docs.microsoft.com/en-us/dotnet/api/microsoft.codeanalysis.semanticmodel.getoperation?view=roslyn-dotnet-3.9.0#Microsoft_CodeAnalysis_SemanticModel_GetOperation_Microsoft_CodeAnalysis_SyntaxNode_System_Threading_CancellationToken_

matklad (May 23 2021 at 11:03, on Zulip):

https://docs.microsoft.com/en-us/dotnet/api/microsoft.codeanalysis.flowanalysis.basicblock?view=roslyn-dotnet-3.9.0

matklad (May 23 2021 at 11:04, on Zulip):

so roslyn does expose some additional IR besides syntax nodes.

Florian Diebold (May 23 2021 at 11:08, on Zulip):

yeah, I would expect us to expose some more information through HIR when we need it and can figure out a nice API for it. And if the unsafe diagnostic needs such detailed information that it's nicer to implement inside HIR, I would expect that we implement some query in there that does the hard analysis part and then just use that for the final diagnostic living in ide. That's kind of orthogonal to the question of "should there be a concept of errors in the HIR crates" though, or "should the unsafety diagnostic live in HIR because it would be a compiler error"

matklad (May 23 2021 at 11:15, on Zulip):

Yeah, I don't think hir crates need to know about diagnostics, they need to emit facts, and its up to the higher layers to translate some of the facts into errror

matklad (May 23 2021 at 16:10, on Zulip):

Let me try to look into what it takes to remove dyn diagnostics (unless someone is already looking into this...)

matklad (May 23 2021 at 21:12, on Zulip):

Predictably, I ended up polishing the generate getter assist

matklad (May 24 2021 at 14:31, on Zulip):

Question: how we should test diagnostics in hir_xxx crates? At the moment, we rely on the Diagnostic crate to render messages an ranges.

I am tempted to just write some glue code for "rendering for testing"

matklad (May 24 2021 at 16:52, on Zulip):

Here's how far I got: https://github.com/rust-analyzer/rust-analyzer/pull/8973

matklad (May 24 2021 at 16:54, on Zulip):

I am not trying to remove dyn right of the bat, and just want to move it further up the stack. Modulo test, it successfully jumped from hir_expand to hir_ty

matklad (May 24 2021 at 20:02, on Zulip):

Hm, I am somewhat unsatisfied with our testing story here....

I think we don't want low-level code to be concerned with figuring out the specific range to be highlighted. That means we won't be able to use precise ranges in our low-level tests.

But we do want to have tests for ranges somewhere, so we'll have to duplicate tests somewhat for hight-level diagnostics...

matklad (May 25 2021 at 15:02, on Zulip):

Finished first part of this: https://github.com/rust-analyzer/rust-analyzer/pull/8973

matklad (Jun 12 2021 at 14:13, on Zulip):

Thinking more about this, I think it would be fine to only test diagnostic in the IDE layer. This is similar to various other auxilary data we have in InferenceResults. We don't have dedicated tests for method resolution tables and such, and that seems fine.

So, we'll probably just completely remove manual test traversals to collect diagnostics, and instead will have a giant hight-level "UI" test suite

matklad (Jun 13 2021 at 11:57, on Zulip):

https://github.com/rust-analyzer/rust-analyzer/pull/9245 finally got to the "end game" here, now need to only refactor the other diagnostics

matklad (Jun 13 2021 at 13:08, on Zulip):

Next question is, what should diagnostic store? At the moment, hir structs store AstPtr's, and that's stupid.

A diagnostic emerges from the guts of the compiler with something like ExprId. The hir crate then converts that to a ptr, often going via a syntax node. Finally, an IDE converts the ptr back to syntax node for rendering purposes. That's clearly suboptimal :D

Florian Diebold (Jun 13 2021 at 13:14, on Zulip):

imo we should have a HIR wrapper for ExprId (or the full tree-based HIR) and provide that, and then IDE can use that to map to the AST node

Florian Diebold (Jun 13 2021 at 13:14, on Zulip):

can we just add a simple hir::Expr for now?

matklad (Jun 13 2021 at 13:19, on Zulip):

Yeah, that's reasonable, although I am not sure about "just simple" part :rofl:

hir::Expr { id: ExprId, parent: DefWithBodyId } probably won't cut it

matklad (Jun 13 2021 at 13:19, on Zulip):

ExprId are desugared, but I am pretty sure that hir::Expr wants to be surface-level expr

Florian Diebold (Jun 13 2021 at 14:00, on Zulip):

hm. do we do anything other than source_map.expr_syntax to get the AST pointer right now though?

Florian Diebold (Jun 13 2021 at 14:01, on Zulip):

ok the one for record fields is a bit more complicated

matklad (Jun 13 2021 at 19:06, on Zulip):

Diagnostic sink is gone: https://github.com/rust-analyzer/rust-analyzer/pull/9256

matklad (Jun 15 2021 at 14:20, on Zulip):

I've moved diagnostics to a separate crate, I think these all "long overdue" changes I want to make in this area

Last update: Jul 28 2021 at 04:45UTC