Stream: t-compiler/rust-analyzer

Topic: Desugaring

matklad (Nov 16 2019 at 14:09, on Zulip):

Stream for

matklad (Nov 16 2019 at 14:10, on Zulip):

@cjgillot yep, what you've written in the issue is correct!

matklad (Nov 16 2019 at 14:11, on Zulip):

One thing to add though, is that there are two paths for CST -> AST (or ast -> hir, if using, rust-analyzer terminlogy (which is confusing :( ))

matklad (Nov 16 2019 at 14:12, on Zulip):

The first path is when we do lowering: we start with function/static/const, and recursively lower every expression

matklad (Nov 16 2019 at 14:13, on Zulip):

There's a second path thogh: SourceAnalyzer::expr_id. This is invoked by an IDE. IDE has a CST for a specific expression, and it wants to get AST for this expression (as opposed to AST for the whole function)

matklad (Nov 16 2019 at 14:13, on Zulip):

That's why, when lowring, we maintain both forward and backward maps

matklad (Nov 16 2019 at 14:14, on Zulip):

And the problem is indeed that these maps miss some entries due to the way desugaring work, and in general feel imprecise.

I don't know, maybe the solution here is just to refactor these bits and put behind nicer interfaces (like removing Either's), but maybe it's something bigger

cjgillot (Nov 16 2019 at 14:21, on Zulip):

Ok, I start to grasp a bit the issue. A first way forward would be to have all the lowering in one place. Then, we will have to ensure through typing that all the hir nodes have an ast counterpart. These two may lead to a significant step forward.

matklad (Nov 16 2019 at 14:24, on Zulip):

Looks good to me!

I think we might also want to define some type which describes the lowring. Like, ast <-> hir map should be 1-to-many, so that we can say "this if let expression got lowered to this bunch of expressions".

matklad (Nov 16 2019 at 14:29, on Zulip):

Or, and another complication is that there may be different types on the ast side (that's where the Eithers come from)

cjgillot (Nov 18 2019 at 21:38, on Zulip):

Hi @matklad. I have an issue about the missing AST nodes. When there are missing constructors AST blocks, we have to invent a Missing HIR node. That node can not be mapped to an AST node. How should these be taken into account?

matklad (Nov 19 2019 at 07:54, on Zulip):

Good question!

I think it makes sense that this node does not have a corresponding mapping back to the source. If you want to report an error for an expression, but the expr is a synthesized Missing, you actually shouldn't be able to report an error.

matklad (Nov 19 2019 at 07:55, on Zulip):

That seems like the signature should be like -> Result<SyntaxNode, ExprMissingError>, and we might, or might not want to try to express this in the type system

matklad (Nov 19 2019 at 07:56, on Zulip):

(this: the fact that only non-missing expressions have sources)

cjgillot (Nov 22 2019 at 17:43, on Zulip):

I have an issue with the synthetic HIR nodes that come from desugaring. A possibility is to attach those to the containing expression. This would cause some hir::Expr nodes to back-reference to a ast::Pat node and conversely. Do we really need to preserve the typing in this direction?

matklad (Nov 22 2019 at 17:49, on Zulip):

@cjgillot could you give an example?

cjgillot (Nov 22 2019 at 18:16, on Zulip):

The desugaring of WhileLet creates 2 desugared expressions, a break and a match, as well as a pattern _.

cjgillot (Nov 22 2019 at 18:18, on Zulip):

The only AST node to which back-map the pattern is the WhileLet node, but it is an expression node, not a pattern node.

matklad (Nov 22 2019 at 18:23, on Zulip):

Aha! I guess, if we want to go fully strongly typed, we want to map it to

enum PatternSugar {
cjgillot (Nov 22 2019 at 18:29, on Zulip):

I was trying to get rid of the typing to make lowering easier, but this was should be better for error reporting.

cjgillot (Nov 24 2019 at 13:41, on Zulip):

Hi. I have some doubt about the handling of missing expression nodes. At the moment, new ExprIds and PatIds are allocated each time, with no associated information. What about changing the ExprId in the HIR to Option<ExprId> and handle missing nodes using None value? This would allow to completely remove that special case from lowering and have complete HIR->AST mapping. Are there downstream constraints against that?

matklad (Nov 24 2019 at 13:46, on Zulip):

Yep, this would unfortunatelly complicate furter analisys. Like, suppose we have something like if [missing] { then } else { otherwise }.

If the lowring produces Expr::Missing for [missing], the typechecker assigns a fresh variable to this expr, it's inferred to bool, and everything typechecks, without any explicit handling. At the same time, the error is reported at the lowering phase.

If, OTOH, we use an Option<Expr> representation, than every bit of the typechecker has to care about this

cjgillot (Nov 24 2019 at 13:53, on Zulip):

Ok. Do you have a pointer to a code block which would be strongly impacted? There may be some middle ground to reach.

matklad (Nov 24 2019 at 13:56, on Zulip):

@cjgillot that would be InferenceContext::infer_expr_inner

matklad (Nov 24 2019 at 13:58, on Zulip):

In general, it seems like upper layers would prefer Option<ExprId> repr, while lower layers benefit from Expr::Missing repr.

Ideally, the desugaring infra should be the place which cleanly separates both worlds.

cjgillot (Nov 24 2019 at 14:01, on Zulip):

In that case, infer_expr would be responsible for unwrapping the Option<ExprId>, and infer_expr_inner would barely change.

matklad (Nov 24 2019 at 14:05, on Zulip):

There would be at least some subtelities. For example, optional expressions, like the else branch of if, would become an awkward Option<Option<ExprId>>.

matklad (Nov 24 2019 at 14:06, on Zulip):

Though that can fixed by using a dedicated enum instead of inner options

matklad (Nov 24 2019 at 14:07, on Zulip):

And I think that approach has some merit: in particular, it would be impossible to report errors for missing expressions...

matklad (Nov 24 2019 at 14:07, on Zulip):


matklad (Nov 24 2019 at 14:07, on Zulip):

lark used an interesting type for similar purposes: Result<T, ErrorReported>

Last update: Jul 24 2021 at 20:30UTC