Stream: t-compiler/wg-rls-2.0

Topic: Crazy optimization idea -- skipping the parse tree altogethe


matklad (Jul 17 2020 at 13:15, on Zulip):

We are at the point where parsing takes substantional amount of time. I think the main thing we need a parse tree for is the item tree.

I wonder if we should think about producing it without parse tree at all... It seems approximately feasible to use evented parsing for this purpose, instead of a true syntax tree

Laurențiu Nicola (Jul 17 2020 at 13:31, on Zulip):

Can the MBE stuff be disabled? I wonder how much faster parsing would get.

matklad (Jul 17 2020 at 13:32, on Zulip):

I think only by commenting it out

Laurențiu Nicola (Jul 17 2020 at 13:33, on Zulip):

But where? :D

matklad (Jul 17 2020 at 13:33, on Zulip):

That I don't know without looking :)

Laurențiu Nicola (Jul 17 2020 at 13:36, on Zulip):

Yeah, it's 10s vs. 18s, so not that much of a win

Laurențiu Nicola (Jul 17 2020 at 13:38, on Zulip):

OTOH, we don't have that many macros (besides expect! and match_ast!), so it's still a considerable speedup

Laurențiu Nicola (Jul 17 2020 at 13:39, on Zulip):

(I stubbed out mbe_expander::expand_rules)

matklad (Jul 17 2020 at 13:43, on Zulip):

hm?

matklad (Jul 17 2020 at 13:43, on Zulip):

It's almost twice as fast?

Laurențiu Nicola (Jul 17 2020 at 13:43, on Zulip):

Yeah, something like that

matklad (Jul 17 2020 at 13:44, on Zulip):

that feels like very much of a win :D

Laurențiu Nicola (Jul 17 2020 at 13:44, on Zulip):

Meh, not even an order of magnitude :-)

Laurențiu Nicola (Jul 17 2020 at 13:45, on Zulip):

On one hand, according to the argument you often make, skipping bodies would make us less likely to optimize the parser itself

Laurențiu Nicola (Jul 17 2020 at 13:45, on Zulip):

On the other hand, analysis-stats still has to parse the bodies, so it's not like we'd no longer know how slow it is

Laurențiu Nicola (Jul 17 2020 at 13:46, on Zulip):

On the third hand, I think you wanted diagnostics support for files that aren't open, so we'd still need the bodies for that

matklad (Jul 17 2020 at 13:46, on Zulip):

THe number of files in the current crate is far fewer than the total number of files

matklad (Jul 17 2020 at 13:47, on Zulip):

Like, bodies in libstd are completely uninteresting

Laurențiu Nicola (Jul 17 2020 at 13:47, on Zulip):

Oh, right

Laurențiu Nicola (Jul 17 2020 at 13:59, on Zulip):

Filed https://github.com/rust-analyzer/rust-analyzer/issues/5426 for a small thing, if it makes sense

Jonas Schievink (Jul 17 2020 at 14:05, on Zulip):

Does the parser have an event-based interface already or does it construct the syntax tree directly?

matklad (Jul 17 2020 at 14:06, on Zulip):

It does have an event-based interface

matklad (Jul 17 2020 at 14:06, on Zulip):

(see TreeSink trait)

Christopher Durham (Jul 17 2020 at 20:14, on Zulip):

I'm doing planning for specific support for delayed parsing in rowan/sorbus, so here's a quick question:

Does the syntax tree need to hold on to the string contents of the skipped body (to facilitate parsing it later, on-demand), or is sticking a thunk of "skipped N bytes" enough?

Christopher Durham (Jul 17 2020 at 20:16, on Zulip):

I'd love (at a mem-usage level, less so the implementation level :stuck_out_tongue:) to be able to skip storing these huge unparsed strings, if we can always just refer back to a "whole file" view to do the full parse later

Christopher Durham (Jul 17 2020 at 20:16, on Zulip):

Or if there are situations where we know with high probability that we won't need to parse the item bodies later

Christopher Durham (Jul 17 2020 at 21:03, on Zulip):

(actually, no, the implementation of text-free thunks was a lot simpler than I thought it was going to be.)

Christopher Durham (Jul 17 2020 at 21:04, on Zulip):

It does have cost, though, so I definitely don't want to merge it unless r-a has a chance of using it.

matklad (Jul 17 2020 at 21:05, on Zulip):

I think "tree is a value" is a very valuable design choice.

matklad (Jul 17 2020 at 21:06, on Zulip):

So, I'd say sorbus should store just Box<FnOnce() -> GreenNode> as a thunk

matklad (Jul 17 2020 at 21:06, on Zulip):

THat way, client can store text, store tokens, remmeber the context like edition, etc

Christopher Durham (Jul 17 2020 at 21:13, on Zulip):

That does seem optimal, it's just that sticking a Box<dyn FnOnce> into the tree is difficult since its not erasable, and then synchronization issues as well

Christopher Durham (Jul 17 2020 at 21:15, on Zulip):

As far as I can tell so far it's not possible for sorbus (as currently architectured) to do an ArcSwap-like trick to mostly transparently swap out a thunk for a real value

Why? Because we're doing pointer un/erasure (on unsized types) and pointer mangling, which is definitely _not_ synchronized

Christopher Durham (Jul 17 2020 at 21:17, on Zulip):

We can just stick in an extra layer of indirection, but Box<Box<dyn Fn>> feels wrong for a reasonable reason

Last update: Sep 27 2020 at 14:45UTC