Stream: t-compiler/rust-analyzer

Topic: SyntaxRewriter is accidentally linear


matklad (Dec 11 2020 at 17:09, on Zulip):

@Lukas Wirth this is a fun one (and utterly embarrassing, to be honest).

So, I've recently fixed O(N^2) behavior in SyntaxRewriter... But it is still kinda slow. I think the reason for this slowness is that it rebuilds the whole tree (we always allocate a new GreenNode, we don't try to re-use the old one). Aaaaand, even if we fix it, SyntaxRewriter will still be linear in the size of the file -- it traverses each and every node.

This actually defeats the purpose of our elaborate syntax trees, whose functional design is geared towards cheap O(height) modification. And the original C#'s SyntaxRewriter has the same issue. I guess we need to find a new abstraction for this...

Lukas Wirth (Dec 11 2020 at 17:14, on Zulip):

Ye, I noticed that we clone the entire tree as well(I think I wrote that down in the issues we were talking about the slowness as well somewhere).
I also looked at how roslyn does it but I couldn't really get behind their node tracking stuff they are using, but I guess that doesn't change the complexity either then for them.

matklad (Dec 11 2020 at 17:16, on Zulip):

yeah, I've noticied this just today....

matklad (Dec 11 2020 at 17:16, on Zulip):

otoh, I am now inspired to maybe make the tree mutable&clonnable....

Lukas Wirth (Dec 11 2020 at 17:19, on Zulip):

Won't that bring its fair share of problems as well though?

matklad (Dec 11 2020 at 17:20, on Zulip):

Most definitely

matklad (Dec 11 2020 at 17:20, on Zulip):

But I want to try this angle as well...

matklad (Dec 11 2020 at 17:25, on Zulip):

Yeah, just optimizing allocs doesn't help

Lukas Wirth (Dec 11 2020 at 17:26, on Zulip):

One Idea I tried out with SyntaxRewriter was to track what branches in a tree we only have to edit. As in whenever we add a new insert/replace/deletion to the Rewriter track down all the ancestors of the affected node so that we could skip unaffected branches entirely when rewriting.

Lukas Wirth (Dec 11 2020 at 17:26, on Zulip):

But that didn't work out since some nodes where disjoint from another that were rewritten or something along those lines iirc

Lukas Wirth (Dec 11 2020 at 17:28, on Zulip):

I'm unsure whether this idea could work out in the end as I might be missing something crucial that prevents that from working entirely

Lukas Wirth (Dec 11 2020 at 18:40, on Zulip):

Right I remember why that didn't work, since the root of the node thats passed into rewriting isnt necessarily a root of the edits

matklad (Dec 12 2020 at 15:36, on Zulip):

Ok, I am officially excited. I think we can have mutable immutable syntax trees :D

matklad (Dec 12 2020 at 15:36, on Zulip):

Will try to write an issue which describes high-level design later today

matklad (Dec 12 2020 at 15:38, on Zulip):

Couple of core insights:

matklad (Dec 12 2020 at 19:31, on Zulip):

Will try to write an issue which describes high-level design later today

Argh, I had a beautiful writeup in the unposted issue, and the browser ate the contents of the text field. Never write longer text in a browser

matklad (Dec 13 2020 at 14:19, on Zulip):

Ok, here it is: https://github.com/rust-analyzer/rust-analyzer/issues/6857

cc @Christopher Durham

Joshua Nelson (Dec 13 2020 at 14:20, on Zulip):

Mad Scientist Mode: ON

:joy:

matklad (Dec 13 2020 at 14:23, on Zulip):

@Joshua Nelson updated issue description

Joshua Nelson (Dec 13 2020 at 14:23, on Zulip):

Incredible

Joshua Nelson (Dec 13 2020 at 14:24, on Zulip):

How I Learned to Stop Worrying and Love Mutation

Lukas Wirth (Dec 13 2020 at 14:26, on Zulip):

So this is what you meant with mutable immutable :big_smile:

matklad (Dec 13 2020 at 14:28, on Zulip):

How I Learned to Stop Worrying and Love Mutation

OMG, we should use this instead of "empowering everyone to build fast and reliable software"

Joshua Nelson (Dec 13 2020 at 14:30, on Zulip):

@Steve Klabnik we need our best people on this

Steve Klabnik (Dec 13 2020 at 14:31, on Zulip):

Hehehe

Christopher Durham (Dec 13 2020 at 20:11, on Zulip):

This is definitely something right up my alley and something I'd love to do, but: legal ownership of code I write right now is murky at best, and I don't want to saddle r-a with that. I'm planning to try to get an exception for Rust OSS contributions, but grad school is on break until January and I don't want to bother them on break.

Christopher Durham (Dec 13 2020 at 20:15, on Zulip):

(more context on my GH profile readme if you want a bit more)

Poliorcetics (Dec 13 2020 at 22:58, on Zulip):

Wow your university is either evil or has been burned badly in the past to have such a strict requirement

Christopher Durham (Dec 13 2020 at 23:13, on Zulip):

@Poliorcetics mostly it's me being overly cautious (informal policy is that stuff done on your own time with your own resources is still yours, that's just not what the actual IP Agreement doc says). The reasoning they gave for it is that it simplifies knowing who actually owns what you make, and thus transferring full commercial rights to it to you if you ask for it.

I still think there could be a better middle ground, but the "all IP while you're employed [or enrolled] is the company's [or university's]" is common in the games industry :upside_down:

Christopher Durham (Dec 13 2020 at 23:14, on Zulip):

Informally I'm sure they'd be willing to give me a written doc exception for something clearly divorced from what we're doing there, it's just I've not gone through the effort yet, and I don't want to dump that question mark on someone else's project.

Christopher Durham (Dec 13 2020 at 23:19, on Zulip):

Because I don't want to speak ill of them without some context:
Actual legal clause: https://twitter.com/CAD97_/status/1296118208034996224?s=19
Later notes: https://twitter.com/CAD97_/status/1322633440815185921?s=19
Of course IANAL, I'm just overly cautious when other people's projects are on the line.

Christopher Durham (Dec 13 2020 at 23:19, on Zulip):

(ok not going to rant here anymore)

matklad (Dec 14 2020 at 09:41, on Zulip):

@Christopher Durham no worries, I am tagging you just so you are aware of the work in the area, as I expect you'll be interested in knowning what's going on :D

matklad (Dec 14 2020 at 16:31, on Zulip):

:tada: smoke test passes in a toy impl!

matklad (Dec 14 2020 at 16:31, on Zulip):

https://github.com/rust-analyzer/mini-rowan/blob/842cf1d8c7ebcc1ebed65f03b78402df3804b921/tests/it.rs#L31-L42

matklad (Dec 14 2020 at 16:33, on Zulip):

Naming Q: how would you name a method of a tree node, which removes the node from the tree? n.name_me()?

After this call, the original tree doesn't have n, and n is a root of a new tree

matklad (Dec 14 2020 at 16:33, on Zulip):

Options: remove, delete, detach, unlink

Poliorcetics (Dec 14 2020 at 16:34, on Zulip):

Detach or unlink

Poliorcetics (Dec 14 2020 at 16:34, on Zulip):

Or more verbose: to_new_root

Florian Diebold (Dec 14 2020 at 16:35, on Zulip):

I like detach

Lukas Wirth (Dec 14 2020 at 16:35, on Zulip):

^

matklad (Dec 15 2020 at 10:53, on Zulip):

Ok, "move bound to where clause" test now works: https://github.com/rust-analyzer/mini-rowan/blob/4392883f3ef67f3cf5af8334307e067c356f47f0/tests/it.rs#L32-L49

matklad (Dec 15 2020 at 10:55, on Zulip):

I'll try to extend the prototype to also track text ranges of nodes. I also want to explore whether we can retain fn token_text(&self) -> &self signature

matklad (Dec 15 2020 at 10:55, on Zulip):

might not be possible with mutable nodes :sad:

matklad (Dec 16 2020 at 13:51, on Zulip):

Pushed update with a fuller API with node/token split and text_size

matklad (Jan 12 2021 at 16:38, on Zulip):

Current status (aka a hundred segfaults later):

image.png

image.png

matklad (Jan 12 2021 at 16:38, on Zulip):

the stuff is horrifyingly unsafe inside :(

matklad (Jan 12 2021 at 16:51, on Zulip):

Benches also regress a bit, but I believe that is livable (and there are a couple optimizations we could employ to get some perf back)

matklad (Jan 12 2021 at 16:52, on Zulip):
Database loaded:     399.46ms, 287minstr, 82mb
  crates: 36, mods: 583, decls: 11311, fns: 8839
Item Collection:     8.58s, 84ginstr, 463mb
  exprs: 243254, ??ty: 2656 (1%), ?ty: 2074 (0%), !ty: 890
Inference:           16.21s, 135ginstr, 637mb
Total:               24.79s, 220ginstr, 1101mb


wip
Database loaded:     394.35ms, 287minstr, 82mb
  crates: 36, mods: 583, decls: 11311, fns: 8839
Item Collection:     10.69s, 109ginstr, 524mb
  exprs: 243254, ??ty: 2656 (1%), ?ty: 2074 (0%), !ty: 890
Inference:           17.13s, 150ginstr, 665mb
Total:               27.82s, 259ginstr, 1190mb
Laurențiu (Jan 12 2021 at 18:03, on Zulip):

Did it get slower because of the struct size increase?

matklad (Jan 12 2021 at 18:04, on Zulip):

I think that would be part of that

matklad (Jan 12 2021 at 18:05, on Zulip):

another part is that we now iterate calling next_sibling repeatedly, rather than itering a slice of green nodes

matklad (Jan 12 2021 at 18:06, on Zulip):

(this is the bit I believe we can speed up quite a bit)

matklad (Jan 12 2021 at 18:07, on Zulip):

And there's a bit of Cellgets/sets on the immutable path, they migh cause issues

Laurențiu (Jan 12 2021 at 18:08, on Zulip):

// The following links only have meaning when state is Mut``

Are they also used/set/kept track of when it's Imm?

Laurențiu (Jan 12 2021 at 18:12, on Zulip):

I mean, I guess they can't be moved to Imm

Laurențiu (Jan 12 2021 at 18:12, on Zulip):

(triomphe#hash isn't published yet)

matklad (Jan 13 2021 at 10:31, on Zulip):

triomphe is published I think?

Laurențiu (Jan 13 2021 at 11:05, on Zulip):

No, see https://github.com/rust-analyzer/rowan/blob/clone-for-update/Cargo.toml#L14

Laurențiu (Jan 13 2021 at 11:05, on Zulip):

The hash branch isn't merged

Laurențiu (Jan 13 2021 at 11:08, on Zulip):

Ah, Cargo.toml needs to point to crates.io

matklad (Jan 27 2021 at 15:42, on Zulip):

Current status: leaking so much memory that it overflows allocator's counter: Item Collection: 8.06s, 83ginstr, -367mb

matklad (Feb 04 2021 at 16:58, on Zulip):

The API is coming along nicely: https://github.com/rust-analyzer/rust-analyzer/pull/7498/files#diff-92a680dd91492725f6dc97832ef92847fb2027669f6b9a56e2c724c3e8f4df9cR44-R71

matklad (Feb 04 2021 at 16:58, on Zulip):

(although I did spend half of the day hunting for missing refcount increment)

Last update: Jul 26 2021 at 13:30UTC