Stream: t-compiler/wg-rls-2.0

Topic: RLS 1.0 -- RLS 2.0 bridge


matklad (Apr 17 2019 at 16:12, on Zulip):

ping @Igor Matuszewski

matklad (Apr 17 2019 at 16:19, on Zulip):

So, I think that we can make a relatively easy bridge between current architecture of RLS and RLS2

matklad (Apr 17 2019 at 16:19, on Zulip):

The problem is that it'll work only for current arch which works via save-analysis

matklad (Apr 17 2019 at 16:20, on Zulip):

And it wouldn't be transferable to querying compiler's front-end bits

matklad (Apr 17 2019 at 16:20, on Zulip):

On the high-level, current RLS does two things:

matklad (Apr 17 2019 at 16:21, on Zulip):

The reason why we can't just plug it into rust-analyzer is that RLS uses #[rustc_private], and I believe that RA should be built on stable

Igor Matuszewski (Apr 17 2019 at 16:22, on Zulip):

so we might use something out of process to generate JSON blobs

Igor Matuszewski (Apr 17 2019 at 16:22, on Zulip):

which we can then load and lower using stable toolchain

Igor Matuszewski (Apr 17 2019 at 16:22, on Zulip):

is that what you're suggesting?

matklad (Apr 17 2019 at 16:22, on Zulip):

However, if rustc is not in process, and we communicate with it via IPC, that could work

matklad (Apr 17 2019 at 16:22, on Zulip):

yeah, precisely

Igor Matuszewski (Apr 17 2019 at 16:22, on Zulip):

so essentially 'stabilize' (as in make it build on stable) rls-data/rls-analysis and work from there

matklad (Apr 17 2019 at 16:23, on Zulip):

sort of

Igor Matuszewski (Apr 17 2019 at 16:23, on Zulip):

there's a small caveat - we use rustc_serialize

matklad (Apr 17 2019 at 16:23, on Zulip):

rustc_serialize is totally usable from stable

Igor Matuszewski (Apr 17 2019 at 16:23, on Zulip):

and rustc_serialize and libserialize on crates.io diverged slightly, especially wrt PathBuf handling

matklad (Apr 17 2019 at 16:23, on Zulip):

or was it completely removed?

Igor Matuszewski (Apr 17 2019 at 16:24, on Zulip):

hm, need to make sure if we use libserialize in librustc_save_analysis

Igor Matuszewski (Apr 17 2019 at 16:24, on Zulip):

ah hm so we're using crates.io rustc-serialize it seems inside the librustc_save_analysis

Igor Matuszewski (Apr 17 2019 at 16:24, on Zulip):

so there's hope IIUC!

matklad (Apr 17 2019 at 16:25, on Zulip):

well, at work, we can read json to string and thrown regexes onto it /s

matklad (Apr 17 2019 at 16:25, on Zulip):

that is, this seems surmountable

Igor Matuszewski (Apr 17 2019 at 16:26, on Zulip):

but fair, assuming we get into a somewhat stable deserializable data format for SA, I believe we're good to go

matklad (Apr 17 2019 at 16:26, on Zulip):

The bigger problem here is that we should make the data format itself stable

matklad (Apr 17 2019 at 16:26, on Zulip):

Like, it should be at least de-facto stable, such that different nightlies work

Igor Matuszewski (Apr 17 2019 at 16:26, on Zulip):

tbf I think we should just version the format with Rust minor edition and make it not backwards-compatible

matklad (Apr 17 2019 at 16:26, on Zulip):

(we can future proof here by making everything optional, etc, protobuf style)

Igor Matuszewski (Apr 17 2019 at 16:26, on Zulip):

hm

matklad (Apr 17 2019 at 16:27, on Zulip):

and yeah, version numbers totally would help here

matklad (Apr 17 2019 at 16:27, on Zulip):

However, I am not sure if we want to make it formally stable

matklad (Apr 17 2019 at 16:28, on Zulip):

Like, if people depend on it, we'll have to maintain it forever, etc

Igor Matuszewski (Apr 17 2019 at 16:28, on Zulip):

no I don't think so

matklad (Apr 17 2019 at 16:28, on Zulip):

I think it's ok to make semver-exeption in this case and require some awful flags/env-vars to activate this, a-la RUSTC_BOOTSTRAP

Igor Matuszewski (Apr 17 2019 at 16:28, on Zulip):

if anything, something more akin to LSIF should be 'stable'

Igor Matuszewski (Apr 17 2019 at 16:29, on Zulip):

then there's the question of memory usage

Igor Matuszewski (Apr 17 2019 at 16:29, on Zulip):

IIRC you're fighting with it

Igor Matuszewski (Apr 17 2019 at 16:29, on Zulip):

and... it's not going to get better if we decide to plug in SA on top of RLS 2.0

matklad (Apr 17 2019 at 16:29, on Zulip):

yeah, that's my point: we do want some form of save-analysis long-term: it'll be hugely useful for non-interactive usages, but we don't want current form

matklad (Apr 17 2019 at 16:30, on Zulip):

Is SA that big?

matklad (Apr 17 2019 at 16:30, on Zulip):

I feel that by implementing SourceManager, we should be able to drastically cut the memory usage of analyzer

Igor Matuszewski (Apr 17 2019 at 16:30, on Zulip):

well for one I believe rustc_serialization takes ~35 megs of json blob for primary package jsons

Igor Matuszewski (Apr 17 2019 at 16:31, on Zulip):

which is transformed into 6-10 megs worth of hash map data - can't recall if that's only for primary crates or for the entirety of rls-analysis

matklad (Apr 17 2019 at 16:31, on Zulip):

and I feel like we could have a flag for enabling save-analysis, on by default. SO you'll get good experience for small projects, but, for bigger proejcts and smaller ram, you can live only with analyzer parts

Igor Matuszewski (Apr 17 2019 at 16:31, on Zulip):

this is just order of magnitude measure, would have to measure precisely again what's done what

matklad (Apr 17 2019 at 16:32, on Zulip):

heh, 35megs is peanuts, in ra we have GIGs of trees ;(

Igor Matuszewski (Apr 17 2019 at 16:32, on Zulip):

:point_up: I think that would be ideal

Igor Matuszewski (Apr 17 2019 at 16:32, on Zulip):

damn that's... a lot

Igor Matuszewski (Apr 17 2019 at 16:32, on Zulip):

wouldn't have expected that much

matklad (Apr 17 2019 at 16:32, on Zulip):
workspaces:
411 packages loaded

analysis:
28122 (106mb) files
131239 (11mb) symbols
1175 trees, 80 (43mb) retained


memory:
342mb allocated 1499mb resident
gc 852 seconds ago
Igor Matuszewski (Apr 17 2019 at 16:32, on Zulip):

is it due to expansion? do we hold separate instance per expansion pass for macros?

matklad (Apr 17 2019 at 16:32, on Zulip):

well, actually it did get better after gc and rowan work

matklad (Apr 17 2019 at 16:32, on Zulip):

this is from rustc

matklad (Apr 17 2019 at 16:33, on Zulip):

is it due to expansion?

I think this is mostly due to poor GC for syntax trees. We never need to keep many syntax trees in memory simultaneously, but currently we do ;(

matklad (Apr 17 2019 at 16:34, on Zulip):

And yeah, limiting memory usage is an absolute blocker for any real release of ra

matklad (Apr 17 2019 at 16:34, on Zulip):

So, I think we can deal with quirky JSON and megabyes of data

matklad (Apr 17 2019 at 16:35, on Zulip):

The stability issue is what worries me most

matklad (Apr 17 2019 at 16:35, on Zulip):

Do you think it would be feasible to make save-analysis data "stable, but only for RLS"? struct rustc { friend class RLS; }; basically

matklad (Apr 17 2019 at 16:41, on Zulip):

The same "stability loophole" would be required for IPC VFS bit as well.

matklad (Apr 17 2019 at 16:42, on Zulip):

Hm, otoh, for the poc, we can totally restrict ourselves to nightly, to just test out if it work

matklad (Apr 17 2019 at 16:43, on Zulip):

that is, save-analysis bits would be available only on nightly rustc, and we'll use -Z flags to access JSON & VFS

Igor Matuszewski (Apr 17 2019 at 16:48, on Zulip):

@matklad that's how we currently work around this in stable toolchain

Igor Matuszewski (Apr 17 2019 at 16:49, on Zulip):

we have a rls-rustc shim which is able to run on stable but can emit save-analysis

Igor Matuszewski (Apr 17 2019 at 16:49, on Zulip):

as in, it doesn't require -Z flag

Igor Matuszewski (Apr 17 2019 at 16:49, on Zulip):

for one I'd like to stop using rustc_serialize for save-analysis as a cleanup

Igor Matuszewski (Apr 17 2019 at 16:50, on Zulip):

IIRC there was a problem with serde being in the sysroot which cause some problems, don't know the details

Igor Matuszewski (Apr 17 2019 at 16:50, on Zulip):

https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/serde.20in.20the.20compiler

matklad (Apr 17 2019 at 16:51, on Zulip):

yeah, purging rustc_serialize from rustc would be awesome

Igor Matuszewski (Apr 17 2019 at 16:51, on Zulip):

If we have an easily serde-deserializable data we can then toy with making it work on stable and so forth

Igor Matuszewski (Apr 17 2019 at 16:51, on Zulip):

I don't think that's feasible since there are some edge cases (metadata serialization) where it'd require more work to make serde support a very niche case

Igor Matuszewski (Apr 17 2019 at 16:52, on Zulip):

but at least for JSON (de)serialization we should definitely switch to serde

matklad (Apr 17 2019 at 16:53, on Zulip):

I guess pigging-in serde should help with perf in that we could switch from JSON to bincode

matklad (Apr 17 2019 at 16:53, on Zulip):

to save memory

matklad (Apr 17 2019 at 16:54, on Zulip):

might be useful for "stability" as well: it's much easier to use JSON accidentally, than with some binary blob of whatever

Igor Matuszewski (Apr 17 2019 at 16:55, on Zulip):

using binary format is also high on the list of things to experiment with for RLS 1.0 so that'd be good as well

matklad (Apr 17 2019 at 16:56, on Zulip):

I'd like to take a look at the higher-level picture here

matklad (Apr 17 2019 at 16:56, on Zulip):

moving rustc out of RLS process does seem like a step backwards

matklad (Apr 17 2019 at 16:57, on Zulip):

However it probably isn't: save-analysis just looks like a less than ideal interface between compiler and IDE, and hiding it behind a query, as far as I understand, won't change the picture drastically

Igor Matuszewski (Apr 17 2019 at 16:58, on Zulip):

yes, we talked about this with Niko as well

Igor Matuszewski (Apr 17 2019 at 16:58, on Zulip):

I believe that we should pursue this as a next step, at least for the 1.0 version

matklad (Apr 17 2019 at 16:59, on Zulip):

This = ?

Igor Matuszewski (Apr 17 2019 at 16:59, on Zulip):

moving rustc out of RLS process

Igor Matuszewski (Apr 17 2019 at 17:00, on Zulip):

and just preparing to serve save-analysis or some data back from this analysis rustc invocation

matklad (Apr 17 2019 at 17:01, on Zulip):

so, it seems like we should just do it?

matklad (Apr 17 2019 at 17:01, on Zulip):

1 & 2 seems like no-brainer wins

matklad (Apr 17 2019 at 17:03, on Zulip):

one bit I am worried about is perf... But looks like IPC overhead is just negligible in comparison to actual data processing?

Igor Matuszewski (Apr 17 2019 at 17:03, on Zulip):

again, we can't really kill rustc_serialize as it will be used in rustc for metadata serialization from what I know

matklad (Apr 17 2019 at 17:03, on Zulip):

actually, I was pleasantly surprised by just how fast cargo-watch in ra turns out in practice, especially after @_Edwin Cheng 's refactoring

Igor Matuszewski (Apr 17 2019 at 17:04, on Zulip):

but it'd be nice to just push it into a corner and use it only for the niche case needed by rustc

matklad (Apr 17 2019 at 17:04, on Zulip):

Hm, curious, why it is needed for metadata serialization?

matklad (Apr 17 2019 at 17:04, on Zulip):

the format is unstable, so we can change it to something more easily serdable?

Igor Matuszewski (Apr 17 2019 at 17:04, on Zulip):

not really sure, I'm sure eddy knows more details

Igor Matuszewski (Apr 17 2019 at 17:05, on Zulip):

but it came up once or twice

Igor Matuszewski (Apr 17 2019 at 17:05, on Zulip):

seemed like it'd be more work to shoehorn serde's design to solve rustc's problem (which might've changed now? this discussion was surely more than half a year ago)

Igor Matuszewski (Apr 17 2019 at 17:05, on Zulip):

but definitely, I think we should converge somehow with this :thumbs_up:

matklad (Apr 17 2019 at 17:06, on Zulip):

I guess, we can wrap up the discussion?

Igor Matuszewski (Apr 17 2019 at 17:07, on Zulip):

yeah

Igor Matuszewski (Apr 17 2019 at 17:07, on Zulip):

as you're saying, I believe IPC overhead is negligible

Igor Matuszewski (Apr 17 2019 at 17:07, on Zulip):

I say we do it and measure overhead then, if any

matklad (Apr 17 2019 at 17:07, on Zulip):

sgmt!

Igor Matuszewski (Apr 17 2019 at 17:07, on Zulip):

however I believe we can actually profit off this if we switch to binary serialization

matklad (Apr 17 2019 at 17:09, on Zulip):

yeah, that was my though as well: in general, by teasing apart things and creating boundaries, you can easier optimize stuff!

One hope I have for my lexer work is to (in the far-future) generate the lexer as a giant utf-8-decoding state machine. Should be much easier once lexer is independent of ParseSess and the rest of the jungle :D

Igor Matuszewski (Apr 17 2019 at 17:09, on Zulip):

@matklad I'm really sorry for constantly putting off work for 2.0 - let me work on this first, seeing as there are no people who know their way around save-analysis, short of nrc :smiley:

Igor Matuszewski (Apr 17 2019 at 17:09, on Zulip):

haha, yeah

Igor Matuszewski (Apr 17 2019 at 17:09, on Zulip):

I'm really interested in the lexer-as-a-library, please let me know if I can help

Igor Matuszewski (Apr 17 2019 at 17:10, on Zulip):

I'd really like to start librarifying things and splitting them off or even sharing, soon :tada:

matklad (Apr 17 2019 at 17:10, on Zulip):

yeah, that PR about toolstate is sad :(

I guess I might be able to put more work into the overall tooling after my Rust course ends

matklad (Apr 17 2019 at 17:11, on Zulip):

JFYI, the lexer work is here: https://github.com/rust-lang/rust/pull/59706

Igor Matuszewski (Apr 17 2019 at 17:12, on Zulip):

oh, this is actually closer than I thought, from what I can understand from the comments

Igor Matuszewski (Apr 17 2019 at 17:12, on Zulip):

nice! :tada:

Igor Matuszewski (Apr 17 2019 at 17:17, on Zulip):

Serde in the compiler discussion: https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/serde.20in.20the.20compiler/near/163581456

Jeremy Kolb (Apr 17 2019 at 21:37, on Zulip):

What's the advantage of rust-analyzer using save analysis?

Igor Matuszewski (Apr 17 2019 at 22:20, on Zulip):

The data comes from the compiler so it should be more complete

Igor Matuszewski (Apr 17 2019 at 22:21, on Zulip):

This is nice because in theory it should handle (proc) macro expansions and whatnot

Igor Matuszewski (Apr 17 2019 at 22:22, on Zulip):

and by bridging the gap I think we get a more featureful(?) product and the work can improve the save-analysis backend which is important in itself for potential code indexer usage or in the future to make rustc handle lazy/on-demand work better

Igor Matuszewski (Apr 17 2019 at 22:22, on Zulip):

@matklad https://github.com/rust-lang/rust/pull/60053 it seems that... serde just works? this would be terrific

Nick Lawrence (Apr 17 2019 at 22:32, on Zulip):

for newer rust-folk, whats a more laymans explanation of the benefits?

matklad (Apr 18 2019 at 07:54, on Zulip):

Rust analyzer currently works by re-implementing part of the compiler like parsing and type inference. Save-analysis works by asking, e.g, type inference info from the existing rustc. First approach is faster and scales better (due to careful choice of ide-oriented high-level architecture), but it handles only a fraction of the language now.

Laurențiu Nicola (Apr 18 2019 at 13:17, on Zulip):

so is this like a temporary thing until rls 2.0 can share code with the compiler for macro expansion, name resolution and so on?

matklad (Apr 18 2019 at 13:19, on Zulip):

yeah!

Laurențiu Nicola (Apr 18 2019 at 13:21, on Zulip):

and even if it's able to read the analysis files, rust analyzer doesn't know what to do with them?

Laurențiu Nicola (Apr 18 2019 at 13:21, on Zulip):

right now, at least

Last update: Nov 12 2019 at 15:30UTC