Stream: t-compiler/wg-rls-2.0

Topic: Performance


Ryan Levick (Jun 05 2020 at 16:24, on Zulip):

Hey all! First, thanks for the awesome work! :heart: I'm working on the winrt crate, which provides Windows Runtime bindings (think modern Windows APIs). These APIs are huge, so it's pretty easy for the bindings to reach hundreds of thousands or even millions of lines of code. Needless to say, it's a perf challenge. I'm wondering if it would be helpful to have our crate as a performance measure. You can give it a try with a simple scratch repo we set up: https://github.com/kennykerr/scratch. There shouldn't be anything special needed to run it, just a Windows 10 box.

matklad (Jun 05 2020 at 16:26, on Zulip):

Could you also create a repo with all the generated code? It might be much easier to poke at if a Windows 10 box is not required :)

Ryan Levick (Jun 05 2020 at 16:27, on Zulip):

Sure. You won't be able to run it, but I guess that's not an issue

matklad (Jun 05 2020 at 16:29, on Zulip):

Fun fact: IntelliJ Rust had (and I bet still has) special hacks in resolver to handle winapi crate

matklad (Jun 05 2020 at 16:29, on Zulip):

yup, still there: https://github.com/intellij-rust/intellij-rust/blob/5a3d790d4d981c0f5a03761a340d7add69702e66/src/main/kotlin/org/rust/lang/core/resolve/Processors.kt#L38-L42

Ryan Levick (Jun 05 2020 at 16:36, on Zulip):

@matklad https://github.com/rylev/winrt-generated

Ryan Levick (Jun 05 2020 at 16:39, on Zulip):

Interesting note: the generated bindings seem faster than when going through a different crate via an include! macro

Ryan Levick (Jun 05 2020 at 16:46, on Zulip):

Is there perhaps something when using include! that might be slowing things down?

Ryan Levick (Jun 05 2020 at 17:02, on Zulip):

Moving stuff around in the file seems to cause it to wonk out though...

Ryan Levick (Jun 08 2020 at 14:14, on Zulip):

I played more with the sample repo I posted above today. Besides a long startup time, the performance is OK. It still takes over a second for some types to resolve. I'm also under the impression that performance is different when the code is pre-generated and included explicitly in the project vs. when the code is generated in the build script and read by an include! macro from OUT_DIR. I'm not exactly sure what the next steps should be.
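
For readers unfamiliar with the second setup, here is a minimal sketch of a build script that writes generated code into OUT_DIR, assuming a hypothetical `generate_bindings` helper and a hypothetical `bindings.rs` file name (neither is taken from the winrt crate). The consuming crate then pulls the file in with `include!`, shown in the trailing comment.

```rust
use std::env;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for a bindings generator: writes a small Rust
/// source file into `out_dir` and returns the path of the written file.
fn generate_bindings(out_dir: &Path) -> io::Result<PathBuf> {
    let dest = out_dir.join("bindings.rs");
    let mut src = String::new();
    for i in 0..3 {
        // A real generator would emit far more code than this.
        src.push_str(&format!("pub struct Generated{};\n", i));
    }
    fs::write(&dest, src)?;
    Ok(dest)
}

// Entry point when this file is used as build.rs.
fn main() -> io::Result<()> {
    // Cargo sets OUT_DIR for build scripts; the fallback to the system
    // temp directory only exists so the sketch also runs standalone.
    let out_dir = env::var_os("OUT_DIR")
        .map(PathBuf::from)
        .unwrap_or_else(env::temp_dir);
    let path = generate_bindings(&out_dir)?;
    println!("generated {}", path.display());
    Ok(())
}

// In the consuming crate (e.g. src/lib.rs), the generated file is read with:
// include!(concat!(env!("OUT_DIR"), "/bindings.rs"));
```

The pre-generated variant instead commits the same output as an ordinary module file, so an analyzer sees it as regular source rather than through a macro expansion.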

matklad (Jun 08 2020 at 14:17, on Zulip):

Yeah, I think it's quite probable that include! is a problem here

matklad (Jun 08 2020 at 14:17, on Zulip):

(or at least one of the problems)

matklad (Jun 08 2020 at 14:17, on Zulip):

We fixed an accidentally quadratic behavior around include! a couple of weeks ago, but there's still at least one O(N^2) thing left
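
As a generic illustration of what "accidentally quadratic" tends to look like (this is not rust-analyzer's code, and the function names are hypothetical), consider mapping byte offsets to line numbers. Rescanning the text for every lookup costs O(N) each, so N lookups cost O(N^2); building an index once makes each lookup O(log N).

```rust
/// O(N) per call: rescans the text from the start for every lookup,
/// so calling it once per item in a large file is quadratic overall.
fn line_of_offset_scan(text: &str, offset: usize) -> usize {
    text.as_bytes()[..offset]
        .iter()
        .filter(|&&b| b == b'\n')
        .count()
}

/// Byte offsets at which each line starts, computed in a single pass.
fn line_starts(text: &str) -> Vec<usize> {
    let mut starts = vec![0];
    for (i, b) in text.bytes().enumerate() {
        if b == b'\n' {
            starts.push(i + 1);
        }
    }
    starts
}

/// O(log N) per call: binary search over the precomputed line starts.
fn line_of_offset_indexed(starts: &[usize], offset: usize) -> usize {
    match starts.binary_search(&offset) {
        Ok(line) => line,      // offset is exactly the start of a line
        Err(next) => next - 1, // offset falls inside the previous line
    }
}

fn main() {
    let text = "fn a() {}\nfn b() {}\nfn c() {}\n";
    let starts = line_starts(text);
    // Both strategies agree; only their cost differs as the input grows.
    for &offset in &[0usize, 10, 20] {
        assert_eq!(
            line_of_offset_scan(text, offset),
            line_of_offset_indexed(&starts, offset)
        );
    }
}
```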

Ryan Levick (Jun 08 2020 at 14:19, on Zulip):

Are there tests around this? I need to look into getting rust-analyzer dev set up on my machine

Florian Diebold (Jun 08 2020 at 14:21, on Zulip):

look for infer_builtin_macros_include and include_accidentally_quadratic

Florian Diebold (Jun 08 2020 at 14:21, on Zulip):

the include_accidentally_quadratic test is actually ignored, presumably because it's still pretty slow

matklad (Jun 08 2020 at 14:21, on Zulip):

https://github.com/rust-analyzer/rust-analyzer/blob/5ed9818a7c855bf914e91324e305f24e4e743057/crates/ra_hir_ty/src/tests/macros.rs#L553-L575

Ryan Levick (Jun 08 2020 at 14:23, on Zulip):

I'll see if I can make any dent in that.

matklad (Jun 08 2020 at 14:24, on Zulip):

@Ryan Levick do you know about the RA_PROFILE env var?

matklad (Jun 08 2020 at 14:25, on Zulip):

https://github.com/rust-analyzer/rust-analyzer/tree/master/docs/dev#profiling

matklad (Jun 08 2020 at 14:25, on Zulip):

Might be useful to narrow this down. I haven't looked into this myself yet, but the most probable cause is "something is quadratic".

Ryan Levick (Jun 08 2020 at 15:18, on Zulip):

Sorry, had to step out. No, I didn't know about that (I'm basically completely new to RA dev, having only been a user until now). I'll give it a look

Laurențiu Nicola (Jun 08 2020 at 15:40, on Zulip):

https://github.com/rust-analyzer/rust-analyzer/tree/master/docs/dev is worth reading, but git clone and cargo test will get you pretty far.

std::Veetaha (Jun 10 2020 at 10:35, on Zulip):

@matklad regarding performance (cc https://github.com/rust-analyzer/rust-analyzer/issues/4692#issuecomment-637076470): it's strange that codegen-units=1 doesn't increase the dist run time. If so, I wonder if it really does give a performance increase.

matklad (Jun 10 2020 at 10:41, on Zulip):

That's compile-time performance, right? In general, I wouldn't tweak anything here -- at the current level of internal optimization of RA, fine-grained tweaking of compiler optimization flags should be irrelevant -- those should come after the code base itself is optimized.

I wonder if incremental=true negates codegen units?

std::Veetaha (Jun 10 2020 at 10:43, on Zulip):

incremental=true did this until the Rust 1.44 release

std::Veetaha (Jun 10 2020 at 10:45, on Zulip):

Setting codegen-units=1 accepts single-threaded code generation in exchange for potentially more optimizations rustc can do.
I also wonder why we have incremental enabled for dist builds...

Laurențiu Nicola (Jun 10 2020 at 10:45, on Zulip):

We don't?

Laurențiu Nicola (Jun 10 2020 at 10:46, on Zulip):

There's CARGO_INCREMENTAL: 0 in an env section at the top of release.yaml

std::Veetaha (Jun 10 2020 at 10:46, on Zulip):

I mean there is incremental=true in the release section of Cargo.toml

Laurențiu Nicola (Jun 10 2020 at 10:47, on Zulip):

Because matklad only runs release builds locally?

std::Veetaha (Jun 10 2020 at 10:48, on Zulip):

Don't we have debug=0 in [dev] profile for this purpose?

matklad (Jun 10 2020 at 10:49, on Zulip):

debug=0 is to speed up local debug builds (debug info is huge). incremental=true is to speed up local release builds

std::Veetaha (Jun 10 2020 at 10:51, on Zulip):

I think CARGO_INCREMENTAL=0 in release.yaml doesn't override what is specified in Cargo.toml

std::Veetaha (Jun 10 2020 at 10:51, on Zulip):

Or does it?

Laurențiu Nicola (Jun 10 2020 at 10:55, on Zulip):

It should

std::Veetaha (Jun 10 2020 at 10:56, on Zulip):

Ah, yes, the docs say so...
I wonder why I thought it didn't

Laurențiu Nicola (Jun 10 2020 at 11:21, on Zulip):

A codegen-units=1 incremental=false build takes 14 minutes on my laptop

Laurențiu Nicola (Jun 10 2020 at 11:29, on Zulip):

And analysis takes 57.7s vs. 59.8s

Jeremy Kolb (Jun 12 2020 at 12:55, on Zulip):

I just wiped my target directory because I was running out of disk space. I now have an extra 50GB I didn't have before.

Last update: Sep 27 2020 at 13:15 UTC