Stream: t-compiler

Topic: No compression of bytecode?


nnethercote (Nov 20 2019 at 21:52, on Zulip):

@Alex Crichton A while back I think you said you could be ok with bytecode not being compressed? I've been experimenting with that, it reduces compile time by up to 5% on debug builds.

nnethercote (Nov 20 2019 at 21:52, on Zulip):

I need to measure the impact on the .rlib file sizes.

Alex Crichton (Nov 20 2019 at 22:02, on Zulip):

Nice!

Alex Crichton (Nov 20 2019 at 22:02, on Zulip):

I'd personally be totally fine with leaving the bytecode uncompressed

Alex Crichton (Nov 20 2019 at 22:02, on Zulip):

and/or investigating faster compression algorithms

Alex Crichton (Nov 20 2019 at 22:02, on Zulip):

something like a lower compression level or maybe even zstd might get 90% of the wins

Alex Crichton (Nov 20 2019 at 22:03, on Zulip):

I'd be fine removing the compression entirely myself

simulacrum (Nov 20 2019 at 22:06, on Zulip):

it is worth noting that a common complaint is that Rust produces enormous target directories

simulacrum (Nov 20 2019 at 22:07, on Zulip):

Though I personally think the solution there is not compression but rather storing different data (and/or "intelligent" compression) -- I am also unopposed to not compressing

simulacrum (Nov 20 2019 at 22:07, on Zulip):

it may also be plausible that we can easily compress on a separate thread?
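
A minimal sketch of what "compress on a separate thread" could look like, using only the Rust standard library. The `compress` function here is a naive run-length encoder standing in for a real compressor; it is an assumption for illustration and not what rustc does, and the input buffer is a placeholder rather than real bitcode.

```rust
use std::thread;

/// Stand-in compressor (hypothetical): naive run-length encoding into
/// (run length, byte) pairs, with runs capped at 255 bytes.
fn compress(data: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < data.len() {
        let byte = data[i];
        let mut run = 1usize;
        // Extend the run while the next byte matches and the cap is not hit.
        while i + run < data.len() && data[i + run] == byte && run < 255 {
            run += 1;
        }
        out.push(run as u8);
        out.push(byte);
        i += run;
    }
    out
}

fn main() {
    // Placeholder bytes, not real bitcode.
    let bitcode: Vec<u8> = vec![0u8; 1000];

    // Hand the buffer to a worker thread so the main thread is not blocked.
    let worker = thread::spawn(move || compress(&bitcode));

    // The main thread would carry on with other work here.

    // Collect the result only when it is actually needed.
    let compressed = worker.join().expect("compression thread panicked");
    println!("compressed to {} bytes", compressed.len());
}
```

This only moves the cost off the critical path; the total CPU time spent compressing stays the same.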

nnethercote (Nov 20 2019 at 22:39, on Zulip):

We already use "Fast" compression (as opposed to "Best", "Default", or "None")

nnethercote (Nov 20 2019 at 22:47, on Zulip):

@Alex Crichton there are mentions of "bitcode" and "bytecode". Are these the same thing?

nnethercote (Nov 20 2019 at 23:01, on Zulip):

Hmm, a rough comparison on one project suggests that disabling compression increases .rlib file size by typically 10-25%

nnethercote (Nov 20 2019 at 23:02, on Zulip):

Given that they are regularly 1 MiB+, I'm reluctant to cause that much disk usage increase

Alex Crichton (Nov 20 2019 at 23:08, on Zulip):

@nnethercote I think bitcode and bytecode are the same

Alex Crichton (Nov 20 2019 at 23:08, on Zulip):

@nnethercote honestly actually I think there's a better solution here, albeit more involved

Alex Crichton (Nov 20 2019 at 23:08, on Zulip):

the only purpose of bytecode in rlibs is for LTO

Alex Crichton (Nov 20 2019 at 23:09, on Zulip):

but most projects don't use LTO

Alex Crichton (Nov 20 2019 at 23:09, on Zulip):

actually, so here's a real kicker

Alex Crichton (Nov 20 2019 at 23:09, on Zulip):

ok so ignore libstd

Alex Crichton (Nov 20 2019 at 23:09, on Zulip):

if you execute LTO

Alex Crichton (Nov 20 2019 at 23:09, on Zulip):

then you do a huge amount of object file codegen in the middle, none of which is used, all of which is wasted time

nnethercote (Nov 20 2019 at 23:09, on Zulip):

(For this whole project (https://github.com/mozilla/fix-stacks) the rlibs total size goes from 276,561,064 to 347,907,278, a 1.26x increase)
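
The size figures above can be checked with a few lines of Rust. This is only an arithmetic sketch; the function name `growth_ratio` is made up for illustration.

```rust
/// Ratio of the total rlib size after a change to the size before it.
fn growth_ratio(before: u64, after: u64) -> f64 {
    after as f64 / before as f64
}

fn main() {
    let compressed: u64 = 276_561_064; // total rlib bytes with compression
    let uncompressed: u64 = 347_907_278; // total rlib bytes without compression

    let ratio = growth_ratio(compressed, uncompressed);
    let extra = uncompressed - compressed;

    // The ratio rounds to 1.26x; the difference is 71,346,214 bytes.
    println!("ratio: {:.2}x, extra bytes: {}", ratio, extra);
}
```

So disabling compression would cost this project roughly 71 MB of extra disk space in rlibs.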

Alex Crichton (Nov 20 2019 at 23:09, on Zulip):

if you don't do LTO then you're creating a bunch of bitcode that's never used

Alex Crichton (Nov 20 2019 at 23:10, on Zulip):

so the real solution here is to probably add two flags to the compiler which Cargo automatically passes:

Alex Crichton (Nov 20 2019 at 23:10, on Zulip):

That's both compile time wins and disk space wins all across the board

Alex Crichton (Nov 20 2019 at 23:10, on Zulip):

and everything will be turned on by default as soon as we land it in cargo

Alex Crichton (Nov 20 2019 at 23:10, on Zulip):

does that make sense?

nnethercote (Nov 20 2019 at 23:12, on Zulip):

So when would bitcode be omitted, and when would the object be omitted?

Alex Crichton (Nov 20 2019 at 23:12, on Zulip):

if you compile with LTO, there's no need to put object files in rlibs

Alex Crichton (Nov 20 2019 at 23:12, on Zulip):

(nor is there any reason to compress the bitcode found in the rlib)

Alex Crichton (Nov 20 2019 at 23:12, on Zulip):

we have a flag like that with -C for cross-lang LTO, but rust projects should be using that as well

nnethercote (Nov 20 2019 at 23:12, on Zulip):

ok, so the rlib should contain: object XOR bitcode

nnethercote (Nov 20 2019 at 23:12, on Zulip):

?

Alex Crichton (Nov 20 2019 at 23:13, on Zulip):

correct

Alex Crichton (Nov 20 2019 at 23:13, on Zulip):

except for libstd, but ignore that

nnethercote (Nov 20 2019 at 23:13, on Zulip):

sounds great!

Alex Crichton (Nov 20 2019 at 23:13, on Zulip):

for everything cargo produces locally it should be xor

Alex Crichton (Nov 20 2019 at 23:13, on Zulip):

so we either (a) don't produce any bitcode, nor compress it, or (b) don't codegen something that's not needed

Alex Crichton (Nov 20 2019 at 23:13, on Zulip):

both of which can be pretty significant savings
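
The "object XOR bitcode" rule discussed above can be sketched as a small decision function. All names here (`RlibPayload`, `choose_payload`, the `is_std` parameter) are hypothetical and only illustrate the idea; they are not rustc or Cargo APIs.

```rust
/// What a crate's rlib would carry under the "object XOR bitcode" idea.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RlibPayload {
    /// Non-LTO builds: machine code only, no bitcode is produced.
    ObjectOnly,
    /// LTO builds: bitcode only, so the unused object codegen is skipped.
    BitcodeOnly,
    /// The libstd exception mentioned in the thread: both are kept.
    ObjectAndBitcode,
}

/// Hypothetical decision rule; `is_std` models the libstd exception.
fn choose_payload(lto: bool, is_std: bool) -> RlibPayload {
    if is_std {
        RlibPayload::ObjectAndBitcode
    } else if lto {
        RlibPayload::BitcodeOnly
    } else {
        RlibPayload::ObjectOnly
    }
}

fn main() {
    for &(lto, is_std) in &[(false, false), (true, false), (false, true)] {
        println!("lto={} is_std={} -> {:?}", lto, is_std, choose_payload(lto, is_std));
    }
}
```

For everything Cargo builds locally, only the first two variants would ever be chosen, which is where the compile time and disk space savings come from.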

nnethercote (Nov 20 2019 at 23:14, on Zulip):

Definitely

nnethercote (Nov 20 2019 at 23:14, on Zulip):

It seems that uncompressed bitcode/bytecode is also a thing: https://github.com/rust-lang/rust/blob/6576f4be5af31a5e61dfc0cf50b7130e6c6dfb35/src/librustc/dep_graph/graph.rs#L910-L914

Alex Crichton (Nov 20 2019 at 23:15, on Zulip):

yeah so @mw might be able to help out there to explain more

Alex Crichton (Nov 20 2019 at 23:15, on Zulip):

but there's a -C flag which makes our rlibs "cross lang lto compatible"

Alex Crichton (Nov 20 2019 at 23:15, on Zulip):

which means that all the *.o files are actually uncompressed bitcode

Alex Crichton (Nov 20 2019 at 23:15, on Zulip):

note though that this is a pretty simple concept

Alex Crichton (Nov 20 2019 at 23:16, on Zulip):

but there's a fair amount of legwork to get it all hooked up in the compiler

Alex Crichton (Nov 20 2019 at 23:16, on Zulip):

for example the LTO passes in rustc need to be updated in how they look for bitcode: it's either a "libstd rlib" where it's located adjacent to the object, or it's a "cargo rlib" where it's the *.o file

nnethercote (Nov 20 2019 at 23:20, on Zulip):

@Alex Crichton You've lost me... is this "legwork" about the original idea (object XOR bitcode), or about the uncompressed bitcode stuff?

Alex Crichton (Nov 20 2019 at 23:35, on Zulip):

@nnethercote the xor idea

Alex Crichton (Nov 20 2019 at 23:36, on Zulip):

I suspect uncompressed bitcode is dead in the water if it inflates sizes that much

nnethercote (Nov 20 2019 at 23:37, on Zulip):

I agree my original idea of not compressing bitcode is not good. But 'object XOR bitcode' would make it much less interesting anyway, since only LTO builds would end up with bitcode.

Alex Crichton (Nov 20 2019 at 23:38, on Zulip):

@nnethercote so actually

Alex Crichton (Nov 20 2019 at 23:38, on Zulip):

To start I think we just need a flag to skip bitcode

Alex Crichton (Nov 20 2019 at 23:38, on Zulip):

And that's it

nnethercote (Nov 20 2019 at 23:38, on Zulip):

that sounds easy :)

Alex Crichton (Nov 20 2019 at 23:38, on Zulip):

The hard part later is making lto faster but that's less interesting

Alex Crichton (Nov 20 2019 at 23:38, on Zulip):

If you add that flag I can whip up a patch for cargo

nnethercote (Nov 20 2019 at 23:39, on Zulip):

cool, I'll poke around, try to work it out

nnethercote (Nov 20 2019 at 23:39, on Zulip):

thanks for the help!

Alex Crichton (Nov 20 2019 at 23:39, on Zulip):

So I'm basically countering your uncompressed bitcode

Alex Crichton (Nov 20 2019 at 23:39, on Zulip):

With let's delete bitcode lol

nnethercote (Nov 20 2019 at 23:42, on Zulip):

"make it faster" is good, but "don't do it at all" is better :)

nnethercote (Nov 21 2019 at 21:43, on Zulip):

https://github.com/rust-lang/rust/pull/66598 is the PR, for anyone following along. Big wins!

nnethercote (Dec 02 2019 at 22:11, on Zulip):

@mw Thanks for writing up #66961! I'm considering adding this to my list of tasks for Q1 2020, because I suspect it won't get done otherwise. Does that sound right to you?

mw (Dec 03 2019 at 09:05, on Zulip):

@nnethercote Yes, that sounds good. I can do the reviewing then.

Last update: Dec 12 2019 at 01:20UTC