Stream: t-compiler/wg-parallel-rustc

Topic: effect on firefox


nikomatsakis (May 21 2019 at 10:44, on Zulip):

Hey @mw -- how hard would it be to test the effectiveness of the parallel stuff on building FF? Say on one of the "standard" 14-core desktops?

nikomatsakis (May 21 2019 at 10:44, on Zulip):

There is some debate about how useful it will be

nikomatsakis (May 21 2019 at 10:44, on Zulip):

It seems like we could get some preliminary information in that regard

nikomatsakis (May 21 2019 at 10:45, on Zulip):

Seems like we would have to build a custom version of Rust and then use it in the FF building script somehow?

mw (May 21 2019 at 11:17, on Zulip):

doing a local build with a custom rustc is straightforward (i.e. it uses whatever rustc is on the PATH)

mw (May 21 2019 at 11:17, on Zulip):

I don't have the standard 14-core desktop

mw (May 21 2019 at 11:18, on Zulip):

but I could do a test on my 10-core machine

mw (May 21 2019 at 11:19, on Zulip):

suspect that it won't be much faster (if at all) for the same reasons that pipelining doesn't seem to make much of a difference

mw (May 21 2019 at 11:19, on Zulip):

i.e. all cores are busy compiling small crates or small C++ files

nikomatsakis (May 21 2019 at 11:22, on Zulip):

@mw likely true, it would be good to validate that hypothesis

mw (May 21 2019 at 11:51, on Zulip):

I'll look into it

simulacrum (May 21 2019 at 13:14, on Zulip):

I think that's plausible but rustc did have some gain fwiw

simulacrum (May 21 2019 at 13:14, on Zulip):

(Apparently my old benchmarks were wrong)

nikomatsakis (May 21 2019 at 13:18, on Zulip):

(Apparently my old benchmarks were wrong)

Oh?

simulacrum (May 21 2019 at 13:25, on Zulip):

Yeah, see last row in the spreadsheet -- https://docs.google.com/spreadsheets/d/1vadQWQQqTODU1_cAENnUjLyXM6cxms-tiCf2kCiNGGM/edit#gid=0&range=26:26

simulacrum (May 21 2019 at 13:25, on Zulip):

rustc-opt 438.81 432.23 389.37 -6.59 -49.44 -1.50% -11.27%

simulacrum (May 21 2019 at 13:26, on Zulip):

so we see about 11% crate graph (i.e., x.py build --stage 0) speed up with 2 threads
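
The difference and percentage columns in the quoted rustc-opt row can be recomputed from its first three numbers. The sketch below assumes the first value is the baseline timing, the other two are the compared configurations, and the unit is seconds; the row itself does not state any of this, and the helper name `delta` is only illustrative.

```rust
/// Absolute and relative (percent) change of `measured` against `baseline`.
fn delta(baseline: f64, measured: f64) -> (f64, f64) {
    let abs = measured - baseline;
    (abs, abs / baseline * 100.0)
}

fn main() {
    // Values copied from the quoted rustc-opt row (unit assumed to be seconds).
    let baseline = 438.81;
    for measured in [432.23, 389.37] {
        let (abs, pct) = delta(baseline, measured);
        println!("{measured:.2}: {abs:+.2} ({pct:+.2}%)");
    }
}
```

Recomputing from the rounded values can differ from the spreadsheet in the last digit (for example in the first difference column), presumably because the spreadsheet works from unrounded timings.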

mw (May 21 2019 at 13:57, on Zulip):

bootstrapping parallel rustc takes forever right now

simulacrum (May 21 2019 at 13:59, on Zulip):

I do think those numbers are for regular rustc, just bootstrapped by a parallel compiler

simulacrum (May 21 2019 at 13:59, on Zulip):

Not sure.

mw (May 21 2019 at 14:02, on Zulip):

I'm trying to build a parallel compiler for testing and it's been stuck at libsyntax for an hour or so

mw (May 21 2019 at 14:02, on Zulip):

https://github.com/rust-lang/rust/pull/60035#issuecomment-494001207

mw (May 21 2019 at 14:02, on Zulip):

which is what @Zoxc is probably talking about in the above comment

lwshang (May 21 2019 at 14:08, on Zulip):

I'm hitting the same issue: the build gets stuck at libsyntax and never completes. I let it compile overnight, but it made no progress past libsyntax.

mw (May 21 2019 at 14:08, on Zulip):

that seems worrisome

simulacrum (May 21 2019 at 14:43, on Zulip):

I think that's a recent thing

simulacrum (May 21 2019 at 14:43, on Zulip):

I know for sure I've compiled with parallel enabled and not really seen any change in bootstrap time before

Zoxc (May 21 2019 at 15:43, on Zulip):

The lack of caching in https://github.com/rust-lang/rust/pull/60444 is causing the compilation of syntax to take very long or loop forever.

Zoxc (May 21 2019 at 17:58, on Zulip):

This happens when building with a non-parallel compiler too

lwshang (May 21 2019 at 19:24, on Zulip):

How can I make the compilation succeed?

Zoxc (May 21 2019 at 19:27, on Zulip):

Use https://github.com/rust-lang/rust/pull/60967

lwshang (May 21 2019 at 21:22, on Zulip):

It works! Just got a successful compilation of rustc.

To have two rustc builds (with/without parallel), should I keep two copies of the rust repo with different config.toml files? Or should I just locally compile the parallel-enabled rustc and compare it with the latest nightly?

Zoxc (May 21 2019 at 21:24, on Zulip):

Two rustc repos is the safe option, with the same config.toml, except for parallel-compiler

mw (May 22 2019 at 11:52, on Zulip):

Here are the numbers I'm getting (using the compiler from https://github.com/rust-lang/rust/pull/60967):

mw (May 22 2019 at 11:52, on Zulip):

so the parallel compiler is slightly slower overall

mw (May 22 2019 at 11:54, on Zulip):

this is for building FF from scratch on a 10 core / 20 thread Xeon processor

simulacrum (May 22 2019 at 14:32, on Zulip):

@mw What are you setting for -Zthreads and -j?

simulacrum (May 22 2019 at 14:32, on Zulip):

I'd recommend something like -Zthreads=3

mw (May 22 2019 at 14:34, on Zulip):

no explicit settings

mw (May 22 2019 at 14:34, on Zulip):

I'd recommend something like -Zthreads=3

let me give that a try

simulacrum (May 22 2019 at 14:34, on Zulip):

ah, that's likely to be bad

simulacrum (May 22 2019 at 14:35, on Zulip):

the default is .. not great, based on my timing results

simulacrum (May 22 2019 at 14:35, on Zulip):

i.e., you're hitting the "rising" part of the curve I'd guess

lwshang (May 22 2019 at 14:35, on Zulip):

I believe -Zthreads=1 is the default. Is that correct?

simulacrum (May 22 2019 at 14:36, on Zulip):

no, the default is -Zthreads=20 in this case (i.e., number of vCPUs)

lwshang (May 22 2019 at 14:36, on Zulip):

Is there a heuristic for setting -Zthreads?

simulacrum (May 22 2019 at 14:37, on Zulip):

Uncertain -- we'd need more benchmarking across a variety of CPUs and core counts to be able to tell

simulacrum (May 22 2019 at 14:37, on Zulip):

(i.e., AMD v Intel, could make a difference)

mw (May 22 2019 at 14:37, on Zulip):

right now it will take the number of logical cores
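
To see which value that default would correspond to on a given machine, the logical core count can be queried from the Rust standard library. This is only a sketch using `std::thread::available_parallelism`; it is not how rustc itself determines the default, and the reported value can be lower than the hardware count when the process is restricted (for example by CPU affinity or container limits). The function name `logical_cores` is illustrative.

```rust
use std::thread;

/// Number of logical cores available to this process,
/// falling back to 1 if the value cannot be queried.
fn logical_cores() -> usize {
    thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1)
}

fn main() {
    println!("logical cores: {}", logical_cores());
}
```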

simulacrum (May 22 2019 at 14:38, on Zulip):

My theory is that what we want is for rustc's internal threading to be less aggressive than cargo's spawning of rustc processes, but otherwise only constrained to the number of logical cores

lwshang (May 22 2019 at 14:38, on Zulip):

As suggested by @simulacrum, the current default (number of vCPUs) may not be a good choice.

simulacrum (May 22 2019 at 14:39, on Zulip):

In speaking to Alex though I believe that's basically impossible with the current implementation of jobserver

simulacrum (May 22 2019 at 14:39, on Zulip):

-Zthreads=3 might be a good "best guess" of how many threads rustc can safely use though

mw (May 22 2019 at 15:08, on Zulip):
mw (May 22 2019 at 15:09, on Zulip):

let's see for debug builds too...

Zoxc (May 22 2019 at 15:18, on Zulip):

I'd guess 8 threads =P

simulacrum (May 22 2019 at 15:54, on Zulip):

I was using an 8 core / 16 logical core CPU for my determination of -Zthreads=3 as more or less optimal

simulacrum (May 22 2019 at 15:55, on Zulip):

so here with more cores it could well be different; plus, I was using a ryzen 1800x so intel might have different performance characteristics

mw (May 23 2019 at 15:04, on Zulip):

here are my numbers for debug builds:

lwshang (May 23 2019 at 15:19, on Zulip):

Does your "single threaded" mean {parallel-rustc enabled and -Zthreads=1} or {parallel-rustc disabled}?

mw (May 24 2019 at 09:45, on Zulip):

single threaded means a compiler compiled with parallel-compiler = false

Last update: Nov 17 2019 at 07:35UTC