Stream: t-compiler/wg-parallel-rustc

Topic: Dynamically avoiding atomics


Alex Crichton (Jan 14 2020 at 17:00, on Zulip):

So I've been doing some profiling and I recalled reading a recent article about how GNU's libstdc++ sometimes avoids using atomics in std::shared_ptr

Alex Crichton (Jan 14 2020 at 17:00, on Zulip):

I was wondering, we can probably do the same for rustc itself

Alex Crichton (Jan 14 2020 at 17:01, on Zulip):

All the utilities that we have in the sync module (which currently multiplex single/multi threaded at compile time) we could update to work in both worlds dynamically

Alex Crichton (Jan 14 2020 at 17:01, on Zulip):

basically before anything is done a "very fast read" happens which indicates whether an atomic or single-threaded update should happen

Alex Crichton (Jan 14 2020 at 17:01, on Zulip):

we would only enable atomics once a thread is spawned, or a second jobserver token is acquired
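
A rough sketch of the "very fast read" idea from the messages above, not code from rustc. The names `PARALLEL`, `enable_parallel_mode`, `is_parallel` and `DynCounter` are hypothetical and chosen only for illustration. It assumes the flag is flipped before any second thread exists, so the non-atomic path never runs concurrently.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

/// Hypothetical process-wide mode flag: `false` while only one thread exists.
static PARALLEL: AtomicBool = AtomicBool::new(false);

/// Assumption: called before a second thread is spawned (or before a second
/// jobserver token is used), never while other threads are already running.
pub fn enable_parallel_mode() {
    PARALLEL.store(true, Ordering::SeqCst);
}

/// The "very fast read": a relaxed load, which is an ordinary load on
/// common hardware.
#[inline]
pub fn is_parallel() -> bool {
    PARALLEL.load(Ordering::Relaxed)
}

/// A counter that only pays for an atomic read-modify-write in parallel mode.
pub struct DynCounter {
    value: AtomicUsize,
}

impl DynCounter {
    pub const fn new(value: usize) -> Self {
        DynCounter { value: AtomicUsize::new(value) }
    }

    /// Adds one and returns the previous value.
    pub fn increment(&self) -> usize {
        if is_parallel() {
            // Other threads may exist: use a real atomic read-modify-write.
            self.value.fetch_add(1, Ordering::SeqCst)
        } else {
            // Single-threaded: a separate load and store. This would lose
            // updates if two threads raced, which the mode flag rules out.
            let old = self.value.load(Ordering::Relaxed);
            self.value.store(old + 1, Ordering::Relaxed);
            old
        }
    }

    pub fn get(&self) -> usize {
        self.value.load(Ordering::Relaxed)
    }
}
```

The same branch could guard locks and reference counts; whether the extra check is cheap enough in practice would have to be measured.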

Alex Crichton (Jan 14 2020 at 17:02, on Zulip):

this in theory might be a strategy to land everything today and only selectively enable rayon/extra parallelism as we become confident over time

Alex Crichton (Jan 14 2020 at 17:02, on Zulip):

but we could immediately start getting benefits like a truly parallel codegen backend

Alex Crichton (Jan 14 2020 at 17:02, on Zulip):

Basically what I'm wondering is whether we can land something like this: removing all the compile-time checks for parallel_compiler in the rustc data structures and changing them all to (cheap) runtime checks

Alex Crichton (Jan 14 2020 at 17:03, on Zulip):

that may have such a small impact we could land it immediately, and we could unblock a lot of work that's waiting for parallel rustc to get fully enabled

Alex Crichton (Jan 14 2020 at 17:03, on Zulip):

(things like the default for -Zthreads would still be conditioned at compile time on parallel_compiler perhaps)

simulacrum (Jan 14 2020 at 20:00, on Zulip):

So basically we'd make the primitives etc always require Send/Sync then? That's the main "blocker" I see -- but one other thought is that I'm not sure it's worth the hassle

simulacrum (Jan 14 2020 at 20:00, on Zulip):

i.e., I feel like we're pretty close to being able to just turn it on without this

Alex Crichton (Jan 14 2020 at 20:35, on Zulip):

well, not so much that they would also require Send/Sync; they'd just always conditionally be Send/Sync

Alex Crichton (Jan 14 2020 at 20:36, on Zulip):

and also, I should clarify, this was born out of some thinking about our current state of affairs

Alex Crichton (Jan 14 2020 at 20:36, on Zulip):

and I realized that if you already have a fully parallel build, as is likely the case on machines with a small number of CPUs, then you are basically guaranteed to regress

Alex Crichton (Jan 14 2020 at 20:36, on Zulip):

the only improvement we offer is if you have idle parallelism we can gobble that up with a parallel rustc, but times like the start of a full build or most of the build on a small number of cores, have no idle parallelism

Alex Crichton (Jan 14 2020 at 20:37, on Zulip):

so you're paying the cost of all this synchronization when none of it actually gets used

Alex Crichton (Jan 14 2020 at 20:37, on Zulip):

so my initial thinking was that this is what we may want long-term, not only for the immediate short term to land things

Alex Crichton (Jan 14 2020 at 20:37, on Zulip):

because this would, in theory, drastically mitigate the cost of a parallel compiler

simulacrum (Jan 14 2020 at 20:55, on Zulip):

That is true

simulacrum (Jan 14 2020 at 20:56, on Zulip):

hm, I'm not sure about conditionally Send/Sync

simulacrum (Jan 14 2020 at 20:56, on Zulip):

They'd need to be Send/Sync unconditionally, right? And we need to be just super careful to avoid spawning threads outside our control if we pursue this

simulacrum (Jan 14 2020 at 20:57, on Zulip):

(which we probably can do)

Alex Crichton (Jan 14 2020 at 21:09, on Zulip):

er, conditional as in impl<T: Send> Send for Lock<T> (or w/e it is)

Alex Crichton (Jan 14 2020 at 21:09, on Zulip):

not conditional as in #[cfg]
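
To make the distinction concrete, here is a minimal sketch of a lock type whose `Send`/`Sync` impls are conditional on `T` through trait bounds rather than through `#[cfg]`. This `Lock<T>` is a toy written for illustration and is not rustc's actual `Lock`; the `try_with` method is a hypothetical API.

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicBool, Ordering};

/// Toy lock for illustration only (not rustc's `Lock`).
pub struct Lock<T> {
    held: AtomicBool,
    data: UnsafeCell<T>,
}

// "Conditional" in the trait-bound sense: these impls always exist, but only
// apply when `T: Send`. No `#[cfg(parallel_compiler)]` is involved.
// SAFETY: access to `data` is only granted while `held` is set by the caller.
unsafe impl<T: Send> Send for Lock<T> {}
unsafe impl<T: Send> Sync for Lock<T> {}

impl<T> Lock<T> {
    pub const fn new(value: T) -> Self {
        Lock { held: AtomicBool::new(false), data: UnsafeCell::new(value) }
    }

    /// Runs `f` with exclusive access, or returns `None` if the lock is
    /// already held (including by the current thread).
    pub fn try_with<R>(&self, f: impl FnOnce(&mut T) -> R) -> Option<R> {
        if self
            .held
            .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            return None;
        }
        // SAFETY: we set `held`, so no other caller can reach `data` until
        // the store below releases it.
        let result = f(unsafe { &mut *self.data.get() });
        self.held.store(false, Ordering::Release);
        Some(result)
    }
}
```

With this shape, `Lock<Vec<u8>>` is `Send + Sync`, while `Lock<std::rc::Rc<u8>>` is neither, and the same source compiles in both single-threaded and parallel configurations.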

simulacrum (Jan 14 2020 at 21:16, on Zulip):

ah sure yeah makes sense

Zoxc (Jan 15 2020 at 06:29, on Zulip):

We could avoid atomic ops if -Z threads=1 is passed on the command line. Doing it on demand sounds very tricky though and we'd probably be better off just avoiding atomic ops.

Alex Crichton (Jan 15 2020 at 15:07, on Zulip):

My point though is that -Zthreads=1 is a static assertion that no threads are used and I don't think it's what we want

Alex Crichton (Jan 15 2020 at 15:08, on Zulip):

we are slowing down all rustc instances at the beginning of a very parallel build because they're all doing atomic ops with only one thread now

Alex Crichton (Jan 15 2020 at 15:08, on Zulip):

so avoiding the cost may be worth it

Zoxc (Jan 16 2020 at 07:35, on Zulip):

I guess we can apply the transition only at well defined points where no locks are held and no other threads accessing rustc state exist, which would make it simpler.

Zoxc (Jan 16 2020 at 08:07, on Zulip):

I wonder if we could have some DynSend and DynSync traits since our types won't really be Send and Sync.

Zoxc (Jan 16 2020 at 12:31, on Zulip):

We could prevent transition to a parallel state while holding a lock by having a ref count of held locks.
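
A small sketch of that ref-count idea, under the assumption that the count is only maintained while the compiler is still single-threaded. All names here (`PARALLEL`, `HELD_LOCKS`, `HeldLock`, `lock_acquired`, `try_enable_parallel`) are hypothetical.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

/// Hypothetical mode flag and count of locks held in single-threaded mode.
static PARALLEL: AtomicBool = AtomicBool::new(false);
static HELD_LOCKS: AtomicUsize = AtomicUsize::new(0);

/// Guard representing one held lock; remembers whether it was counted.
pub struct HeldLock {
    counted: bool,
}

/// Called when a lock is taken. Only counts while still single-threaded,
/// so a plain load + store is enough (no atomic read-modify-write).
pub fn lock_acquired() -> HeldLock {
    let counted = !PARALLEL.load(Ordering::Relaxed);
    if counted {
        let n = HELD_LOCKS.load(Ordering::Relaxed);
        HELD_LOCKS.store(n + 1, Ordering::Relaxed);
    }
    HeldLock { counted }
}

impl Drop for HeldLock {
    fn drop(&mut self) {
        // A counted guard is always dropped in single-threaded mode, because
        // the transition below is refused while the count is non-zero.
        if self.counted {
            let n = HELD_LOCKS.load(Ordering::Relaxed);
            HELD_LOCKS.store(n - 1, Ordering::Relaxed);
        }
    }
}

/// Attempts the switch to parallel mode; refuses while any lock is held.
/// Returns `true` if the process is in parallel mode afterwards.
pub fn try_enable_parallel() -> bool {
    if PARALLEL.load(Ordering::Relaxed) {
        return true;
    }
    if HELD_LOCKS.load(Ordering::Relaxed) != 0 {
        return false;
    }
    PARALLEL.store(true, Ordering::SeqCst);
    true
}
```

A caller that gets `false` back would have to retry at a later point where no locks are held, which matches the "well defined points" suggestion above.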

Alex Crichton (Jan 16 2020 at 17:00, on Zulip):

So on this topic @simulacrum had a really good idea during the meeting today, which is that if we want to do this work, we should start out with a CLI flag that's "the switch" rather than a dynamic detection of "the switch". It would involve all the same implementation work, just a slightly different condition on what to use. I think that if the cost of the single-threaded, yet parallel-capable, compiler is high enough we can pursue this, but otherwise it's just a fun idea to keep around for a while

Last update: Jul 02 2020 at 19:40UTC