Stream: t-compiler/wg-nll

Topic: benchmarks


nikomatsakis (May 25 2018 at 08:53, on Zulip):

@lqd (cc @Frank McSherry) — it looks like some new benchmarks were recently added to perf — in particular webrender seems to exhibit ungreat NLL perf, maybe we should add that one for testing.

nikomatsakis (May 25 2018 at 08:53, on Zulip):

also cargo

lqd (May 25 2018 at 08:54, on Zulip):

:thumbs_up:

lqd (May 25 2018 at 08:55, on Zulip):

did Reed's PR adding the invalidates facts land in rustc?

lqd (May 25 2018 at 08:58, on Zulip):

at least so I can quickly try 1) leapfrog on it, and 2) the location-insensitive prepass on it

lqd (May 25 2018 at 09:01, on Zulip):

/me tries -Znll-facts :3, result: it does have them all yay

nikomatsakis (May 25 2018 at 09:35, on Zulip):

one challenge: figuring out which fns are slow. It might just be the biggest ones. Previously I did timing measurements. (It would also be worth looking again at clap to see which other fns are slow.) I don't know whether those timing measurements are in nightly or not, though; you could see the results for individual functions with -Ztime-passes, but I would probably have to recreate the patch.

lqd (May 25 2018 at 09:36, on Zulip):

this is what was sitting in my reply box, unsent :) webrender has 4-5 subcrates, dependencies, etc., and must have a lot of functions

Frank McSherry (May 25 2018 at 09:37, on Zulip):

More benchmarks are great. Is there any reason a :frog:-based non-NLL phase (perhaps this is the location-insensitive version) that does the old lexical-lifetimes borrow checking should be any slower? It's probably not worth worrying too much about perf on problems well addressed by that pass, and instead worth focusing on the gap between the two.

Frank McSherry (May 25 2018 at 09:37, on Zulip):

(( Also, just responded to @nikomatsakis's mention, and then realized that there is already some chitchat, sorry. ))

nikomatsakis (May 25 2018 at 09:38, on Zulip):

this is basically the location-insensitive variant (although it is not lexical)

nikomatsakis (May 25 2018 at 09:39, on Zulip):

the location-insensitive variant is actually less expressive in some ways — notably it doesn't track where a borrow was introduced and limit its effects to locations reachable from there — but I think likely still serves as an effective pre-screen

nikomatsakis (May 25 2018 at 09:39, on Zulip):

keep in mind though that the NLL numbers we see on perf — at least based on the profiles i've done elsewhere, i'll have to repeat for the new cases — are often registering overhead that occurs before/after the "core analysis" that polonius models anyway

nikomatsakis (May 25 2018 at 09:40, on Zulip):

(although a major source of the overhead for clap is still squarely in the dataflow that polonius subsumes)

nikomatsakis (May 25 2018 at 09:41, on Zulip):

(if needed, we could make the location-insensitive variant somewhat more precise of course)

lqd (May 25 2018 at 09:41, on Zulip):

just trying it on cargo/webrender would answer whether we need to make it more precise, right?

lqd (May 25 2018 at 09:42, on Zulip):

or would the imprecision most likely be in what it doesn't output?

Frank McSherry (May 25 2018 at 09:42, on Zulip):

My intuition (based only on programs that I've written) is that most borrowing / lifetimes don't require NLL, and the borrows that can be dispatched early with traditional reasoning just all get dropped from the NLL input, and hooray.

nikomatsakis (May 25 2018 at 09:42, on Zulip):

if we try it and find it emits no errors, then we are satisfied (for now)

nikomatsakis (May 25 2018 at 09:43, on Zulip):

if it does emit errors, then either we adopt the approach of using those errors to limit the work of the location-sensitive variant, or else we make the location-insensitive variant precise enough that it screens those errors out
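
A hedged sketch of that two-pass idea, for illustration only (AllFacts, Error, and the two pass functions below are placeholders, not polonius' actual API): run the cheap location-insensitive pass first, and hand a function to the expensive location-sensitive analysis only if the cheap pass could not clear it.

```rust
// Placeholder types standing in for the real fact and diagnostic structures.
struct AllFacts;
struct Error;

// Cheap, possibly imprecise pass: may report spurious errors, but never misses real ones.
fn location_insensitive_pass(_facts: &AllFacts) -> Vec<Error> {
    Vec::new()
}

// Expensive, precise pass: confirms or discards each candidate error.
fn location_sensitive_pass(_facts: &AllFacts, candidates: Vec<Error>) -> Vec<Error> {
    candidates
}

fn borrow_check(facts: &AllFacts) -> Vec<Error> {
    let candidates = location_insensitive_pass(facts);
    if candidates.is_empty() {
        // The hoped-for common case: the pre-screen finds nothing,
        // so the heavy analysis is skipped entirely.
        return Vec::new();
    }
    location_sensitive_pass(facts, candidates)
}
```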

Frank McSherry (May 25 2018 at 09:43, on Zulip):

Sort of related Q: are there any/many benchmarks of programs that exercise the NLL nature of NLL? Like, large programs that don't pass current borrow_ck but should.
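
For context, a minimal sketch of the kind of code that exercises "the NLL nature of NLL": the old lexical borrow checker rejects it because the shared borrow is held until the end of the block, while NLL accepts it because the borrow's last use happens before the mutation.

```rust
fn main() {
    let mut scores = vec![1, 2, 3];
    let first = &scores[0];     // shared borrow of `scores`
    println!("{}", first);      // last use of `first`
    scores.push(4);             // rejected by the old AST borrowck, accepted by NLL
}
```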

nikomatsakis (May 25 2018 at 09:43, on Zulip):

no

nikomatsakis (May 25 2018 at 09:43, on Zulip):

I guess they will come in time :)

Frank McSherry (May 25 2018 at 09:44, on Zulip):

So, hypothetically if NLL gets turned on there may be a spike in "whoa, we didn't see this sort of behavior before."

nikomatsakis (May 25 2018 at 09:44, on Zulip):

(relating to your intuition, clearly no extant Rust program needs NLL, since they all compile today...)

nikomatsakis (May 25 2018 at 09:45, on Zulip):

I'm not sure exactly what you mean by "this behavior" — like, these sorts of compile times? oh, just "programs exhibiting these properties"?

Frank McSherry (May 25 2018 at 09:46, on Zulip):

I guess I was thinking "performance defects in NLL reasoning" akin to whatever might be stressing out webrender.

nikomatsakis (May 25 2018 at 09:46, on Zulip):

seems plausible

nikomatsakis (May 25 2018 at 09:47, on Zulip):

I certainly expect a period — after turning on NLL — of bug reports related to it, whether they be perf or correctness...

lqd (May 25 2018 at 09:49, on Zulip):

can we get facts when using cargo?

nikomatsakis (May 25 2018 at 09:50, on Zulip):

cargo rustc -- -Znll-facts probably works

Frank McSherry (May 25 2018 at 09:55, on Zulip):

It would be pretty easy to add a bit of diagnostic code: in each of the joins, one can run a timer, attribute the resulting Duration to the destination relation, and print everything out in Drop code. I've got something that does this for tuple counts already; the same could be done for timing, should the need arise for more consistent diagnostic output (a la Soufflé).
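
A rough sketch of the diagnostic Frank describes, assuming nothing about datafrog's internals (all names here are illustrative): wrap each join in a timer, charge the elapsed Duration to the destination relation, and print the accumulated totals from Drop.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Accumulates wall-clock time per destination relation and prints the
/// totals when dropped.
struct JoinProfile {
    totals: HashMap<&'static str, Duration>,
}

impl JoinProfile {
    fn new() -> Self {
        JoinProfile { totals: HashMap::new() }
    }

    /// Run `join_op` and charge its elapsed time to `destination`.
    fn time<R>(&mut self, destination: &'static str, join_op: impl FnOnce() -> R) -> R {
        let start = Instant::now();
        let result = join_op();
        *self.totals.entry(destination).or_default() += start.elapsed();
        result
    }
}

impl Drop for JoinProfile {
    fn drop(&mut self) {
        for (relation, total) in &self.totals {
            eprintln!("{}: {:?}", relation, total);
        }
    }
}
```

Each join site would then call something like profile.time("subset", || ...) instead of running the join directly, and the per-relation totals would print out when the computation finishes.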

lqd (May 25 2018 at 09:56, on Zulip):

unfortunately, asking for facts through cargo doesn't produce anything; this is going to be tougher than I expected :) time to bring out cargo -v :3

nikomatsakis (May 25 2018 at 10:06, on Zulip):

@lqd I think you need to add #![feature(nll)] too (or the suitable -Z flags)

lqd (May 25 2018 at 10:07, on Zulip):

oh interesting, rustc-perf is a bit obscure from the outside :)

lqd (May 25 2018 at 10:18, on Zulip):

good to know, indeed cargo + the feature = :thumbs_up:
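
For the record, the working combination here seems to be opting the crate into NLL at the crate root and passing the facts flag through cargo (assuming a mid-2018 nightly toolchain):

```rust
// In the crate root (src/lib.rs or src/main.rs):
#![feature(nll)]

// Then build with:
//     cargo rustc -- -Znll-facts
// which dumps the Polonius input facts to disk.
```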

lqd (May 25 2018 at 10:31, on Zulip):

interesting: just checking the facts, it might also not be a single slow function. They seem small-ish for webrender itself (the biggest 20 fns combined are smaller than the clap dataset), so maybe time-passes would indeed be worthwhile (or the time could also be spent in slow dependencies).

lqd (May 25 2018 at 10:45, on Zulip):

@nikomatsakis what I am seeing is this: 1) time: 130.060; rss: 292MB for MIR borrow checking, 2) a couple thousand solve_nll_region_constraints timed at 0.000, 3) 2 or 3 timed at 0.001 -- should I be looking at something in particular?

lqd (May 25 2018 at 10:57, on Zulip):

(btw, is rustc doing the NLL analysis in parallel, e.g. $nb_cores functions at a time? If not, could we now? There must be some intricacies in collating results, but at least spinning up multiple datafrog computations at the same time seems doable)

nikomatsakis (May 25 2018 at 12:25, on Zulip):

We are not. I would encourage you not to think about parallelism: I think we should strive to make it work on a single core.

nikomatsakis (May 25 2018 at 12:25, on Zulip):

My reasoning:

nikomatsakis (May 25 2018 at 12:25, on Zulip):

1. we are actively working on adding parallelism within crates and queries, which would mean that we would process N functions at once.

nikomatsakis (May 25 2018 at 12:26, on Zulip):

2. we often compile N crates at once

nikomatsakis (May 25 2018 at 12:26, on Zulip):

3. once we have those pieces in place, then yes, we can imagine doing parallel sorts and so forth, but we would want to balance resource usage overall

nikomatsakis (May 25 2018 at 12:26, on Zulip):

I think we have a story there, but we shouldn't look to parallelism alone as the "salvation", I guess is what I'm saying... often the cores will be busy elsewhere :)

lqd (May 25 2018 at 12:27, on Zulip):

agreed, it was just a random thought :)

nikomatsakis (May 25 2018 at 12:27, on Zulip):

that said, we should probably try it out

nikomatsakis (May 25 2018 at 12:27, on Zulip):

so let me weaken my statement :)

nikomatsakis (May 25 2018 at 12:27, on Zulip):

that is, once we have those pieces in place — in particular, rustc will have a fork of rayon that will hopefully eventually be the real rayon — it'd be nice if we already knew how best to take advantage of it!
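
As a thought experiment only (FunctionFacts, BorrowCheckResult, and borrow_check_one are placeholders, and this is not how rustc's query machinery is actually structured): once a rayon-style pool is available, independent per-function fact sets could be checked in parallel and the results collated afterwards.

```rust
use rayon::prelude::*;

// Placeholder types standing in for per-function Polonius inputs and outputs.
struct FunctionFacts;
struct BorrowCheckResult;

// Placeholder for a single-function, single-threaded datafrog computation.
fn borrow_check_one(_facts: &FunctionFacts) -> BorrowCheckResult {
    BorrowCheckResult
}

fn borrow_check_all(functions: &[FunctionFacts]) -> Vec<BorrowCheckResult> {
    functions
        .par_iter()               // one independent computation per function
        .map(borrow_check_one)
        .collect()                // collate the per-function results afterwards
}
```

Whether this wins anything in practice depends on the point above: the cores may already be saturated by crate-level and query-level parallelism.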

lqd (May 25 2018 at 12:30, on Zulip):

@nikomatsakis btw did you see the earlier "130s MIR borrow checking" figure, without any easy-to-notice slow subtasks? Is there maybe a way to get more information about the timed passes (besides profiling rustc)?

nikomatsakis (May 25 2018 at 12:30, on Zulip):

@lqd so time-passes is basically useless

nikomatsakis (May 25 2018 at 12:30, on Zulip):

and you should ignore it

nikomatsakis (May 25 2018 at 12:31, on Zulip):

that is, it is not telling you what it looks like it is telling you

nikomatsakis (May 25 2018 at 12:31, on Zulip):

we are in the process of replacing it with something that will give realistic numbers

nikomatsakis (May 25 2018 at 12:31, on Zulip):

e.g., in that output, it is not clear what composes those 130s

nikomatsakis (May 25 2018 at 12:31, on Zulip):

it includes at least mir borrow checking...but quite possibly other things, like mir construction

nikomatsakis (May 25 2018 at 12:32, on Zulip):

that said, I had a locally extended version of the compiler

nikomatsakis (May 25 2018 at 12:32, on Zulip):

that hijacked time-passes to dump per-fn information in a very narrow way :)

nikomatsakis (May 25 2018 at 12:32, on Zulip):

and I was using that to identify slow functions

lqd (May 25 2018 at 12:32, on Zulip):

nifty :)

nikomatsakis (May 25 2018 at 12:33, on Zulip):

so @lqd what info were you looking for exactly? (before I went on my rant...)

nikomatsakis (May 25 2018 at 12:33, on Zulip):

I guess the short answer is no, there is no easy way to get info — profiling rustc (e.g., with perf) is the way to do it

lqd (May 25 2018 at 12:33, on Zulip):

I tried compiling webrender with NLL and it was indeed "slow", so I was looking for a way to narrow down where this time was spent

lqd (May 25 2018 at 12:34, on Zulip):

more precisely which "use case" could be extracted for benchmarking in polonius

lqd (May 25 2018 at 12:35, on Zulip):

(before I leave for rustfest until tuesday)

nikomatsakis (May 25 2018 at 12:35, on Zulip):

got it

nikomatsakis (May 25 2018 at 12:35, on Zulip):

let me go looking for my patch

Reed Koser (May 25 2018 at 16:58, on Zulip):

If anyone feels like setting up LTTNG on their machine, I do have a local patch that dumps some stats about variable updates over its user-space tracing (UST) channels.

Reed Koser (May 25 2018 at 16:58, on Zulip):

It should be possible to upstream it (it's all behind a feature flag), but I'm not sure how useful it would be

Jake Goulding (May 25 2018 at 18:11, on Zulip):

@Reed Koser LTTNG is an interesting project! Do you happen to know whether it supports macOS, or of an alternative that might?

Reed Koser (May 25 2018 at 18:16, on Zulip):

I believe it's Linux-only, unfortunately. Probably not for technical reasons (i.e. you could port it), but just because there is only a relatively small number of contributors. I don't know what OSX has for tracing, unfortunately.

Reed Koser (May 25 2018 at 18:18, on Zulip):

most of the really robust tracing tools are either deeply hooked into the kernels of their respective systems (general-purpose tracers) or bespoke (things like Chrome/V8's internal profiler, Gecko's profiling tools, etc.)

Reed Koser (May 25 2018 at 18:21, on Zulip):

even LTTNG started as a kernel tracing tool, and TraceCompass (the officially sanctioned graphical frontend) is... subpar. You define visualizations either with a weird and feature-incomplete XML DSL, or by using babeltrace to pull the traces into Python and then leaning on Python's data science ecosystem to generate imagery.

Reed Koser (May 25 2018 at 18:24, on Zulip):

cross-platform userspace tracing is on the (extremely long...) list of yaks I want to shave some day :upside_down_face:

Last update: Nov 21 2019 at 13:55UTC