Stream: t-compiler/wg-nll

Topic: profiling


nikomatsakis (Jul 25 2018 at 11:37, on Zulip):

@lqd so we were talking about profiling yesterday

lqd (Jul 25 2018 at 11:38, on Zulip):

I'm gathering the up-to-date crate timings (just before gathering the callgrind profile) now

lqd (Jul 25 2018 at 11:39, on Zulip):

I also ran html5ever with -Zverbose, and the &str inside the tuples were indeed of different regions, instead of 'static :/

nikomatsakis (Jul 25 2018 at 11:40, on Zulip):

yeah; we could maybe have some special case though for checking statics

nikomatsakis (Jul 25 2018 at 11:40, on Zulip):

where we don't make as many variables

nikomatsakis (Jul 25 2018 at 11:40, on Zulip):

or we do something else smart

nikomatsakis (Jul 25 2018 at 11:40, on Zulip):

"I'm gathering the up-to-date crate timings (just before gathering the callgrind profile) now"

where are those?

lqd (Jul 25 2018 at 11:40, on Zulip):

https://hackmd.io/3r1ZnkUyRD2NMU-dnm_RPg

nikomatsakis (Jul 25 2018 at 11:40, on Zulip):

I was curious to try and get some sort of profile of e.g. cargo

nikomatsakis (Jul 25 2018 at 11:41, on Zulip):

clap-rs and inflate remain stubbornly high, though I know that they are sort of "not master" versions

nikomatsakis (Jul 25 2018 at 11:41, on Zulip):

huh, wait, is NLL faster on html5ever?

nikomatsakis (Jul 25 2018 at 11:42, on Zulip):

also, we should add a ratio

lqd (Jul 25 2018 at 11:42, on Zulip):

hmm lemme check again my numbers I might have skipped a column :/

nikomatsakis (Jul 25 2018 at 11:42, on Zulip):

(doing that now)

lqd (Jul 25 2018 at 11:44, on Zulip):

(sentry-cli's deps take so long to build I had to skip it for now)

nikomatsakis (Jul 25 2018 at 11:44, on Zulip):

ok, I added some ratios
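
The ratio column discussed here can be sketched as follows. This is only an illustration of the arithmetic (NLL time divided by the non-NLL baseline); the crate names and timings are hypothetical placeholders, not numbers from the linked chart.

```python
def nll_ratio(baseline_secs, nll_secs):
    """Return NLL build time divided by baseline build time (1.0 means no change)."""
    if baseline_secs <= 0:
        raise ValueError("baseline time must be positive")
    return nll_secs / baseline_secs


def overhead_percent(baseline_secs, nll_secs):
    """Return the NLL slowdown as a percentage of the baseline (negative means faster)."""
    return (nll_ratio(baseline_secs, nll_secs) - 1.0) * 100.0


# Hypothetical (baseline, NLL) timings in seconds, for illustration only.
timings = {"crate-a": (10.0, 10.5), "crate-b": (4.0, 5.0)}

for name, (base, nll) in timings.items():
    print(f"{name}: ratio {nll_ratio(base, nll):.2f}, "
          f"overhead {overhead_percent(base, nll):+.1f}%")
```

A ratio makes crates of very different sizes comparable at a glance, which absolute seconds do not.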

nikomatsakis (Jul 25 2018 at 11:44, on Zulip):

it's..actually looking pretty good

nikomatsakis (Jul 25 2018 at 11:45, on Zulip):

most of them are <10% it seems?

lqd (Jul 25 2018 at 11:46, on Zulip):

yes it's not bad in many cases

nikomatsakis (Jul 25 2018 at 11:47, on Zulip):

I'm trying to decide which would be good to dig into, or if we think we are "done" here

lqd (Jul 25 2018 at 11:47, on Zulip):

yeah it was 1.28 for html5ever instead of 1.88

nikomatsakis (Jul 25 2018 at 11:48, on Zulip):

hmm?

nikomatsakis (Jul 25 2018 at 11:48, on Zulip):

maybe update the chart? :)

lqd (Jul 25 2018 at 11:48, on Zulip):

yeah I did update it ;)

lqd (Jul 25 2018 at 11:49, on Zulip):

so NLL is 10% slower on it rather than faster as I had mistakenly written

lqd (Jul 25 2018 at 11:55, on Zulip):

for html5ever it seems that phf switched from having an inner type wrapping the tuples, to using arrays/slices directly, and that makes all the difference

nikomatsakis (Jul 25 2018 at 11:57, on Zulip):

I wonder if the thing to do is to look at the older versions for optimizing

nikomatsakis (Jul 25 2018 at 11:57, on Zulip):

or the newer ones :)

nikomatsakis (Jul 25 2018 at 11:57, on Zulip):

it might be interesting to compare the clap-rs profiles (or inflate profiles) before/after

lqd (Jul 25 2018 at 11:58, on Zulip):

probably a mix of both old and new :)

lqd (Jul 25 2018 at 11:58, on Zulip):

yeah after gathering those basic times I'll get some profiles, they'll probably be more informative rather than absolutely clear etc, but could be useful

lqd (Jul 25 2018 at 11:59, on Zulip):

but since it takes so long with valgrind I thought I'd do the easy timings first

nikomatsakis (Jul 25 2018 at 12:02, on Zulip):

that was indeed very helpful

nikomatsakis (Jul 25 2018 at 12:02, on Zulip):

particularly in giving an overall picture

lqd (Jul 25 2018 at 12:05, on Zulip):

weirdly I can't build servo's script because of errors in its deps, so maybe the deps are built without NLL on perf.rlo and then the crate itself is built with NLL; that is, not using RUSTFLAGS. (and it's a complex build process, involving spidermonkey and all, so I'm not sure I'll be able to build it at all ;)

nikomatsakis (Jul 25 2018 at 15:32, on Zulip):

ever do any detailed profiling @lqd ?

lqd (Jul 25 2018 at 15:33, on Zulip):

I did manage to do a profile of the modern inflate with and without nll, but as expected callgrind's results are a bit harder for me to make good use of

lqd (Jul 25 2018 at 15:34, on Zulip):

I was doing the older version now, to compare

lqd (Jul 25 2018 at 15:34, on Zulip):

I can upload them somewhere if you want

nikomatsakis (Jul 25 2018 at 15:35, on Zulip):

can you upload the text output?

nikomatsakis (Jul 25 2018 at 15:35, on Zulip):

I forget how that looks now

lqd (Jul 25 2018 at 15:38, on Zulip):

it's in a weird text format qcachegrind supports, like

version: 1
creator: callgrind-3.11.0
pid: 28740
cmd:  /home/lqd/work/rust/rust/build/x86_64-unknown-linux-gnu/stage2/bin/rustc --crate-name inflate src/lib.rs --crate-type lib --emit=dep-info,metadata -C debuginfo=2 -Zborrowck=mir -Ztwo-phase-borrows --cfg feature="default" -C metadata=9957d65288bb6b74 -C extra-filename=-9957d65288bb6b74 --out-dir /mnt/d/work/rust/crater/inflate-0.4.3/target/debug/deps -L dependency=/mnt/d/work/rust/crater/inflate-0.4.3/target/debug/deps --extern adler32=/mnt/d/work/rust/crater/inflate-0.4.3/target/debug/deps/libadler32-f8bc9ddd0904ee51.rmeta
part: 1


desc: I1 cache:
desc: D1 cache:
desc: LL cache:

desc: Timerange: Basic block 0 - 380728636
desc: Trigger: Program termination

positions: line
events: Ir
summary: 2054956762


ob=(20) /home/lqd/work/rust/rust/build/x86_64-unknown-linux-gnu/stage2/lib/libproc_macro-a473a3a22af04081.so
fl=(112) ???
fn=(100614) 0x0000000000007fc0
0 9

fn=(100604) 0x0000000000008050
0 8
cob=(2) ???
cfi=(16) ???
cfn=(100610) 0x000000000b6c9f80
calls=1 0
0 63
0 1
cfn=(100614)
calls=1 0
0 9
0 3

fn=(550) 0x0000000000008090
0 17

ob=(35) /home/lqd/work/rust/rust/build/x86_64-unknown-linux-gnu/stage2/lib/librustc_incremental-b08fd497990db280.so
fl=(127) ???
fn=(100560) 0x000000000000d870
0 9

fn=(100550) 0x000000000000d900
0 8
cob=(2)
cfi=(16)
cfn=(100556) 0x0000000006282408
calls=1 0
0 63
0 1
cfn=(100560)
calls=1 0
0 9
0 3

fn=(730) 0x000000000000d940
0 17
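
For a quick comparison between two such files without opening qcachegrind, the header lines of the dump above are enough. The sketch below assumes only the layout visible in the paste (`key: value` header lines, an `events:` line naming the counters, a `summary:` line with their totals, and records that start with `ob=`, `fl=` or `fn=`); the function names are made up for illustration and this is not a full parser for the format.

```python
HEADER_KEYS = ("version", "creator", "pid", "cmd", "positions", "events", "summary")


def parse_callgrind_header(text):
    """Collect known 'key: value' header lines up to the first ob=/fl=/fn= record."""
    header = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(("ob=", "fl=", "fn=")):
            break  # cost records begin here; the header is over
        key, sep, value = line.partition(":")
        if sep and key in HEADER_KEYS:
            header[key] = value.strip()
    return header


def total_events(text):
    """Map each event name (e.g. 'Ir') to its total from the summary line."""
    header = parse_callgrind_header(text)
    names = header["events"].split()
    counts = [int(n) for n in header["summary"].split()]
    return dict(zip(names, counts))


# Trimmed copy of the header pasted above.
SAMPLE = """version: 1
creator: callgrind-3.11.0
positions: line
events: Ir
summary: 2054956762

fn=(100614) 0x0000000000007fc0
0 9
"""

print(total_events(SAMPLE))
```

Running the same function over the NLL and non-NLL profiles would give two instruction counts (`Ir`) whose ratio is a rough, noise-free counterpart to the wall-clock timings.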

lqd (Jul 25 2018 at 15:38, on Zulip):

versus [pasted image]

lqd (Jul 25 2018 at 15:39, on Zulip):

I'm not sure I'm not wasting your time with this data :/

nikomatsakis (Jul 25 2018 at 15:39, on Zulip):

I remember there being some tool for outputting the data in text format

lqd (Jul 25 2018 at 15:39, on Zulip):

njn would know I'm sure

lqd (Jul 25 2018 at 15:40, on Zulip):

oh the callgrind annotator maybe

nikomatsakis (Jul 25 2018 at 15:41, on Zulip):

yeah I can't remember hmm

nikomatsakis (Jul 25 2018 at 15:41, on Zulip):

I think it was cg_annotate, yes

lqd (Jul 25 2018 at 15:43, on Zulip):

oh it seems to work, I'll convert and upload them
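
For reference, Valgrind ships `callgrind_annotate` for callgrind output, while `cg_annotate` is the cachegrind counterpart. A minimal sketch of scripting the conversion to text, assuming `callgrind_annotate` is on `PATH`; the helper names and file names are hypothetical, and the exact invocation used in this conversation is not recorded.

```python
import shutil
import subprocess


def annotate_command(profile_path, inclusive=False):
    """Build the argv for callgrind_annotate; inclusive=True adds --inclusive=yes."""
    cmd = ["callgrind_annotate"]
    if inclusive:
        cmd.append("--inclusive=yes")  # report costs including callees
    cmd.append(profile_path)
    return cmd


def annotate_to_text(profile_path, out_path):
    """Run callgrind_annotate on a profile and save its text report to out_path."""
    if shutil.which("callgrind_annotate") is None:
        raise RuntimeError("callgrind_annotate not found on PATH")
    result = subprocess.run(
        annotate_command(profile_path),
        capture_output=True,
        text=True,
        check=True,
    )
    with open(out_path, "w") as f:
        f.write(result.stdout)


# Hypothetical file name; callgrind writes callgrind.out.<pid> by default.
print(" ".join(annotate_command("callgrind.out.12345")))
```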

lqd (Jul 25 2018 at 15:50, on Zulip):

here they are

inflate

lqd (Jul 25 2018 at 15:50, on Zulip):

inflate 0.1 is the one on perf.rlo, inflate 0.4.3 the latest release

Last update: Nov 21 2019 at 13:45UTC