Stream: t-compiler

Topic: crazy idea: locally track `rustc` compile-times per-commit


pnkfelix (Oct 30 2019 at 11:54, on Zulip):

reading over #65927, I was idly wondering whether we could make it easier for developers to catch these regressions locally. Namely, without using perf or any other infrastructure.

simulacrum (Oct 30 2019 at 11:55, on Zulip):

@eddyb and I talked a bit about that -- in theory, this should've been caught by incremental tests, and I believe @mw was surprised it wasn't

pnkfelix (Oct 30 2019 at 11:55, on Zulip):

and that led me to this potentially bonkers idea: Would it make any sense to track the historical completed build times of stage2 rustc for each commit, and then after each build that is a child of the master commit, compare the median values in the history, and print out the time delta...?
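
A minimal sketch of that comparison, assuming a hypothetical history that maps each commit id to a list of recorded stage2 build times in seconds. The function name, the history layout, and the numbers are illustrative only; nothing like this exists in x.py.

```python
from statistics import median


def build_time_delta(history, parent_commit, new_time):
    """Return new_time minus the median recorded time for parent_commit.

    `history` is a hypothetical dict: commit id -> list of build times (seconds).
    Returns None when no earlier measurement exists for that commit.
    """
    past = history.get(parent_commit)
    if not past:
        return None
    return new_time - median(past)


# Made-up numbers, purely for illustration.
history = {"abc123": [610.0, 598.0, 640.0]}
delta = build_time_delta(history, "abc123", 700.0)
if delta is not None:
    print(f"stage2 build time delta vs. parent median: {delta:+.1f}s")
```

Using the median rather than the mean keeps a single unusually slow build (say, on a busy machine) from skewing the baseline.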

simulacrum (Oct 30 2019 at 11:55, on Zulip):

and it's definitely possible to run perf locally, at least on Linux-based systems -- but that's sort of a non-answer

pnkfelix (Oct 30 2019 at 11:56, on Zulip):

(by perf I meant perf.rlo)

pnkfelix (Oct 30 2019 at 11:56, on Zulip):

eddyb and I talked a bit about that -- in theory, this should've been caught by incremental tests, and I believe mw was surprised it wasn't

the incremental tests check compile times?

simulacrum (Oct 30 2019 at 11:57, on Zulip):

No, but in this case the regression was "we stopped skipping codegen"

pnkfelix (Oct 30 2019 at 11:57, on Zulip):

heh

simulacrum (Oct 30 2019 at 11:57, on Zulip):

which should've been caught

pnkfelix (Oct 30 2019 at 11:57, on Zulip):

i see

simulacrum (Oct 30 2019 at 11:57, on Zulip):

(but we apparently didn't have a test)

pnkfelix (Oct 30 2019 at 11:57, on Zulip):

still, the more general idea may have some value

pnkfelix (Oct 30 2019 at 11:57, on Zulip):

if we can implement it semi-robustly...

simulacrum (Oct 30 2019 at 11:58, on Zulip):

I think part of the problem is that we have really high variance on CI, and locally there's maybe less variance but there's nothing to compare against reliably

simulacrum (Oct 30 2019 at 11:58, on Zulip):

(and in any case, it wouldn't help with this AFAIK)

pnkfelix (Oct 30 2019 at 11:58, on Zulip):

the thing to compare it against is the stored history, I'd say

simulacrum (Oct 30 2019 at 11:58, on Zulip):

since you're almost never recompiling the compiler in a "perfect" incremental scenario

pnkfelix (Oct 30 2019 at 11:58, on Zulip):

Oh I suppose you're right: incremental compiles mean that the times are useless

pnkfelix (Oct 30 2019 at 11:59, on Zulip):

okay so comparing the stage2 rustc times is no good

pnkfelix (Oct 30 2019 at 11:59, on Zulip):

an alternative might be to make a compile-time benchmark suite part of x.py test, and have that store local history in the build directory

pnkfelix (Oct 30 2019 at 12:00, on Zulip):

where the benchmark suite would presumably be some subset of perf.rlo
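
A rough sketch of what "store local history in the build directory" could look like, assuming a hypothetical JSON-lines file and a benchmark represented as a plain Python callable. The file layout and the names `record_benchmark` and `load_history` are assumptions for illustration, not anything x.py provides.

```python
import json
import tempfile
import time
from pathlib import Path


def record_benchmark(history_file, name, run):
    """Time run() once and append the wall-clock result to a JSON-lines file."""
    start = time.perf_counter()
    run()
    elapsed = time.perf_counter() - start
    history_file.parent.mkdir(parents=True, exist_ok=True)
    with history_file.open("a") as f:
        f.write(json.dumps({"benchmark": name, "seconds": elapsed}) + "\n")
    return elapsed


def load_history(history_file):
    """Return all recorded entries, oldest first; empty if nothing was recorded."""
    if not history_file.exists():
        return []
    lines = history_file.read_text().splitlines()
    return [json.loads(line) for line in lines if line]


# Demo in a throwaway directory standing in for the build directory.
with tempfile.TemporaryDirectory() as tmp:
    history_file = Path(tmp) / "build" / "perf-history.jsonl"
    record_benchmark(history_file, "dummy-benchmark", lambda: sum(range(100_000)))
    print(len(load_history(history_file)), "entry recorded")
```

An append-only file keeps every run, so later queries can compute medians or spot trends without the recording step having to decide up front what matters.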

centril (Oct 30 2019 at 12:00, on Zulip):

@pnkfelix are you expecting devs to use stage2?

simulacrum (Oct 30 2019 at 12:00, on Zulip):

sure, we can do something like that

pnkfelix (Oct 30 2019 at 12:00, on Zulip):

this is not quite as nice as tracking the build time for rustc, since that's a build artifact that people are always going to need ...

pnkfelix (Oct 30 2019 at 12:01, on Zulip):

@centril this was one of the flaws in my plan

pnkfelix (Oct 30 2019 at 12:01, on Zulip):

@centril but a test benchmark suite would not need to rely on a stage2 rustc

centril (Oct 30 2019 at 12:01, on Zulip):

Yes, I think you'll have pitchforks at the gates if you make everyone use stage2 ;)

simulacrum (Oct 30 2019 at 12:01, on Zulip):

I think it could work in theory, though we've not had super reliable measurements historically for e.g. wall time

simulacrum (Oct 30 2019 at 12:02, on Zulip):

and instruction counts I'm not sure how to record on windows/macOS

pnkfelix (Oct 30 2019 at 12:02, on Zulip):

maybe we wouldn't report anything at all unless the wall-clock time was truly absurd

pnkfelix (Oct 30 2019 at 12:02, on Zulip):

i.e. a 10x regression or something
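
As a sketch of that "only report when truly absurd" rule, assuming a list of earlier wall-clock times for the same benchmark and a hypothetical `should_report` helper with a configurable ratio threshold:

```python
from statistics import median


def should_report(past_times, new_time, threshold=10.0):
    """Return True only when new_time is at least `threshold` times the median of past_times."""
    if not past_times:
        return False  # nothing to compare against
    baseline = median(past_times)
    return baseline > 0 and new_time / baseline >= threshold


# Made-up timings in seconds.
past = [1.0, 1.2, 0.9]
print(should_report(past, 3.0))   # 3x slower: stays silent
print(should_report(past, 12.0))  # 12x slower: reported
```

A threshold this high filters out wall-clock noise, at the cost of staying silent for anything smaller than the chosen ratio.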

simulacrum (Oct 30 2019 at 12:02, on Zulip):

we could, sure -- I personally think it would not be helpful, i.e., in practice doesn't catch enough

centril (Oct 30 2019 at 12:03, on Zulip):

Personally I'm happy that there's a central perf.rl.o 'cause I'm one of those egghead type-theory types that knows little about perf and wouldn't be able to handle local perf testing in a reliable way

simulacrum (Oct 30 2019 at 12:04, on Zulip):

basically a 10x regression would time out CI reliably

pnkfelix (Oct 30 2019 at 12:04, on Zulip):

we could, sure -- I personally think it would not be helpful, i.e., in practice doesn't catch enough

you mean because a 10x regression, while filtering noise, is too high a threshold in general? E.g. for #65927, only half of the benchmarks regressed at all, and of those, the median regression was like 3x or so?

pnkfelix (Oct 30 2019 at 12:05, on Zulip):

maybe part of the problem here is that I'm coupling two distinct things: Gathering the data itself, and reporting it by default during build, and/or gating on it

pnkfelix (Oct 30 2019 at 12:05, on Zulip):

i.e. it might be interesting to have the information captured locally and some easy way to query it

simulacrum (Oct 30 2019 at 12:09, on Zulip):

we can definitely start capturing the information (e.g., run cargo with -Ztimings and save that off) -- I suspect most people wouldn't really find this helpful, but we can do it

pnkfelix (Oct 30 2019 at 12:12, on Zulip):

I think the trickiest thing, in terms of my ideal setup, would be making sure the data-sets are properly separated. In particular, it's not enough to treat the commit-id as a unique key to identify the data-set; I think you'd probably want commit-id + git diff output, or something.

pnkfelix (Oct 30 2019 at 12:12, on Zulip):

that, or only do the tracking if the git diff is empty

pnkfelix (Oct 30 2019 at 12:13, on Zulip):

(which is probably the best way to go. At least for me, the point where I commit is usually the point where I'm saying "okay I finally got this into a plausible and buildable state.")

simulacrum (Oct 30 2019 at 12:18, on Zulip):

yes, I think that'd be fine, some similar heuristic (maybe enough, then, to just save by commit id and a bit of "clean git diff")

simulacrum (Oct 30 2019 at 12:19, on Zulip):

which would overwrite by commit id so if you did end up building in clean mode then you'd be fine
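
One way the "save by commit id, only when clean" heuristic could be sketched, assuming the history is a simple dict keyed by commit id. The helper names are hypothetical; the only real commands used are `git status --porcelain` and `git rev-parse HEAD`.

```python
import subprocess


def current_commit_if_clean(repo="."):
    """Return HEAD's commit id, or None if the working tree is not clean.

    Note: `git status --porcelain` also lists untracked files, so this is
    slightly stricter than checking for an empty `git diff`.
    """
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo, capture_output=True, text=True, check=True,
    ).stdout
    if status.strip():
        return None
    return subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo, capture_output=True, text=True, check=True,
    ).stdout.strip()


def save_timing(history, commit_id, seconds):
    """Store `seconds` under commit_id, overwriting any earlier entry.

    Does nothing (and returns False) when commit_id is None, i.e. a dirty tree.
    """
    if commit_id is None:
        return False
    history[commit_id] = seconds
    return True


# Illustration with literal values instead of a real repository.
history = {}
save_timing(history, "abc123", 612.0)   # first clean build of this commit
save_timing(history, None, 950.0)       # dirty tree: ignored
save_timing(history, "abc123", 605.0)   # later clean rebuild overwrites
print(history)
```

In real use the commit id would come from `current_commit_if_clean()`, so measurements taken with uncommitted changes never reach the history.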

Last update: Nov 22 2019 at 05:30 UTC