Also, @mw and I were talking earlier about the "perf integration". Do we have any idea of the "instruction count overhead" that enabling this feature brings? Memory overhead, I understand, is substantial.
mw had a PR where we measured it
Ah, hmm, not that bad.
My opinion here is that we should probably do it :) at least, it seems so super useful to be able to get this data readily
but I guess it depends a bit on how consistent the overheads are etc
(I'm doing some local perf runs and already finding the results very interesting)
also, reasonably consistent run to run
some small-ish variation
That's good to hear! I don't think we currently have any data about how much variability it introduces.