I did some testing with the latest version of the compiler; for style-servo, the
events file is ~140 megabytes
running summarize on it takes 0.65 seconds on my machine
I wonder if there's any value in compressing the events file?
yeah, seems pretty reasonable
depends on whether you want to store it somewhere, I'd say
at the moment, the plan is not to store it, right?
No, I don't believe we're going to store it beyond the time required to process it
another question would be when to compress it
i.e. we don't want to do that in the rustc process
because we are going to great lengths to keep the overhead low already
Yeah, that's true
I seem to recall reading somewhere about a Linux filesystem that always compresses data using a really simple algorithm (zlib maybe?). The reason being that the CPU cost to compress the data was less than the cost of reading/writing more data off disk
But since we're writing the data async with mmap, it probably doesn't pay off for us
I think ZFS has the option of lz4 compressing data natively
Ah, yes that's probably what I'm thinking of https://blogs.oracle.com/solaris/zfs-compression-a-win-win-v2
But since we don't wait for the writes to happen or read the data back in process, it probably won't pay off for us
we would need to experiment but I'd like to keep it simple if possible
stackcollapse is even faster than
summarize: 0.48 secs
If we did want to keep the files around, we could definitely compress them after the perf run. I've seen pretty significant space reductions just from the native compression tool in macOS
crox is slower but still acceptable: 3.14 secs
yes, compressing the file gives me:
lz4: 41 MB
gzip: 22.4 MB
I think this will likely improve
crox speed https://github.com/rust-lang/measureme/issues/47
xz: 15.4 MB
And cut the output file size by half
yes, I also suspect that most of the time is spent generating the string data
since summarize also looks at all the data, but is much faster
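The size comparison above can be reproduced with something like the following sketch. `style_servo.events` is a placeholder file name, and it assumes `lz4`, `gzip`, and `xz` are installed:

```shell
# Rough reproduction of the compression comparison above.
# "style_servo.events" is a placeholder for the actual events file.
EVENTS=style_servo.events
lz4 -k "$EVENTS"      # fastest, lightest compression (~41 MB above)
gzip -k "$EVENTS"     # middle ground (~22.4 MB above)
xz -k "$EVENTS"       # slowest, best ratio (~15.4 MB above)
ls -lh "$EVENTS"*     # compare the resulting sizes side by side
```

(`-k` keeps the original file around so all three outputs can be compared.)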
one thing I noticed here: having only the PID, but not the crate name, in the file name is annoying
I've been generating events for all crates in the crate graph
Yeah, there's likely some low hanging fruit in optimizing most of the processing tools
I haven't done any profiling of any of them
wow, the crox output is 550 MB
That's probably going to break Chrome lol
At a certain point, the Chrome dev tool just gives up and doesn't display anything.
There's an internal Chrome tool which handles large files better but the UI is a bit clunky
I'd say it is unlikely that we'll generate the Chrome profiling data eagerly for each perf run
given how big the files are
Yeah even relatively small crates like regex generate large files
(I believe chrome://tracing is the internal tool I'm thinking of but I don't have a profiler file handy to test)
The Firefox format looks like it scales much better to me but I haven't had a chance to play around with it
well, not a priority for the MVP anyway, I'd say
are you working on something?
It's been a busy week for me but I think I'll have time over the next few days to work on getting perf.rlo to run
-Z self-profile as we discussed
cool, I'll look into adding a path argument to -Zself-profile
if that doesn't clash with what you're doing
That should be easy to tweak later
Ok, so I've modified rustc-perf to run with
-Z self-profile and dump all of the files in a folder. Executing a full perf run results in ~1400 files totaling 4.4 GB. Note: these are the raw profile files, not processed results files.
@Wesley Wiser We'll probably want to process the data -- if you get the bit of code which correctly finds/loads the selected crate's data into measureme, that'd be a good start
I'm not sure how that should be done though
Did we decide which results we should keep? Should I just keep the first iteration's results for now?
we wanted some form of "aggregation" but that's not really the MVP
I'll just take the first one for now unless there's any objections
sure seems fine -- I imagine it's just a call to
first or something like that at a relatively top-level place
I would try to avoid making that decision in a low-level place if possible, to make it easier to aggregate later (i.e., keep all the data around right up until the end)
but it's not critical that that happens, so don't worry about it if it's hard
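That shape could look roughly like this sketch; `IterationData` and the stubbed collection step are hypothetical stand-ins for rustc-perf's actual types, not its real API:

```rust
// Sketch: defer the "which iteration?" decision to the top level.
// `IterationData` and `collect_all_iterations` are hypothetical stand-ins.
#[derive(Debug, Clone, PartialEq)]
struct IterationData {
    iteration: usize,
    self_profile_dir: String,
}

fn collect_all_iterations() -> Vec<IterationData> {
    // In reality this would run the benchmark N times; stubbed here.
    (0..3)
        .map(|i| IterationData {
            iteration: i,
            self_profile_dir: format!("profiles/iter-{}", i),
        })
        .collect()
}

fn main() {
    // Keep every iteration's data around...
    let all = collect_all_iterations();
    // ...and only pick the first one at the very end, so switching to an
    // aggregate (mean, min, etc.) later is a one-line change here.
    let selected = all.first().expect("at least one iteration");
    println!("using {}", selected.self_profile_dir);
}
```

Because the low-level collection code never discards data, replacing `all.first()` with an aggregation is a local change.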
Can I assume that the measureme tools are going to be installed in a specific folder? Or available on the path?
Basically, how do I know where to look to call the processing tools?
@simulacrum Do you have any suggestions about how to call the tools?
hm, so there was some discussion about inclusion into the sysroot, but that seems far off
do we expect to need multiple versions or will latest version pretty much work?
I think at this point, we should just expect the latest version to work
Of course, how we get the latest version is another question
But a recent-ish version should be fine
let's just assume it's installed somewhere in PATH
(Same as we do for perf, valgrind, other tools)
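A perf-run script could then do a pre-flight check along these lines; the tool names (`summarize`, `crox`, `stackcollapse`) are taken from the discussion above and should be treated as assumptions:

```shell
# Hypothetical pre-flight check: warn early if the measureme tools are
# not on PATH, the same way a script might check for perf or valgrind.
for tool in summarize crox stackcollapse; do
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "warning: '$tool' not found on PATH" >&2
    fi
done
```

`command -v` is the POSIX way to resolve a name on `PATH`, so the check works in plain `sh` on the collection server.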
I guess there's a server somewhere you ssh into and install stuff as needed?
and then add to the benchmarking readme (IIRC, collector/README, or collector/benchmarks/README) some text about installing measureme
Yeah, we have two -- collection and site -- I presume the tools are needed on collection?
yeah, I'll be able to install them (can even stick a cargo install -f measureme in the script so that it reinstalls latest version every runthrough)