Stream: t-compiler/wg-self-profile

Topic: query keys ...


mw (Oct 23 2019 at 15:10, on Zulip):

I've been looking into implementing query key recording a little more concretely just now and it seems that my initial plan does not quite work :grumpy:

mw (Oct 23 2019 at 15:11, on Zulip):

I planned to use the DepNodeIndex corresponding to each query invocation as the StringId for which we'd then allocate string representations of the keys in bulk

mw (Oct 23 2019 at 15:12, on Zulip):

this is nice because it allows to also have query keys on dep-tracking events (once we add them) without additional overhead

mw (Oct 23 2019 at 15:13, on Zulip):

BUT: it turns out that the DepNodeIndex for a given query invocation is assigned only after the query provider has finished

mw (Oct 23 2019 at 15:13, on Zulip):

while we obviously want to start recording before invoking the query provider

mw (Oct 23 2019 at 15:14, on Zulip):

so we don't know which StringId to stick into the start event

Wesley Wiser (Oct 23 2019 at 15:14, on Zulip):

But we'd have the data when we go to record the end event?

mw (Oct 23 2019 at 15:15, on Zulip):

yes

Wesley Wiser (Oct 23 2019 at 15:15, on Zulip):

I mean we already depend heavily on the start/stop events being balanced...

mw (Oct 23 2019 at 15:16, on Zulip):

yes, it has been suggested in the past to have only one (start, end) event

mw (Oct 23 2019 at 15:16, on Zulip):

instead of two events

mw (Oct 23 2019 at 15:16, on Zulip):

that would solve the problem and also use less space

mw (Oct 23 2019 at 15:17, on Zulip):

the only problem I see with that is event nesting

mw (Oct 23 2019 at 15:17, on Zulip):

i.e. we want to be able to reconstruct "event stacks"

mw (Oct 23 2019 at 15:18, on Zulip):

and solely relying on time-stamps might not quite work (right?)

Wesley Wiser (Oct 23 2019 at 15:18, on Zulip):

I think it should because libstd has been patched to force monotonic clocks

mw (Oct 23 2019 at 15:19, on Zulip):

it has?

Wesley Wiser (Oct 23 2019 at 15:19, on Zulip):

Yeah

mw (Oct 23 2019 at 15:19, on Zulip):

it might just work then :)

Wesley Wiser (Oct 23 2019 at 15:19, on Zulip):

Pretty sure

Wesley Wiser (Oct 23 2019 at 15:19, on Zulip):

/me looking

mw (Oct 23 2019 at 15:19, on Zulip):

I remember the issue but not the outcome

Wesley Wiser (Oct 23 2019 at 15:20, on Zulip):

https://github.com/rust-lang/rust/pull/56988

mw (Oct 23 2019 at 15:21, on Zulip):

(wow 2018... time flies)

Wesley Wiser (Oct 23 2019 at 15:23, on Zulip):

well only if you're x86_64-linux, otherwise it just jumps forward sporadically :)

mw (Oct 23 2019 at 15:24, on Zulip):

:joy:

mw (Oct 23 2019 at 15:25, on Zulip):

alright, so I guess I'll take a stab at implementing the single-event approach?

Wesley Wiser (Oct 23 2019 at 15:25, on Zulip):

I guess you're going to need to mess with the RawEvent layout?

mw (Oct 23 2019 at 15:25, on Zulip):

it will mean updating quite a few of the tools

mw (Oct 23 2019 at 15:26, on Zulip):

I'm not sure if we can retain the iterator interface

Wesley Wiser (Oct 23 2019 at 15:26, on Zulip):

I think most of them actually use iter_matching_events() so it probably won't be that bad

mw (Oct 23 2019 at 15:26, on Zulip):

that's true

mw (Oct 23 2019 at 15:26, on Zulip):

I mean, I'm sure we could retain the interface

mw (Oct 23 2019 at 15:26, on Zulip):

I'm not sure we want to

Wesley Wiser (Oct 23 2019 at 15:27, on Zulip):

I think it's mostly going to be the same as what we already have if I understand what you want to do correctly

mw (Oct 23 2019 at 15:28, on Zulip):

well, Event has distinct start and end events

Wesley Wiser (Oct 23 2019 at 15:28, on Zulip):

Well I guess I'm thinking primary of the stuff that already uses MatchingEvent

mw (Oct 23 2019 at 15:28, on Zulip):

while the new RawEvent would not have those

Wesley Wiser (Oct 23 2019 at 15:28, on Zulip):

Yeah

mw (Oct 23 2019 at 15:29, on Zulip):

we could emulate the old behavior

mw (Oct 23 2019 at 15:29, on Zulip):

but that would mean reading the entire event stream into memory

Wesley Wiser (Oct 23 2019 at 15:29, on Zulip):

I think the new design might actually simplify summarize

Wesley Wiser (Oct 23 2019 at 15:29, on Zulip):

We do that already

mw (Oct 23 2019 at 15:29, on Zulip):

MatchingEvent would not be affect much, probably

Wesley Wiser (Oct 23 2019 at 15:29, on Zulip):

(I'm pretty sure)

mw (Oct 23 2019 at 15:30, on Zulip):

so, yeah

mw (Oct 23 2019 at 15:30, on Zulip):

I'll do some experimentation and show the result to you

Wesley Wiser (Oct 23 2019 at 15:30, on Zulip):

Ok

mw (Oct 23 2019 at 15:30, on Zulip):

and then we can discuss further

mw (Oct 23 2019 at 15:30, on Zulip):

it might be simpler than I thought

Wesley Wiser (Oct 23 2019 at 15:30, on Zulip):

I think summarize is really the only thing that's going to be invasive to change

mw (Oct 23 2019 at 15:31, on Zulip):

yeah

Wesley Wiser (Oct 23 2019 at 15:31, on Zulip):

And I think this may actually be good. The new approach lends pretty naturally to breaking the problem apart into a few discrete steps instead of mixing everything together in a massive loop

mw (Oct 23 2019 at 15:31, on Zulip):

yeah

mw (Oct 23 2019 at 15:32, on Zulip):

I guess flamegraphs will be affected too?

mw (Oct 23 2019 at 15:32, on Zulip):

they need the nesting information

Wesley Wiser (Oct 23 2019 at 15:32, on Zulip):

Yeah

Wesley Wiser (Oct 23 2019 at 15:33, on Zulip):

eddyb had a few thoughts they mentioned to me the other day about RawEvent, let me go find those in discord

Wesley Wiser (Oct 23 2019 at 15:34, on Zulip):

why are these two separate things? https://github.com/rust-lang/measureme/blob/d6440835f30cdc6cfbe8953711bcd7e703e98c79/measureme/src/raw_event.rs#L39-L40
if you merged them somehow and made thread_id just u32 then the whole thing would be 16 bytes
which you can divide a mmap page into
so now you get to write the entire record with one instruction on x86_64

Wesley Wiser (Oct 23 2019 at 15:34, on Zulip):

AFAIK, we don't actually use event_kind at all

Wesley Wiser (Oct 23 2019 at 15:34, on Zulip):

So we could drop that field "for free"

mw (Oct 23 2019 at 15:36, on Zulip):

yeah, maybe

mw (Oct 23 2019 at 15:37, on Zulip):

I'll try to reconstruct our reasoning for having two distinct fields

Wesley Wiser (Oct 23 2019 at 15:37, on Zulip):

originally we had the 'high-level' summary thing

Wesley Wiser (Oct 23 2019 at 15:37, on Zulip):

which just lumped everything together under really broad categories

Wesley Wiser (Oct 23 2019 at 15:38, on Zulip):

but that wasn't detailed enough so we made it per query and left the categories alone

Wesley Wiser (Oct 23 2019 at 15:38, on Zulip):

so you did -Zself-profile and got the high-level thing and if you added a -v on to it, you got the queries

Wesley Wiser (Oct 23 2019 at 15:39, on Zulip):

but everybody that wanted to use the tool ended up doing that anyway so it just became the default

Wesley Wiser (Oct 23 2019 at 15:39, on Zulip):

and we never actually implemented anything in measureme to handle event_kind

mw (Oct 23 2019 at 15:39, on Zulip):

right now we are using the event_kind to distinguish between generic activities vs queries etc

Wesley Wiser (Oct 23 2019 at 15:40, on Zulip):

ah, you're right

mw (Oct 23 2019 at 15:41, on Zulip):

one could lump that into the string contents of the event-id

mw (Oct 23 2019 at 15:42, on Zulip):

for query keys I already planned to make event-id conform to some simple grammar

Wesley Wiser (Oct 23 2019 at 15:42, on Zulip):

Actually, it's just query/activity vs incremental load time vs blocked time

mw (Oct 23 2019 at 15:42, on Zulip):

so that query-name and query-key can reliably be parsed out

Wesley Wiser (Oct 23 2019 at 15:43, on Zulip):

This is the only place I found we actually use it https://github.com/rust-lang/measureme/blob/34792560521a30030b2949b2c9dce881dea77852/summarize/src/analysis.rs#L84

mw (Oct 23 2019 at 15:43, on Zulip):

it is useful information, I'd say

Wesley Wiser (Oct 23 2019 at 15:43, on Zulip):

Yeah

Wesley Wiser (Oct 23 2019 at 15:44, on Zulip):

for query keys I already planned to make event-id conform to some simple grammar

This is a really good idea!

mw (Oct 23 2019 at 15:44, on Zulip):

but it could all be in one string, that's also true

mw (Oct 23 2019 at 15:44, on Zulip):

so if we already make the event-id complex in order to support query keys, we might as well add the event-kind to that too

Wesley Wiser (Oct 23 2019 at 15:45, on Zulip):

We're not currently using this stuff IIRC

https://github.com/rust-lang/measureme/blob/c8c27a9b17db7c863d4d59ecdbc26f3df20e2787/measureme/src/stringtable.rs#L7

mw (Oct 23 2019 at 15:45, on Zulip):

yes, that will be sorely needed soon though :)

mw (Oct 23 2019 at 15:46, on Zulip):

otherwise we'd replicate the query-names a billion times

Wesley Wiser (Oct 23 2019 at 15:46, on Zulip):

You could use it for the query key grammar as well

mw (Oct 23 2019 at 15:46, on Zulip):

yes, that's what I planned to use it for

Wesley Wiser (Oct 23 2019 at 15:48, on Zulip):

If there's anything I can do to help, feel free to ping me :)

mw (Oct 23 2019 at 15:48, on Zulip):

will do :)

mw (Oct 24 2019 at 11:10, on Zulip):

so, I started looking into this and it turns out that this works best if we do a number of refactorings (that we wanted to do anyway) first

mw (Oct 24 2019 at 11:10, on Zulip):
  1. move the RAII based API into measureme itself (i.e. the TimingGuard stuff)
mw (Oct 24 2019 at 11:11, on Zulip):

this just works better with only having a single event at the end because you need to remember the starting time somewhere

mw (Oct 24 2019 at 11:12, on Zulip):

well, it looks like this is the main thing that should be done :)

mw (Oct 24 2019 at 11:14, on Zulip):

other nice-to-have things that I'll also do:

mw (Oct 24 2019 at 12:36, on Zulip):

for some context: once https://github.com/rust-lang/measureme/pull/70 lands, I plan to immediately follow up with a rustc PR that uses 0.4.0 because it's straightforward and makes the code a bit cleaner in rustc. That should not require any changes to perf.rlo.

mw (Oct 24 2019 at 12:37, on Zulip):

the rest of the refactoring might take a little longer (a few days maybe) so it would be nice to land the improvements in the meantime.

mw (Oct 24 2019 at 12:37, on Zulip):

Alternatively we could not land anything just yet and wait for the refactoring to be completed. That would also be fine with me.

simulacrum (Oct 24 2019 at 12:39, on Zulip):

I think we should land things as we go

mw (Oct 31 2019 at 12:40, on Zulip):

I'm in the middle after refactoring events from start+stop to single interval events, and it's going fine so far.

mw (Oct 31 2019 at 12:40, on Zulip):

I'm trying update summarize right now. Do we have test cases for that somewhere?

mw (Oct 31 2019 at 12:41, on Zulip):

Or did we have plans on how to add test cases for it?

mw (Oct 31 2019 at 12:41, on Zulip):

^ @Wesley Wiser

mw (Oct 31 2019 at 12:42, on Zulip):

I seem to remember chatting about a text format that could be converted into a raw event stream, that would then be used for having tests with predictable time stamps

mw (Oct 31 2019 at 12:44, on Zulip):

A "builder" type that takes care of creating raw event streams for testing might be event better.

mw (Oct 31 2019 at 12:44, on Zulip):

because then we can have unit tests more easily and don't need to write a parser

mw (Oct 31 2019 at 12:49, on Zulip):

I think I need to add test cases before doing the refactoring, at least for summarize...

mw (Oct 31 2019 at 12:54, on Zulip):

(in case anyone is interested, my changes so far can be found at https://github.com/michaelwoerister/measureme/tree/interval-events-refactoring)

Wesley Wiser (Oct 31 2019 at 13:11, on Zulip):

I seem to remember chatting about a text format that could be converted into a raw event stream, that would then be used for having tests with predictable time stamps

I vaguely remember talking about this :slight_smile:

A "builder" type that takes care of creating raw event streams for testing might be event better.

This sounds really good! I like this idea a lot

mw (Oct 31 2019 at 13:18, on Zulip):

I'll probably look into that tomorrow

Last update: Nov 17 2019 at 07:45UTC