Stream: t-compiler/wg-self-profile

Topic: events as exported by measureme


mw (Mar 07 2019 at 16:20, on Zulip):

so, we'd like to discuss how the measureme crate should present events to post-processing

mw (Mar 07 2019 at 16:21, on Zulip):

i.e. the format that analysis operates on

mw (Mar 07 2019 at 16:22, on Zulip):

in this form one should not have to deal with the low-level encoding of the raw format as produced by the compiler

Wesley Wiser (Mar 07 2019 at 16:22, on Zulip):

Seems like there's maybe an efficient, low-level format that's used for serialization and a higher-level "nicer" event format that can be used for analysis

Wesley Wiser (Mar 07 2019 at 16:22, on Zulip):

Yes

mw (Mar 07 2019 at 16:22, on Zulip):

yes, exactly

mw (Mar 07 2019 at 16:22, on Zulip):

I see (at least) two ways of doing this

mw (Mar 07 2019 at 16:23, on Zulip):

1. define a struct Event with the appropriate fields

mw (Mar 07 2019 at 16:23, on Zulip):

2. define a callback interface like: iter_events(f: Fn(event_name, thread_id, time_stamp, ...))

mw (Mar 07 2019 at 16:24, on Zulip):

the struct might be more flexible?
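
A minimal sketch of the two options, for comparison. The `Profile` container, its constructor, and the exact field types are assumptions made for illustration; only `Event`, `iter_events`, and the field names `event_name`, `thread_id`, `time_stamp` come from the discussion.

```rust
/// Option 1: a plain struct carrying the decoded fields.
#[derive(Debug, Clone, PartialEq)]
pub struct Event {
    pub event_name: String,
    pub thread_id: u64,
    pub time_stamp: u64, // assumed unit: nanoseconds
}

/// Hypothetical container standing in for already-decoded profile data.
pub struct Profile {
    events: Vec<Event>,
}

impl Profile {
    pub fn new(events: Vec<Event>) -> Self {
        Profile { events }
    }

    /// Option 1: hand out events as values. Callers can filter, collect,
    /// or store them using the ordinary iterator adapters.
    pub fn events(&self) -> impl Iterator<Item = &Event> + '_ {
        self.events.iter()
    }

    /// Option 2: callback interface. Each field is passed as an argument,
    /// so adding a field later changes the closure signature for every caller.
    pub fn iter_events<F: FnMut(&str, u64, u64)>(&self, mut f: F) {
        for e in &self.events {
            f(&e.event_name, e.thread_id, e.time_stamp);
        }
    }
}
```

With the struct, adding a field later does not break existing callers that only read the old fields, which is one way it could be the more flexible of the two.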

Wesley Wiser (Mar 07 2019 at 16:25, on Zulip):

You could also have a struct Event with the raw fields and serialization data and then an enum Event which is basically what we have now in the compiler

mw (Mar 07 2019 at 16:25, on Zulip):

anyway, for both approaches we have to define the "fields"...

Wesley Wiser (Mar 07 2019 at 16:25, on Zulip):

Then provide a function to get an Iterator<Item=enum Event> over the raw struct data

mw (Mar 07 2019 at 16:26, on Zulip):

I kind of like this: https://github.com/rust-lang/rust/issues/58372#issuecomment-468238097

mw (Mar 07 2019 at 16:26, on Zulip):

except event_id would be a string version

mw (Mar 07 2019 at 16:27, on Zulip):

of the query key + args

Wesley Wiser (Mar 07 2019 at 16:27, on Zulip):

Like actually a str or a string id into the StringTable?

mw (Mar 07 2019 at 16:28, on Zulip):

something that can be converted to a &str

Wesley Wiser (Mar 07 2019 at 16:28, on Zulip):

Oh I see

Wesley Wiser (Mar 07 2019 at 16:28, on Zulip):

Next bullet point :)

mw (Mar 07 2019 at 16:29, on Zulip):

also, maybe other things like event_kind should be a string too?

mw (Mar 07 2019 at 16:29, on Zulip):

in order to be able to add new kinds later?

mw (Mar 07 2019 at 16:30, on Zulip):

I'm not sure if that would be necessary

Wesley Wiser (Mar 07 2019 at 16:30, on Zulip):

I think we may want to have a label field on every event

mw (Mar 07 2019 at 16:31, on Zulip):

which would contain what information?

mw (Mar 07 2019 at 16:31, on Zulip):

the query + key?

Wesley Wiser (Mar 07 2019 at 16:31, on Zulip):

Queries - query name
"generic activity" - some kind of associated data

mw (Mar 07 2019 at 16:31, on Zulip):

just the name or also the key/args?

Wesley Wiser (Mar 07 2019 at 16:32, on Zulip):

I know we can get at the query name from the event data but I think it's important semantically to have this so that the tool doesn't have to have super deep knowledge of rustc to interpret the results

Wesley Wiser (Mar 07 2019 at 16:32, on Zulip):

For the summarization tool, the label is the thing we'd aggregate on

mw (Mar 07 2019 at 16:33, on Zulip):

ok, so there would be the label and a separate field for the arguments?

Wesley Wiser (Mar 07 2019 at 16:33, on Zulip):

Yeah

mw (Mar 07 2019 at 16:33, on Zulip):

ok, sounds good

Wesley Wiser (Mar 07 2019 at 16:34, on Zulip):

struct ProfilerEvent {
  event_kind: EventKind, // u8 - query-provider, query-cache-hit, generic-event, incr-comp-cache-loading ...
  timestamp_kind: u8, // start, stop, instant
  timestamp: u64, // nanoseconds since profiler was created
  label: StringTableId,
  additional_data: [StringTableId]
}

Wesley Wiser (Mar 07 2019 at 16:35, on Zulip):

I guess you were saying event_id is also a StringTableId?

mw (Mar 07 2019 at 16:35, on Zulip):

I think event_id would go away

mw (Mar 07 2019 at 16:35, on Zulip):

because it's replaced by the label

mw (Mar 07 2019 at 16:35, on Zulip):

plus an additional list of arguments

Wesley Wiser (Mar 07 2019 at 16:36, on Zulip):

Ok yes

mw (Mar 07 2019 at 16:36, on Zulip):

measureme would take care of decoding the event_id string into label and args

Wesley Wiser (Mar 07 2019 at 16:37, on Zulip):

Oh ok

mw (Mar 07 2019 at 16:37, on Zulip):

and I don't think it would be a StringTableId because there might not be a single string table entry for the label

mw (Mar 07 2019 at 16:37, on Zulip):

although in practice there probably would be

Wesley Wiser (Mar 07 2019 at 16:37, on Zulip):

So this is the on-disk binary format:

struct ProfilerEvent {
  event_kind: EventKind, // u8 - query-provider, query-cache-hit, generic-event, incr-comp-cache-loading ...
  event_id: u32, // ~ (query-kind, query-key) or (function-name, arguments)
  timestamp_kind: u8, // start, stop, instant
  timestamp: u64 // nanoseconds since profiler was created
}

mw (Mar 07 2019 at 16:38, on Zulip):

yes

Wesley Wiser (Mar 07 2019 at 16:39, on Zulip):

But conceptually, it's something like:

struct Event {
  event_kind: &str,
  label: &str,
  additional_data: &[&str],
  timestamp: Instant,
  timestamp_kind: (Start | Stop | Instant)
}

mw (Mar 07 2019 at 16:39, on Zulip):

yes

Wesley Wiser (Mar 07 2019 at 16:39, on Zulip):

Got it

Wesley Wiser (Mar 07 2019 at 16:39, on Zulip):

I really like that

Wesley Wiser (Mar 07 2019 at 16:40, on Zulip):

The profiling tools don't really have to know anything about rustc then. And eventually they can be generalized for salsa or whatever

mw (Mar 07 2019 at 16:40, on Zulip):

Instead of &str it would probably be something else with a fn as_str() -> Cow<str> method

Wesley Wiser (Mar 07 2019 at 16:40, on Zulip):

Ok

mw (Mar 07 2019 at 16:40, on Zulip):

then we can do something more efficient in some cases maybe

mw (Mar 07 2019 at 16:41, on Zulip):

but that's only a superficial difference
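
One way such a type could look, as a sketch only. `StringRef` and its variants are hypothetical names; the idea is that a label stored as a single string table entry can be borrowed for free, while one spread over several entries has to be assembled into an owned `String`.

```rust
use std::borrow::Cow;

/// Hypothetical string reference: either one stored string,
/// or several pieces that must be concatenated on demand.
pub enum StringRef<'a> {
    Single(&'a str),
    Pieces(Vec<&'a str>),
}

impl<'a> StringRef<'a> {
    /// Borrows when the data is already contiguous, allocates otherwise.
    pub fn as_str(&self) -> Cow<'a, str> {
        match self {
            StringRef::Single(s) => Cow::Borrowed(*s),
            StringRef::Pieces(parts) => Cow::Owned(parts.concat()),
        }
    }
}
```

Callers that only compare or print the result can treat both cases the same, since `Cow<str>` dereferences to `str`.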

Wesley Wiser (Mar 07 2019 at 16:41, on Zulip):

Makes sense

mw (Mar 07 2019 at 16:41, on Zulip):

I wonder if event_kind should be a string or an enum

mw (Mar 07 2019 at 16:42, on Zulip):

string is more flexible

Wesley Wiser (Mar 07 2019 at 16:42, on Zulip):

Yeah, string makes more sense to me

mw (Mar 07 2019 at 16:42, on Zulip):

alright, let's do that then

Wesley Wiser (Mar 07 2019 at 16:42, on Zulip):

Then we don't have to go add a bunch of code to the tools when we add new events

mw (Mar 07 2019 at 16:43, on Zulip):

yes

Wesley Wiser (Mar 07 2019 at 16:43, on Zulip):

When I added the events for parallel query blocking, I thought it might be nice to have some kind of way to indicate semantically that the event is measuring something "bad"

Wesley Wiser (Mar 07 2019 at 16:43, on Zulip):

i.e. "overhead"

Wesley Wiser (Mar 07 2019 at 16:43, on Zulip):

The tools could then do things with that data as appropriate

mw (Mar 07 2019 at 16:44, on Zulip):

yeah, but that's probably a categorization that should be done during postprocessing

Wesley Wiser (Mar 07 2019 at 16:44, on Zulip):

The summarization tool could count up all of the time spent in the overhead events and report that as a piece of data

Wesley Wiser (Mar 07 2019 at 16:44, on Zulip):

Yeah, we could do it that way.
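
A rough sketch of what that postprocessing could look like: aggregate time per label, and decide which event kinds count as overhead in the tool rather than in the recorded data. Everything here is an assumption for illustration: the `Event` shape, the `summarize` function, and the premise that start/stop events are properly nested per thread. Times are inclusive, so a parent event's time contains its children's.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Clone, Copy, PartialEq)]
pub enum TimestampKind {
    Start,
    Stop,
    Instant,
}

pub struct Event {
    pub event_kind: String,
    pub label: String,
    pub timestamp_kind: TimestampKind,
    pub timestamp: u64, // nanoseconds since the profiler was created
    pub thread_id: u64,
}

pub struct Summary {
    pub time_by_label: HashMap<String, u64>,
    pub overhead_time: u64,
}

/// `overhead_kinds` is chosen by the tool, not stored in the events.
pub fn summarize(events: &[Event], overhead_kinds: &HashSet<&str>) -> Summary {
    // One stack of currently open start events per thread.
    let mut open: HashMap<u64, Vec<&Event>> = HashMap::new();
    let mut summary = Summary {
        time_by_label: HashMap::new(),
        overhead_time: 0,
    };
    for event in events {
        match event.timestamp_kind {
            TimestampKind::Start => open.entry(event.thread_id).or_default().push(event),
            TimestampKind::Stop => {
                // Pair the stop with the most recent open start on the same thread.
                if let Some(start) = open.get_mut(&event.thread_id).and_then(|s| s.pop()) {
                    let elapsed = event.timestamp.saturating_sub(start.timestamp);
                    *summary.time_by_label.entry(start.label.clone()).or_insert(0) += elapsed;
                    if overhead_kinds.contains(start.event_kind.as_str()) {
                        summary.overhead_time += elapsed;
                    }
                }
            }
            // Instant events carry no duration.
            TimestampKind::Instant => {}
        }
    }
    summary
}
```

Because the overhead set is an input, the same recorded data can be re-categorized later without changing the compiler.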

mw (Mar 07 2019 at 16:45, on Zulip):

thread_id is another field we'll need

Wesley Wiser (Mar 07 2019 at 16:45, on Zulip):

Yeah

Wesley Wiser (Mar 07 2019 at 16:46, on Zulip):

That's just a u64 or even a u16 or something if we want to save the space

mw (Mar 07 2019 at 16:46, on Zulip):

so, measureme would then provide something like a ProfileData struct

mw (Mar 07 2019 at 16:47, on Zulip):

that has a fn events() -> impl Iterator<Item=Event> method ...

mw (Mar 07 2019 at 16:47, on Zulip):

yeah, thread_id could be an integer

mw (Mar 07 2019 at 16:47, on Zulip):

u64 seems fine
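
Putting the pieces so far together, a sketch of how a `ProfileData` with an `events()` method might look. This is not measureme's actual API: the `RawEvent` layout, the `HashMap`-backed string table, the constructor, and the `"<unknown>"` fallback are all assumptions, and plain `&str` and `u64` stand in for the richer string and timestamp types discussed above.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TimestampKind {
    Start,
    Stop,
    Instant,
}

/// Hypothetical raw record: strings are referenced by id.
pub struct RawEvent {
    pub event_kind: u32,
    pub label: u32,
    pub additional_data: Vec<u32>,
    pub timestamp_kind: TimestampKind,
    pub timestamp: u64, // nanoseconds since the profiler was created
    pub thread_id: u64,
}

/// Decoded view handed to analysis tools.
#[derive(Debug, PartialEq)]
pub struct Event<'a> {
    pub event_kind: &'a str,
    pub label: &'a str,
    pub additional_data: Vec<&'a str>,
    pub timestamp_kind: TimestampKind,
    pub timestamp: u64,
    pub thread_id: u64,
}

pub struct ProfileData {
    strings: HashMap<u32, String>,
    raw_events: Vec<RawEvent>,
}

impl ProfileData {
    pub fn new(strings: HashMap<u32, String>, raw_events: Vec<RawEvent>) -> Self {
        ProfileData { strings, raw_events }
    }

    /// Resolves a string id, falling back to a placeholder for unknown ids.
    fn lookup(&self, id: u32) -> &str {
        self.strings.get(&id).map(String::as_str).unwrap_or("<unknown>")
    }

    /// Lazily decodes raw records so tools never see the low-level encoding.
    pub fn events<'a>(&'a self) -> impl Iterator<Item = Event<'a>> + 'a {
        self.raw_events.iter().map(move |raw| Event {
            event_kind: self.lookup(raw.event_kind),
            label: self.lookup(raw.label),
            additional_data: raw.additional_data.iter().map(|&id| self.lookup(id)).collect(),
            timestamp_kind: raw.timestamp_kind,
            timestamp: raw.timestamp,
            thread_id: raw.thread_id,
        })
    }
}
```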

mw (Mar 07 2019 at 16:48, on Zulip):

ok, I like that so far

Wesley Wiser (Mar 07 2019 at 16:48, on Zulip):

That all seems great to me

Wesley Wiser (Mar 07 2019 at 16:48, on Zulip):

I've got to run in a few minutes

Wesley Wiser (Mar 07 2019 at 16:48, on Zulip):

I think this is plenty for me to get started though :)

mw (Mar 07 2019 at 16:48, on Zulip):

ok, great

Wesley Wiser (Mar 07 2019 at 16:48, on Zulip):

(not to cut you off if you have more you want to talk about)

mw (Mar 07 2019 at 16:49, on Zulip):

hopefully we can actually create the measureme GH repo soon :)

Wesley Wiser (Mar 07 2019 at 16:49, on Zulip):

Yeah

mw (Mar 07 2019 at 16:50, on Zulip):

I think I'm good for now

Wesley Wiser (Mar 07 2019 at 16:50, on Zulip):

It looks like there are still some process discussions?

mw (Mar 07 2019 at 16:50, on Zulip):

yeah

Wesley Wiser (Mar 07 2019 at 16:51, on Zulip):

Ok. Well, have a great three-day weekend! :smile:

mw (Mar 07 2019 at 16:51, on Zulip):

thanks!

mw (Mar 07 2019 at 16:51, on Zulip):

have a great weekend too!

Last update: Nov 15 2019 at 20:05UTC