Stream: t-compiler

Topic: categorizing crater runs


nikomatsakis (May 18 2020 at 17:50, on Zulip):

So @Pietro Albini was talking to me about what might be useful to "auto-categorize" crater runs and we thought we'd move the conversation here.

nikomatsakis (May 18 2020 at 17:50, on Zulip):

I'm still a tiny bit unclear on what the expectation here is -- are we going to generate a markdown file? A web page?

nikomatsakis (May 18 2020 at 17:50, on Zulip):

An interactive tool? :)

nikomatsakis (May 18 2020 at 17:51, on Zulip):

A web page would give us more room to produce different "views" onto the same data

Pietro Albini (May 18 2020 at 17:51, on Zulip):

for context, a coursemate of mine is working on a thesis on improving the crater reports

nikomatsakis (May 18 2020 at 17:51, on Zulip):

e.g., I'd love to have things categorized both by the dependency tree (so that you can see how many errors you get in each crate)

nikomatsakis (May 18 2020 at 17:51, on Zulip):

but also by errors

nikomatsakis (May 18 2020 at 17:51, on Zulip):

i.e., "this error message broke crates X, Y, and Z"

Pietro Albini (May 18 2020 at 17:51, on Zulip):

the current idea is to improve the current report HTML

nikomatsakis (May 18 2020 at 17:51, on Zulip):

I'd probably also like some kind of markdown export that gives me a list of crates where I can edit and leave notes inline

nikomatsakis (May 18 2020 at 17:52, on Zulip):

Pietro Albini said:

the current idea is to improve the current report HTML

I think that's the right starting point

nikomatsakis (May 18 2020 at 17:52, on Zulip):

cc @lqd and @Matthew Jasper, at least? I feel like these are two people I remember going through big crater runs recently, though I know many have done so

Pietro Albini (May 18 2020 at 17:52, on Zulip):

@simulacrum as well

Pietro Albini (May 18 2020 at 17:52, on Zulip):

(I'll try to get the person working on the thesis here btw)

nikomatsakis (May 18 2020 at 17:55, on Zulip):

I'm wondering if we can make...idk...a hackmd? something to sketch out what a sample report might look like

simulacrum (May 18 2020 at 17:55, on Zulip):

Yeah, so I think there's a few avenues to explore:

nikomatsakis (May 18 2020 at 17:55, on Zulip):

it's worth pointing out that for at least some crater runs

nikomatsakis (May 18 2020 at 17:55, on Zulip):

i.e., if we use a deny-by-default lint

nikomatsakis (May 18 2020 at 17:55, on Zulip):

then we are able to categorize dependencies independently from one another

nikomatsakis (May 18 2020 at 17:56, on Zulip):

so we don't, I think, want to just assume that if a root dependency is broken, then its children can be ignored (but maybe the user can tell us that?)

simulacrum (May 18 2020 at 17:56, on Zulip):

yeah, I'd imagine that we want to get user input on that

simulacrum (May 18 2020 at 17:57, on Zulip):

in the ideal world your tooling just automagically does that

simulacrum (May 18 2020 at 17:57, on Zulip):

e.g. my tool tries to, because it groups by crate name

simulacrum (May 18 2020 at 17:57, on Zulip):

specifically failed crate name, not the official one we were trying to compile

Pietro Albini (May 18 2020 at 18:03, on Zulip):

@Giacomo Pasini is the one writing the thesis :wave:

Giacomo Pasini (May 18 2020 at 18:03, on Zulip):

Hello everyone :wave:

nikomatsakis (May 18 2020 at 18:04, on Zulip):

Hi =)

nikomatsakis (May 18 2020 at 18:04, on Zulip):

Super excited to hear you would be doing work on this!

nikomatsakis (May 18 2020 at 18:04, on Zulip):

simulacrum said:

specifically failed crate name, not the official one we were trying to compile

yes, this would be ideal

simulacrum (May 18 2020 at 18:06, on Zulip):

IIRC there's a way to get cargo to output both json and text, which would probably be good to start with -- currently parsing the logs can be pretty annoying

nikomatsakis (May 18 2020 at 18:06, on Zulip):

yeah

nikomatsakis (May 18 2020 at 18:06, on Zulip):

rust-analyzer uses this

nikomatsakis (May 18 2020 at 18:06, on Zulip):

well, maybe it uses only json

nikomatsakis (May 18 2020 at 18:06, on Zulip):

but in any case

Pietro Albini (May 18 2020 at 18:07, on Zulip):

the json includes the rendered text in one of its fields, so we can just use that

simulacrum (May 18 2020 at 18:07, on Zulip):

(You can render the text output faithfully from JSON, or at least you're supposed to be able to)

Giacomo Pasini (May 18 2020 at 18:07, on Zulip):

Pietro Albini said:

the json includes the rendered text in one of its fields, so we can just use that

Yes, that's what I'm doing right now

Giacomo Pasini (May 18 2020 at 18:08, on Zulip):

I tell cargo to produce json output and then I put the rendered message in the logs
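
A minimal sketch of that step, assuming the line-delimited JSON that `cargo build --message-format=json` prints on stdout. The function name `extract_diagnostics` and the sample data are hypothetical and are not taken from crater:

```python
import json

def extract_diagnostics(cargo_stdout):
    """Yield (package_id, error_code, rendered_text) for each compiler message."""
    for line in cargo_stdout.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip anything that is not a JSON object
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate stray non-JSON output mixed into the stream
        if event.get("reason") != "compiler-message":
            continue  # e.g. "compiler-artifact" or "build-finished"
        message = event.get("message") or {}
        code = (message.get("code") or {}).get("code")  # e.g. "E0308", or None
        yield event.get("package_id"), code, message.get("rendered") or ""

# Hypothetical sample input, shaped like cargo's JSON messages.
SAMPLE = "\n".join([
    json.dumps({
        "reason": "compiler-message",
        "package_id": "foo 0.1.0",
        "message": {
            "level": "error",
            "code": {"code": "E0308"},
            "rendered": "error[E0308]: mismatched types\n",
        },
    }),
    json.dumps({"reason": "build-finished", "success": False}),
])

for package_id, code, rendered in extract_diagnostics(SAMPLE):
    print(package_id, code)
    print(rendered, end="")
```

The `rendered` field can go into the human-readable log while the structured fields (`package_id`, error code) stay available for grouping.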

Giacomo Pasini (May 18 2020 at 18:11, on Zulip):

simulacrum said:

Yeah, so I think there's a few avenues to explore:

also, at the moment I should be able to detect build error codes and dependency errors when testing crates

simulacrum (May 18 2020 at 18:11, on Zulip):

those both sound like amazing improvements already

Giacomo Pasini (May 18 2020 at 18:12, on Zulip):

Not test errors though, I think that it might be quite difficult, they all seem to be pretty different from one another

simulacrum (May 18 2020 at 18:12, on Zulip):

We probably want either two logs or perhaps interleaved JSON and rendered logs? Ideally I think the processing step would be built-in to crater but wouldn't need to happen during the run, since that means that as we try things out we need to wait a few days each time for crater to run

simulacrum (May 18 2020 at 18:13, on Zulip):

I would leave test errors out of scope for now, I don't think there's much you can do in an automated fashion with those

Giacomo Pasini (May 18 2020 at 18:17, on Zulip):

simulacrum said:

We probably want either two logs or perhaps interleaved JSON and rendered logs? Ideally I think the processing step would be built-in to crater but wouldn't need to happen during the run, since that means that as we try things out we need to wait a few days each time for crater to run

What is the purpose of having JSON output for the users?

simulacrum (May 18 2020 at 18:17, on Zulip):

well I'm more so saying for you (or whoever maintains the tooling you build)

simulacrum (May 18 2020 at 18:18, on Zulip):

it is much easier to work on this sort of thing if you can test stuff out on real data in minutes rather than days

Pietro Albini (May 18 2020 at 18:18, on Zulip):

well I expect all the work to have tests in minicrater

simulacrum (May 18 2020 at 18:18, on Zulip):

and since the tooling really wants JSON, not parsing rendered diagnostics, it's nice to have the JSON on hand

Pietro Albini (May 18 2020 at 18:18, on Zulip):

and those tests are usually relatively quick

simulacrum (May 18 2020 at 18:18, on Zulip):

@Pietro Albini sure I mean that when you notice a bug in some crater run and submit a fix in the ideal world we could also go and "regenerate" the rendered output for a crater run

Pietro Albini (May 18 2020 at 18:19, on Zulip):

ooh

simulacrum (May 18 2020 at 18:19, on Zulip):

(potentially, we don't even store the ASCII and render it on demand, but that might be expensive, not sure)

Pietro Albini (May 18 2020 at 18:19, on Zulip):

that sounds difficult given the current architecture

simulacrum (May 18 2020 at 18:20, on Zulip):

I mean rendering the json to ascii can probably even be done in JS, realistically

simulacrum (May 18 2020 at 18:21, on Zulip):

(doing something like the CI rendered links we have in rla)

Pietro Albini (May 18 2020 at 18:21, on Zulip):

hmm

simulacrum (May 18 2020 at 18:21, on Zulip):

doing ad-hoc queries is much easier on the JSON is basically my point

simulacrum (May 18 2020 at 18:21, on Zulip):

and the tool won't be perfect

Giacomo Pasini (May 18 2020 at 18:21, on Zulip):

simulacrum said:

Pietro Albini sure I mean that when you notice a bug in some crater run and submit a fix in the ideal world we could also go and "regenerate" the rendered output for a crater run

How do you usually fix bugs in crater runs? I've never actually used crater so a lot of things are new for me

simulacrum (May 18 2020 at 18:22, on Zulip):

well I mean say -- for example -- we realize it would've been nice to group the results by some substring of the error message

simulacrum (May 18 2020 at 18:22, on Zulip):

if we have the JSON on hand, that's easy

simulacrum (May 18 2020 at 18:22, on Zulip):

if we don't... we basically need to rerun the whole thing
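
A rough sketch of that kind of after-the-fact query, assuming the per-crate diagnostics were stored somewhere and can be loaded as plain strings. The names `group_by_substring`, `STORED` and `PATTERNS` are hypothetical:

```python
from collections import defaultdict

def group_by_substring(diagnostics, patterns):
    """Group crates by which substrings appear in their error messages.

    diagnostics: dict mapping crate name -> list of message strings
    patterns:    dict mapping group label -> substring to look for
    Returns a dict mapping group label -> sorted list of crate names.
    """
    groups = defaultdict(set)
    for crate, messages in diagnostics.items():
        for label, needle in patterns.items():
            if any(needle in message for message in messages):
                groups[label].add(crate)
    return {label: sorted(crates) for label, crates in groups.items()}

# Hypothetical stored results from a finished run.
STORED = {
    "crate-x": ["error[E0308]: mismatched types"],
    "crate-y": ["error[E0599]: no method named `foo` found"],
    "crate-z": ["error[E0308]: mismatched types",
                "error: aborting due to previous error"],
}
PATTERNS = {"type mismatch": "mismatched types",
            "missing method": "no method named"}

print(group_by_substring(STORED, PATTERNS))
```

Changing `PATTERNS` and rerunning takes seconds against stored data, which is the contrast with rerunning the whole crater job.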

nikomatsakis (May 18 2020 at 18:28, on Zulip):

simulacrum said:

well I mean say -- for example -- we realize it would've been nice to group the results by some substring of the error message

yeah this is what I was saying to @Pietro Albini, I forget if we included it here-- that sometimes you'll find 2 or 3 related errors that all kind of mean the same thing

nikomatsakis (May 18 2020 at 18:28, on Zulip):

and it'd be nice to be able to "group" them

nikomatsakis (May 18 2020 at 18:28, on Zulip):

being able to regenerate the report would be amazing, but I guess it is hard?

Giacomo Pasini (May 18 2020 at 18:29, on Zulip):

I don't think it is as long as the results are still in the db

Giacomo Pasini (May 18 2020 at 18:29, on Zulip):

Not sure about how it is uploaded though

Pietro Albini (May 18 2020 at 18:30, on Zulip):

hmm, I'm not sure how feasible it is to regenerate the report

Pietro Albini (May 18 2020 at 18:30, on Zulip):

@nikomatsakis also mentioned to me being able to generate a markdown report containing the regressions, so it's easy to annotate it and copy/paste it on github

simulacrum (May 18 2020 at 18:31, on Zulip):

I'm basically saying that we don't actually store the ASCII anywhere -- if you want to look at it you go to some page with some JS which loads the JSON from S3 and renders that in the browser

simulacrum (May 18 2020 at 18:31, on Zulip):

then regenerating reports becomes a matter of running the tool against an archive

simulacrum (May 18 2020 at 18:32, on Zulip):

the report output is just a couple html files at most is what I'm envisioning

Giacomo Pasini (May 18 2020 at 18:34, on Zulip):

Pietro Albini said:

nikomatsakis also mentioned to me being able to generate a markdown report containing the regressions, so it's easy to annotate it and copy/paste it on github

Something like the regressed crate list? maybe with some kind of grouping by error code or other criteria?

Pietro Albini (May 18 2020 at 18:45, on Zulip):

yeah

Giacomo Pasini (May 19 2020 at 09:12, on Zulip):

So, if I understand correctly it would be good to have:

Giacomo Pasini (May 19 2020 at 09:14, on Zulip):

need to investigate the best way to do the last point

lqd (May 19 2020 at 14:45, on Zulip):

@nikomatsakis yeah I had seen this, and the ideas seemed to match quite well the difficulties I've personally experienced in the past. As you may know, both Mark and I have made scripts/tools to do some analysis on the root-regression part of the problem (and I wanted to interactively categorize error groups as well, but that would have taken longer than the time it would have saved me, I think). I would say that executing on the plan above would already be a big help, and if anyone needs more ideas, the usual next step for crater runs is reproducing and minimizing those regressions. That is also a harder task to automate, but I've often had success with https://github.com/jethrogb/rust-reduce/, sometimes combined with c-reduce. All in all, just removing a lot of unneeded code is already a great help, even if the result is not the most minimal.

Giacomo Pasini (May 19 2020 at 15:23, on Zulip):

thank you, I'll take a look at those tools

Pietro Albini (May 20 2020 at 19:43, on Zulip):

also cc @RalfJ

RalfJ (May 20 2020 at 19:52, on Zulip):

What I've been doing by hand twice now is convert the list of regressions into a hackmd where we can collaboratively tick them off one by one after taking a look. unfortunately for my latest case the hackmd is too big if I put all 400 regressions in there.^^
but still, not having to do that manually would be nice. :)
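
A small sketch of what such an export could look like, assuming the regressions are already grouped under some heading (an error code, for instance). `render_checklist` is a hypothetical name, and the `- [ ]` task-list syntax is the one GitHub and HackMD render as checkboxes:

```python
def render_checklist(groups):
    """Render grouped regressions as a markdown task list.

    groups: dict mapping a heading (e.g. an error code) -> list of crate names
    """
    lines = []
    for heading, crates in groups.items():
        lines.append(f"### {heading} ({len(crates)})")
        lines.append("")
        for crate in crates:
            lines.append(f"- [ ] {crate}")  # unchecked box, ticked off by hand later
        lines.append("")
    return "\n".join(lines)

print(render_checklist({"E0308": ["crate-x", "crate-z"], "E0599": ["crate-y"]}))
```

Notes can then be added inline next to each crate once the text is pasted into an issue or a hackmd.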

Pietro Albini (May 20 2020 at 19:53, on Zulip):

@Giacomo Pasini ^

Giacomo Pasini (May 25 2020 at 12:20, on Zulip):

This morning I talked with @Pietro Albini and we made sort of a plan for future updates. Given that I do not have unlimited time for my thesis I will gradually implement new features in the following order but I may run out of time and some of them may not be developed soon (I won't have time right after the thesis as I will be working as an intern).

Giacomo Pasini (May 25 2020 at 12:22, on Zulip):

right now I have some questions about the visualization of build errors and dependencies in the report; I'll be back with further details about the last points when I get there

Giacomo Pasini (May 25 2020 at 12:29, on Zulip):

I took a look at @simulacrum's work on crater-generate-report and I think that might be a good starting point: for each root regression there's a list of the dependent crates and a brief description of the error that caused the regression in the first place

simulacrum (May 25 2020 at 12:31, on Zulip):

Yep, I think that's a good place to start. Feel free to steal the code (should be MIT/APACHE already but if not should likely be not too hard to adjust as such). Happy to answer questions as well.

It would definitely benefit from JSON and I think I have some local changes I should push up as well.

Giacomo Pasini (May 25 2020 at 12:38, on Zulip):

Should I group root regressions with the same error code?
What about root regressions that fail with more than one error code? Let's suppose there's a crate "crate1" that has build errors A and B. Do you prefer to have the crate "replicated" in both categories like:
A (little desc):

B (little desc):

or to have more specific groups like:
A (little desc), B (little desc):

simulacrum (May 25 2020 at 12:52, on Zulip):

I think replicating makes sense, at least to start

simulacrum (May 25 2020 at 12:53, on Zulip):

though generally there's sort of two modes I guess -- one where you care less about what failed but more so want a general sense of how widely spread the breakage is

simulacrum (May 25 2020 at 12:53, on Zulip):

and that's useful for a general assessment of viability for some change

simulacrum (May 25 2020 at 12:53, on Zulip):

and then secondarily, you care about "what actually broke" where splitting makes sense
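
The two layouts discussed above (a crate replicated under each of its error codes, versus one group per exact set of codes) can be sketched as follows. This is only an illustration: the function names and the `REGRESSIONS` data are hypothetical, with "A" and "B" standing in for real error codes:

```python
from collections import defaultdict

def replicate(regressions):
    """One group per error code; a crate appears in every group it failed with."""
    groups = defaultdict(list)
    for crate, codes in sorted(regressions.items()):
        for code in sorted(set(codes)):
            groups[code].append(crate)
    return dict(groups)

def combine(regressions):
    """One group per exact combination of error codes; each crate appears once."""
    groups = defaultdict(list)
    for crate, codes in sorted(regressions.items()):
        groups[tuple(sorted(set(codes)))].append(crate)
    return dict(groups)

# Hypothetical input: root regression -> error codes it failed with.
REGRESSIONS = {"crate1": ["A", "B"], "crate2": ["A"]}

print(replicate(REGRESSIONS))
print(combine(REGRESSIONS))
```

With replication, the size of each group answers "how many crates hit error A", at the cost of the group sizes no longer summing to the number of regressed crates.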

Giacomo Pasini (May 25 2020 at 13:16, on Zulip):

simulacrum said:

though generally there's sort of two modes I guess -- one where you care less about what failed but more so want a general sense of how widely spread the breakage is

so maybe in this case it would be useful for example just to list root regressed crates and dependent ones?

simulacrum (May 25 2020 at 13:36, on Zulip):

hm yeah maybe, not sure

nikomatsakis (May 26 2020 at 18:02, on Zulip):

side note -- one thing that came up just now was that it would be useful to know

nikomatsakis (May 26 2020 at 18:03, on Zulip):

when regressions are occurring in older versions of the crates

nikomatsakis (May 26 2020 at 18:03, on Zulip):

i.e., if this is a regression in 0.1.0 but there is a 1.0.0 available that works...that affects the decision

Pietro Albini (May 26 2020 at 18:03, on Zulip):

@Giacomo Pasini ^

Giacomo Pasini (May 26 2020 at 18:29, on Zulip):

hmm, maybe I can check at the time of the report creation if a new version has been released

lqd (May 26 2020 at 19:07, on Zulip):

I've done this in the past: call the https://crates.io/api/v1/crates/$identifier api to have information about whether the regressed version is the latest (or if there could be semver-compatible releases for example), how old it is (and whether it is active or inactive), and so on

Giacomo Pasini (May 26 2020 at 19:09, on Zulip):

oh perfect, thank you

Pietro Albini (May 26 2020 at 19:22, on Zulip):

btw, let's query the index instead

Pietro Albini (May 26 2020 at 19:22, on Zulip):

otherwise we risk hammering crates.io
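
A sketch of the index-based check, assuming a local copy of the crate's file from the crates.io index, where each line is a JSON object with at least `name`, `vers` and `yanked`. The helper names are hypothetical, and the version comparison is deliberately simplified: it ignores pre-release and build metadata instead of implementing full semver ordering:

```python
import json

def parse_version(vers):
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints.

    Simplification: pre-release ("-alpha") and build ("+meta") suffixes are dropped.
    """
    core = vers.split("+", 1)[0].split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))

def newer_versions(index_text, regressed):
    """Return non-yanked versions newer than `regressed`, oldest first."""
    current = parse_version(regressed)
    found = []
    for line in index_text.splitlines():
        if not line.strip():
            continue
        entry = json.loads(line)
        if entry.get("yanked"):
            continue  # yanked releases are not a usable upgrade path
        if parse_version(entry["vers"]) > current:
            found.append(entry["vers"])
    return sorted(found, key=parse_version)

# Hypothetical index file contents for a crate named "foo".
INDEX = "\n".join([
    json.dumps({"name": "foo", "vers": "0.1.0", "yanked": False}),
    json.dumps({"name": "foo", "vers": "0.2.0", "yanked": True}),
    json.dumps({"name": "foo", "vers": "1.0.0", "yanked": False}),
])

print(newer_versions(INDEX, "0.1.0"))
```

An empty result means the regressed version is the latest usable release, which is the case that matters most when weighing a regression.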

Last update: Jun 04 2020 at 18:35UTC