The other idea we had was to have an ICE-breaker around bisecting and reducing -- we need a good name for this -- but I definitely think we should do it. Leaving a thread here.
We did have an issue on it, I think
boy I really better finish writing the blog post I started drafting on this subject (on reducing in particular)
@nikomatsakis btw, what would the bisecting and reducing wg be about exactly?
@Santiago Pastorino sometimes we have a bug report
that is like "something crashed in my repo"
the idea would be to have people who just try to turn that from "something crashed" to
"here is a self-contained playground example that broke because of PR #123"
I see
So I think an alternative could be to try and have a dedicated set of people (sort of like a standard working group) oriented around this thing, but I think I'd sort of prefer the "just ping a bunch of people and at least one or two of them get interested" and see if it works
yeah makes sense
Do they give access to their repos? If they're saying that something crashed and nothing else, then things will be... difficult.
we tend to ask for the source crate and the build/test invocation
(at the very least)
OK, please forgive my ignorance, but would it be possible to script git bisect to do most of this work for you? Or is the goal to really minimize the output of git bisect?
(as an example, i spent a while today hunting for some way to reproduce #65774, before giving up and closing the bug)
@Cem Karan oh, hold on: I think @nikomatsakis failed to provide context about what we are reducing and bisecting-over
The "reduction" under discussion here is not the git history (which is what git bisect attacks)
the reduction is instead of the input source crate (or crates)
and likewise the bisection is over that source code
to try to reduce the source code to some minimal amount where the problem of interest still arises
@pnkfelix OK, thanks for the explanation, but if the crate or crates are also under git, then could we do git bisect on those crates to at least get a clue as to what chunk of code is causing the issue? I know it isn't ideal as different people have different qualities of committing, but it will pare down the amount of code that needs to be reviewed. Once that's done, then you can go in by hand to find the real problem area.
Or am I still missing the point???
yes, that is also a technique for identifying smoking guns
one that we are familiar with; after all, we use it ourselves (for cargo-bisect-rustc)
And it's not unreasonable to include it under the techniques available for trying to identify the problem area
But even with that identified git commit in hand, the rustc developers will still ask you to provide an MCVE
and of course, while the diff from a single commit may be small, that diff is usually not sufficient for the bug report. So you'd still need to figure out how best to reduce the original test (either by building up a new example from scratch, informed by the identified commit; or by reducing down the original crate source)
@pnkfelix :sweat_smile: I should have realized that you guys would already have thought of that!
Note that cargo-bisect-rustc is tailored solely for bisecting the rustc development history, not arbitrary crates. :smiley:
Yes, and I just thought of something else; the rust project has done a really, really good job of trying to ensure that PRs will always compile at the very least before they are accepted. Random crates out in the wild may not be that clean, so git bisect might not even be a good option.
OK, this is a wild spitballing type of idea, which is likely to be difficult to do, but...
Since rustc has access to the complete AST, is it possible to selectively delete portions of the tree? If the crate compiles and the problem still exists, repeat the process, otherwise back out what you did, and try to delete a different portion. Over time, you'll have a minimal code base that you can test.
I'll be the first to admit that it would be difficult to implement, but it might be a starting point for a better idea...
no, it's not a bad idea at all. The blog post I'm in the process of authoring describes a set of transformations similar to that (though not always deleting)
for example, replacing function bodies with loop { } is almost always "valid"
so I indeed have often mused about trying to implement a tool that mechanizes the search over these reducing transformations
or leveraging something like quickcheck to do the search
quickcheck/proptest would be a good way to go...
but I wanted to first document the transformations I find useful for reducing by hand
and let someone else worry about implementing the reduction tool
What about using unimplemented!() instead of loop {}? If a function with loop {} is called, the program will just hang, which may make finding the bug harder.
Alternatively, if the function's return type implements Default, you might just replace the function body with Default::default(), which should always produce a valid, if unexpected, value.
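To make the three candidate stubs concrete, here is a minimal sketch. The function `parse_config` and its variants are hypothetical names made up for illustration; they are not taken from any crate discussed in this thread.

```rust
// Hypothetical original: imagine this body is large and drags in many other items.
pub fn parse_config(input: &str) -> Vec<String> {
    input.lines().map(|l| l.trim().to_string()).collect()
}

// Stub 1: a diverging loop. It type-checks against any return type and
// needs no macros, but it hangs if the function is ever called at runtime.
pub fn parse_config_loop(_input: &str) -> Vec<String> {
    loop {}
}

// Stub 2: unimplemented!(). It also type-checks against any return type
// and panics instead of hanging, but it relies on the macro being available.
pub fn parse_config_unimplemented(_input: &str) -> Vec<String> {
    unimplemented!()
}

// Stub 3: Default::default(). It only compiles when the return type
// implements Default, and it returns normally with an "empty" value.
pub fn parse_config_default(_input: &str) -> Vec<String> {
    Default::default()
}
```

All three keep the signature intact, so callers still type-check; they differ only in what happens if the stubbed function actually runs.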
for example, replacing function bodies with loop { } is almost always "valid"
(when one is debugging ICEs, that is)
sorry, I forgot to include that detail before
A lot of the bugs we are trying to reduce are solely compile-time issues
and so the runtime behavior is irrelevant and can be discarded entirely
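Since only compile-time behavior matters for these bugs, the check for "does it still reproduce?" can be just a compiler invocation. A rough sketch, under several assumptions: `rustc` is on the PATH, the reduced input is a single file that builds as a library, and the marker strings below are a heuristic guess at what ICE output contains, not a stable interface. The function names are hypothetical.

```rust
use std::process::Command;

/// Heuristic check: does this compiler stderr look like an ICE?
/// The marker strings are assumptions about rustc's diagnostics.
pub fn looks_like_ice(stderr: &str) -> bool {
    stderr.contains("internal compiler error")
        || stderr.contains("the compiler unexpectedly panicked")
}

/// Compile `path` as a library and report whether the compiler crashed.
/// The program is never run, so stubbed-out bodies cannot hang or panic here.
pub fn still_reproduces(path: &str) -> std::io::Result<bool> {
    let output = Command::new("rustc")
        .arg("--crate-type=lib")
        .arg("--out-dir")
        .arg(std::env::temp_dir()) // keep build artifacts out of the source tree
        .arg(path)
        .output()?;
    let stderr = String::from_utf8_lossy(&output.stderr);
    Ok(!output.status.success() && looks_like_ice(&stderr))
}
```

A real check would probably look for the specific ICE message from the bug report, so that a reduction step does not wander off to a different crash.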
OK, then you're right, it doesn't matter what goes in the function body.
Actually... what about procedural macros?
well the behavior of those may or may not need to be preserved, depending on the bug
so I'm not really talking about blindly replacing all method bodies and throwing up your hands if things go awry afterwards
(you cannot do such blind replacement anyway. Reason 1: you may need one or more bodies to reproduce the bug. Especially if impl Trait in return position is involved. Reason 2: const fn does not support loop { } as body.)
so I'm not really talking about blindly replacing all method bodies and throwing up your hands if things go awry afterwards
Makes sense; so what you need to do is figure out what can be replaced, and what you can replace it with.
(you cannot do such blind replacement anyway. Reason 1: you may need one or more bodies to reproduce the bug. Especially if impl Trait in return position is involved. Reason 2: const fn does not support loop { } as body.)
Does const fn support unimplemented!()?
Can't help with reason 1
As for not blindly replacing method bodies, let's take a look at that. One method of solving this problem is to do exactly that, randomly replacing method bodies and recompiling. One of two things will happen; either the compile will fail, or it will succeed. If it fails, then back out the changes, if it succeeds, then keep the body swap. This can continue until you have some minimal set of methods that you can't replace any more. At that point, you can start deleting methods/functions that have had their bodies replaced. This will further reduce the code. It may be enough to get to the root of the bug, or it may not, no way to know without trying.
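A minimal sketch of that two-phase loop, with assumptions stated up front: functions are modeled abstractly rather than as real syntax, the caller supplies the `still_reproduces` check (for example, "the compiler still crashes on this set of functions"), and candidates are tried in order rather than randomly. All of the names here are hypothetical.

```rust
#[derive(Clone, Debug, PartialEq)]
pub enum Body {
    Original,
    Stubbed, // e.g. the body was replaced with `loop {}`
}

#[derive(Clone, Debug, PartialEq)]
pub struct Function {
    pub name: String,
    pub body: Body,
}

/// Greedy reduction: stub bodies while the bug still reproduces,
/// then delete stubbed functions while the bug still reproduces.
pub fn reduce<F>(mut funcs: Vec<Function>, mut still_reproduces: F) -> Vec<Function>
where
    F: FnMut(&[Function]) -> bool,
{
    // Phase 1: try stubbing each remaining body; repeat until a full pass
    // makes no progress. Each kept change removes one original body, so
    // this terminates.
    let mut changed = true;
    while changed {
        changed = false;
        for i in 0..funcs.len() {
            if funcs[i].body == Body::Stubbed {
                continue;
            }
            funcs[i].body = Body::Stubbed;
            if still_reproduces(&funcs) {
                changed = true; // keep the swap
            } else {
                funcs[i].body = Body::Original; // back out
            }
        }
    }

    // Phase 2: try deleting each stubbed function outright.
    let mut i = 0;
    while i < funcs.len() {
        if funcs[i].body == Body::Stubbed {
            let removed = funcs.remove(i);
            if still_reproduces(&funcs) {
                continue; // deletion kept; index i now holds the next function
            }
            funcs.insert(i, removed); // back out
        }
        i += 1;
    }
    funcs
}
```

The result is only a local minimum: a different order of attempts can end somewhere else, which is one argument for driving the search with something like quickcheck or proptest.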
I think we are in vigorous agreement?
By the way, a reason I choose to use loop { } rather than unimplemented!() is that it removes the dependence on that unimplemented!() macro (and the underlying panic machinery).
For many bugs, this distinction doesn't matter.
But there are some bugs where you are working in a #![no_core] scenario, and you do not have that macro available.
Does const fn support unimplemented!()?
No. Not yet at least.