Stream: t-compiler/rust-analyzer

Topic: Live unit testing


eggyal (Nov 25 2020 at 13:21, on Zulip):

Not sure whether this is the right place to ask; if not, please do point me elsewhere! Just wondering whether there are any efforts underway (or, if not, whether it is on any roadmap) to implement something akin to "live unit testing" for Rust (where the IDE continuously updates visual indicators of unit test results AS ONE CODES: see for example https://wallabyjs.com/ and https://docs.microsoft.com/en-us/visualstudio/test/live-unit-testing )? Compilation time is obviously a hindrance, but this is perhaps moving toward a realistic possibility with https://github.com/bjorn3/rustc_codegen_cranelift/issues/1087 ... cc @bjorn3

Jonas Schievink [he/him] (Nov 25 2020 at 13:28, on Zulip):

Seems like this would require running the compiler on not-yet-saved files, aka VFS support

Jeremy Kolb (Nov 25 2020 at 13:29, on Zulip):

https://github.com/rust-analyzer/rust-analyzer/issues/3601

Jonas Schievink [he/him] (Nov 25 2020 at 13:30, on Zulip):

and that, yeah :)

eggyal (Nov 25 2020 at 13:31, on Zulip):

@Jonas Schievink Perhaps. Or else mirroring the source files in an on-disk cache to which modifications (not yet saved to the working tree) are stored and compiling that (I believe this is how Wallaby does it, via IDE plugins).
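
A minimal sketch of the on-disk mirror idea, assuming the editor can hand over its unsaved buffers keyed by workspace-relative path. The names `copy_tree` and `sync_shadow` are hypothetical, the shadow directory is assumed to live outside the workspace, and the whole tree is recopied on every sync for simplicity. This is not a description of how Wallaby works.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively copy `src` into `dst`, creating directories as needed.
/// Symlinks and other special files are not handled (sketch only).
fn copy_tree(src: &Path, dst: &Path) -> io::Result<()> {
    fs::create_dir_all(dst)?;
    for entry in fs::read_dir(src)? {
        let entry = entry?;
        let target = dst.join(entry.file_name());
        if entry.file_type()?.is_dir() {
            copy_tree(&entry.path(), &target)?;
        } else {
            fs::copy(entry.path(), &target)?;
        }
    }
    Ok(())
}

/// Mirror `workspace` into `shadow`, then overwrite the mirrored files with
/// the unsaved editor buffers. The working tree itself is never modified.
pub fn sync_shadow(
    workspace: &Path,
    shadow: &Path,
    unsaved: &HashMap<PathBuf, String>,
) -> io::Result<()> {
    copy_tree(workspace, shadow)?;
    for (relative, text) in unsaved {
        let target = shadow.join(relative);
        if let Some(parent) = target.parent() {
            fs::create_dir_all(parent)?;
        }
        fs::write(target, text)?;
    }
    Ok(())
}
```

The compiler would then be pointed at the shadow directory. Keeping the mirror in step with deletions, renames and external changes is left out here, and that bookkeeping is where this approach becomes fragile.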

Jonas Schievink [he/him] (Nov 25 2020 at 13:31, on Zulip):

yeah, but that's pretty error-prone

bjorn3 (Nov 25 2020 at 13:33, on Zulip):

This is a nice idea! When using nightly rustc, VFS support is easy to implement. All you have to do is implement FileLoader and pass it to rustc_driver in a custom driver. This is pretty much what RLS does as far as I know. As for the referenced cg_clif issue, I made some progress on lazy compilation. The same Cranelift changes necessary for this would enable hot code swapping from the Cranelift side. I don't know how easy it will be to abuse incremental compilation for this purpose, but it is certainly high on my todo list.
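
A rough sketch of what such a loader could look like. To stay self-contained it defines a local `FileLoader` trait that only imitates the shape of rustc's unstable trait; the real trait's methods may differ between nightlies, and `OverlayLoader` and its methods are hypothetical names. Wiring it into a custom rustc_driver is not shown.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Local stand-in for the shape of rustc's `FileLoader` (NOT the real trait).
pub trait FileLoader {
    fn file_exists(&self, path: &Path) -> bool;
    fn read_file(&self, path: &Path) -> io::Result<String>;
}

/// Serves unsaved editor buffers first and falls back to the disk otherwise.
#[derive(Default)]
pub struct OverlayLoader {
    unsaved: HashMap<PathBuf, String>,
}

impl OverlayLoader {
    /// Record the editor's current text for `path`.
    pub fn set_unsaved(&mut self, path: impl Into<PathBuf>, text: impl Into<String>) {
        self.unsaved.insert(path.into(), text.into());
    }

    /// Forget the buffer, e.g. once the file has been saved or closed.
    pub fn clear_unsaved(&mut self, path: &Path) {
        self.unsaved.remove(path);
    }
}

impl FileLoader for OverlayLoader {
    fn file_exists(&self, path: &Path) -> bool {
        self.unsaved.contains_key(path) || path.is_file()
    }

    fn read_file(&self, path: &Path) -> io::Result<String> {
        match self.unsaved.get(path) {
            Some(text) => Ok(text.clone()),
            None => fs::read_to_string(path),
        }
    }
}
```

Paths are compared verbatim here, so a real implementation would probably need to normalize them before lookup.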

Florian Diebold (Nov 25 2020 at 13:34, on Zulip):

we don't call rustc in-process, so it's not so easy :grimacing:

bjorn3 (Nov 25 2020 at 13:34, on Zulip):

If only rust-analyzer were to require nightly...

Jeremy Kolb (Nov 25 2020 at 13:36, on Zulip):

Maybe someone could take https://github.com/rust-analyzer/rust-analyzer/pull/5765 and push it over the edge

Jonas Schievink [he/him] (Nov 25 2020 at 13:38, on Zulip):

I don't see pinning r-a to nightly Rust versions as too big of a problem, as long as we only update the version when really necessary.

Jonas Schievink [he/him] (Nov 25 2020 at 13:39, on Zulip):

(I feel like we update dependencies way too frequently, my target dir is over 30 GB already)

eggyal (Nov 25 2020 at 13:40, on Zulip):

@Jeremy Kolb the test explorers are great but do they update live? I think this requires instrumentation in order to determine which tests have been affected by any given modifications, or else one will need to rerun the entire suite after every edit. Perhaps @Rich Kadel's work on https://github.com/rust-lang/rust/issues/79121 could help here

Jonas Schievink [he/him] (Nov 25 2020 at 13:40, on Zulip):

That would require pretty far-reaching code analysis

eggyal (Nov 25 2020 at 13:42, on Zulip):

@Jonas Schievink Or just run tests (with instrumentation) and record what code regions each has exercised/covered

Jeremy Kolb (Nov 25 2020 at 13:43, on Zulip):

I'm not sure but my guess is that it's probably not live.

Florian Diebold (Nov 25 2020 at 13:44, on Zulip):

I'd say the pieces for this are coming together, but we're probably still a few years away from being able to do this

eggyal (Nov 25 2020 at 13:47, on Zulip):

Sounds like a github issue sketching out the design and dependencies could be helpful?

Joshua Nelson (Nov 25 2020 at 13:48, on Zulip):

heh, I suggested something like this a while back: https://rust-lang.zulipchat.com/#narrow/stream/242791-t-infra/topic/Speculative.20CI.3F/near/200533333

Joshua Nelson (Nov 25 2020 at 13:50, on Zulip):

ugh, the official discord hid the #infra channel for some reason so you can't look at it anymore

Joshua Nelson (Nov 25 2020 at 13:50, on Zulip):

I had found a project doing this in python

Florian Diebold (Nov 25 2020 at 13:51, on Zulip):

@Joshua Nelson I think in the context of CI this doesn't really help. For running unit tests live while editing code, it's plausible because we can compile everything, run all tests and collect coverage, and then know which tests cover which code (though even that would be relying on the tests being deterministic). But I don't think you can reliably work out statically which tests depend on which code, so this wouldn't help you reduce CI times
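
The dynamic approach described here can be sketched as an index from covered code regions to the tests that executed them. `CoverageIndex`, its methods and the region names are all hypothetical, regions are plain strings for simplicity, and the sketch assumes deterministic tests, as noted above.

```rust
use std::collections::{BTreeSet, HashMap};

/// Maps each covered region to the tests that executed it in the last full run.
#[derive(Default)]
pub struct CoverageIndex {
    tests_by_region: HashMap<String, BTreeSet<String>>,
}

impl CoverageIndex {
    /// Record the regions that `test` exercised during an instrumented run.
    pub fn record(&mut self, test: &str, regions: &[&str]) {
        for region in regions {
            self.tests_by_region
                .entry(region.to_string())
                .or_default()
                .insert(test.to_string());
        }
    }

    /// Tests worth rerunning after an edit that touched `changed` regions.
    pub fn affected_tests(&self, changed: &[&str]) -> BTreeSet<String> {
        changed
            .iter()
            .filter_map(|region| self.tests_by_region.get(*region))
            .flatten()
            .cloned()
            .collect()
    }
}
```

A real index would also have to drop a test's stale entries when that test is rerun, and would need a fallback (rerun everything) for edits to code that no recorded test has reached.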

Joshua Nelson (Nov 25 2020 at 13:51, on Zulip):

right, this is dynamic - you need state between CI runs

Joshua Nelson (Nov 25 2020 at 13:52, on Zulip):

to know which tests depend on which code, and also which files have been modified

Joshua Nelson (Nov 25 2020 at 13:54, on Zulip):

hmm, I didn't think about non-determinism

Florian Diebold (Nov 25 2020 at 13:54, on Zulip):

even then -- if your IDE live-testing doesn't run some test it should, it's a slight annoyance. If the CI doesn't run some test it should, that's a huge problem

bjorn3 (Nov 25 2020 at 13:55, on Zulip):

With lazy compilation instrumenting would be trivial. It wouldn't even be necessary to add instrumentation calls in the jitted code. Just record the function call when you lazily compile a function.

Florian Diebold (Nov 25 2020 at 13:55, on Zulip):

:thinking: how would that work if you're running multiple tests -- you wouldn't know that the second test also called the same function, would you?

bjorn3 (Nov 25 2020 at 13:57, on Zulip):

After every test, you could reset the GOT (which is used to swap all calls from the lazy compilation stub over to the jitted code) so that its entries point back to the lazy compilation stubs, and then as an optimization keep a pointer to the jitted code in a side-table.
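
A toy model of that scheme, with no real JIT involved. `LazyJit` and all of its fields are hypothetical: `patched` stands in for the GOT, and `compiled` for the side-table of already jitted code.

```rust
use std::collections::BTreeSet;

/// Toy model of a lazily compiling JIT; every name here is hypothetical.
pub struct LazyJit {
    names: Vec<&'static str>,
    /// GOT stand-in: `true` means calls go straight to jitted code,
    /// `false` means they still go through the lazy compilation stub.
    patched: Vec<bool>,
    /// Side-table: functions whose jitted code already exists.
    compiled: Vec<bool>,
    compile_count: usize,
    /// Functions reached by the test that is currently running.
    touched: BTreeSet<&'static str>,
}

impl LazyJit {
    pub fn new(names: Vec<&'static str>) -> Self {
        let n = names.len();
        LazyJit {
            names,
            patched: vec![false; n],
            compiled: vec![false; n],
            compile_count: 0,
            touched: BTreeSet::new(),
        }
    }

    /// Simulate a call to function `idx`.
    pub fn call(&mut self, idx: usize) {
        if !self.patched[idx] {
            // Stub path: record the function for the current test.
            self.touched.insert(self.names[idx]);
            // Compile only if the side-table has no jitted code yet.
            if !self.compiled[idx] {
                self.compiled[idx] = true;
                self.compile_count += 1;
            }
            // Patch the "GOT" so later calls skip the stub.
            self.patched[idx] = true;
        }
        // Patched path: direct call, nothing is recorded.
    }

    /// End the current test: reset every "GOT" entry back to its stub and
    /// return the functions that the test reached.
    pub fn finish_test(&mut self) -> BTreeSet<&'static str> {
        self.patched.iter_mut().for_each(|p| *p = false);
        std::mem::take(&mut self.touched)
    }

    pub fn compile_count(&self) -> usize {
        self.compile_count
    }
}
```

Because the reset only touches the `patched` table, a second test that calls an already compiled function is still recorded as reaching it, but the function is not compiled again.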

bjorn3 (Nov 25 2020 at 14:06, on Zulip):

Opened bjorn3/rustc_codegen_cranelift#1113 for the instrumentation.

eggyal (Nov 25 2020 at 14:45, on Zulip):

@bjorn3 I guess you can do it at the function level, but instrumenting code paths within each function provides much richer feedback: eg test coverage and failure paths

bjorn3 (Nov 25 2020 at 15:12, on Zulip):

@eggyal True, but it also has much higher overhead. When trying to use cg_clif as frontend for yorick a while ago, I had to instrument the start of each basic block. When the instrumentation function immediately returned without doing anything, the overhead was 15% in the instrumentation function, which doesn't account for any pessimizations of regalloc. When instrumentation was disabled at runtime using a global flag, the overhead was 30% after writing inline asm to prevent cranelift from spilling registers clobbered in the enabled case. Maybe the overhead could be improved by being smarter about where to add the instrumentation calls. One instrumentation call at the top of each function may be more doable.

bjorn3 (Nov 25 2020 at 15:14, on Zulip):

Or maybe only instrument user code and keep all code in dependencies uninstrumented?

eggyal (Nov 25 2020 at 16:10, on Zulip):

Thanks @bjorn3, that's definitely significant. However I'd observe from using Wallaby that, after the initial run of the full test suite, one rarely sees more than a few unit tests invoked by any given edit... and, being unit tests, execution typically is of the order of milliseconds—so even a 30% uplift may not be very material in practice?

bjorn3 (Nov 25 2020 at 16:50, on Zulip):

@eggyal The overhead may not be very important in this case. Also thinking about it some more, the overhead can be significantly reduced in this case by simply having a global for every instrumentation point and writing directly to that global without doing a call each time.
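
A hand-written illustration of the "one global per instrumentation point" idea, assuming three instrumentation points in a single hypothetical function `classify`. In a real backend the stores would be emitted by the code generator; here `mark` is an inlined helper standing in for that single store.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

const POINTS: usize = 3;

/// One global flag per instrumentation point (hypothetical layout).
static HIT: [AtomicBool; POINTS] = [
    AtomicBool::new(false),
    AtomicBool::new(false),
    AtomicBool::new(false),
];

/// Stand-in for the store a compiler would emit inline at each point.
#[inline(always)]
fn mark(point: usize) {
    HIT[point].store(true, Ordering::Relaxed);
}

/// Example function with instrumentation at entry and in each branch.
pub fn classify(n: i32) -> &'static str {
    mark(0);
    if n < 0 {
        mark(1);
        "negative"
    } else {
        mark(2);
        "non-negative"
    }
}

/// Return the points hit since the last call and clear all flags,
/// e.g. once per test.
pub fn take_hits() -> Vec<usize> {
    (0..POINTS)
        .filter(|&i| HIT[i].swap(false, Ordering::Relaxed))
        .collect()
}
```

Calling `take_hits` between tests yields the set of points each test reached, without any call into a recording function on the hot path.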

matklad (Dec 01 2020 at 14:26, on Zulip):

I actually don't think we need any fancy techniques here

matklad (Dec 01 2020 at 14:26, on Zulip):

We can just re-use our checkOnSave infra, but using cargo test instead of cargo check
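
As a rough illustration only, the switch could amount to building a different cargo invocation on save. `on_save_command` is a hypothetical helper and not rust-analyzer's actual code, and the exact flags rust-analyzer passes are an assumption here. The sketch only constructs the command; it does not run it or parse test results.

```rust
use std::process::Command;

/// Build the cargo invocation to run on save (hypothetical helper).
pub fn on_save_command(run_tests: bool) -> Command {
    let mut cmd = Command::new("cargo");
    // Same invocation shape, only the subcommand differs.
    cmd.arg(if run_tests { "test" } else { "check" });
    cmd.args(&["--workspace", "--message-format=json"]);
    cmd
}
```

Rerunning the whole suite this way is the simple baseline; turning the output into per-test indicators is the UI work mentioned in the next message.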

matklad (Dec 01 2020 at 14:27, on Zulip):

I think the most work here is getting the UI bits (which are not part of standard LSP, and as such would require a bunch of custom code)

Joshua Nelson (Dec 01 2020 at 14:29, on Zulip):

@matklad that would rerun all the tests, right? We've been discussing how to only rerun changed tests.

Joshua Nelson (Dec 01 2020 at 14:29, on Zulip):

But I agree an opt-in way to rerun everything is good in the meantime, since incremental retesting is a bit of a research project

matklad (Dec 01 2020 at 14:31, on Zulip):

The discussion converged to that, but we don't need that to implement the original feature request

eggyal (Jan 19 2021 at 22:44, on Zulip):

Just revisiting this a bit... @matklad you mentioned most of the work would be on UI as reporting test results is not in standard LSP, but VS does have Live Unit Testing for other languages so I guess MS have an API for it even if it’s not public/open? Is there any way we could find out more, rather than (as you say) reinvent the wheel (which appears to be Wallaby’s approach)?

matklad (Jan 19 2021 at 22:55, on Zulip):

I think just thoroughly googling around and reading the sources of other extensions would do the trick

David Barsky (Jan 20 2021 at 18:49, on Zulip):

(lurking). If I understand this discussion right, VSCode has an issue for standardizing the test interface: https://github.com/microsoft/vscode/issues/107467

David Barsky (Jan 20 2021 at 18:49, on Zulip):

(it's part of the January 2021 iteration plan: https://github.com/microsoft/vscode/issues/112419)

eggyal (Apr 08 2021 at 11:40, on Zulip):

My mind was just wandering over this once again, and on re-reading the thread I think it may be worth adding to what bjorn3 said:

"simply having a global for every instrumentation point and writing directly to that global without doing a call each time."

I believe this is exactly how -Zinstrument-coverage works, although the counters are incremented via LLVM intrinsics that obviously aren't available in cg_clif. Furthermore, that approach does not instrument every block, as the counts for many can be calculated from those of others (e.g. in if ... else only one branch need be counted, as the other is simply the difference between the parent and the counted one). Might it be worth adding similar instrumentation intrinsics to cg_clif? That feels like something I could take a crack at.
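
The if ... else case can be illustrated with a small hand-instrumented function. `IfElseCounters` and `abs_instrumented` are hypothetical names, and real coverage instrumentation is generated by the compiler rather than written by hand; the point is only that the else count needs no counter of its own.

```rust
/// Physical counters for one `if ... else`: the parent region and the
/// `then` branch. The `else` branch has no counter of its own.
#[derive(Default)]
pub struct IfElseCounters {
    pub parent: u64,
    pub then_branch: u64,
}

impl IfElseCounters {
    /// Derived count: every entry into the parent that did not take the
    /// `then` branch must have taken the `else` branch.
    pub fn else_branch(&self) -> u64 {
        self.parent - self.then_branch
    }
}

/// Hand-instrumented absolute value, counting only parent and `then`.
pub fn abs_instrumented(x: i64, counters: &mut IfElseCounters) -> i64 {
    counters.parent += 1;
    if x < 0 {
        counters.then_branch += 1;
        -x
    } else {
        x
    }
}
```

The subtraction relies on control always reaching exactly one of the two branches, so it would not hold as written if the function could panic or exit between the parent counter and the branch.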

bjorn3 (Apr 08 2021 at 12:22, on Zulip):

@eggyal For determining which functions are used, only per-function instrumentation is necessary, not per-block instrumentation. I do see value in full -Zinstrument-coverage support (maybe even compatible with LLVM), but for now I think per-function coverage should be easier. I am happy with a PR for either option.

eggyal (Apr 08 2021 at 12:52, on Zulip):

Created https://github.com/rust-analyzer/rust-analyzer/issues/8420 to track this

Last update: Jul 28 2021 at 04:15 UTC