Stream: general

Topic: generate and execute wasm on the fly

zeroexcuses (May 16 2020 at 15:11, on Zulip):

Is there any Rust crate that makes it possible (inside a Rust application compiled to wasm) to generate wasm and execute it on the fly?

Josh Triplett (May 16 2020 at 15:50, on Zulip):

WebAssembly doesn't allow dynamic code generation like that. There has occasionally been discussion of the ability to do so, to support things like JITs, but currently the standard doesn't allow it.

Josh Triplett (May 16 2020 at 15:51, on Zulip):

What goal do you have, that doing so would have helped with? Perhaps there's another way to accomplish it?

zeroexcuses (May 16 2020 at 16:46, on Zulip):

@Josh Triplett : Thanks. This is really helpful. Can you point me to the wasm standard discussions on whether to allow dynamic code generation + execution?

My particular use case involves an APL/J/K interpreter. There are some operations where one can do 'stream fusion' type operations to reduce the amount of memory bandwidth one uses (but at the cost of more complicated 'kernels' running per 'cell'). Right now, in these 'kernels', there is a giant switch statement that is executed 'per instruction' ... whereas I would prefer to just run through the giant switch once, convert it to wasm/wast, and then execute it without the overhead of "one switch statement per instruction"

Josh Triplett (May 16 2020 at 17:51, on Zulip):

@zeroexcuses I don't know, offhand, of a reference to point you to. However, if you're interested in adding such capabilities, you might talk to the Bytecode Alliance (disclaimer: I'm a member), which works on software and interfaces for WebAssembly in various environments (including non-browser environments). I could imagine pursuing, through there, an interface for providing "dynamically generated WASM modules" and being able to link them in on the fly, the equivalent of dlopen. That might help.

Roland Kuhn (May 17 2020 at 06:44, on Zulip):

This is a very interesting discussion! I’m also interested in exploring my own small language for data transformations, which in the end shall be executed from native Rust code (no WASM) with good efficiency. Bundling rustc and using dlopen is a possibility, but this thread made me wonder whether there are more light-weight alternatives.

zeroexcuses (May 18 2020 at 00:49, on Zulip):

@Roland Kuhn : Are you able to get around the issue of "one match per instruction executed" ? This is the problem I am running into with a pure rust solution:

  1. I need to define some "pub enum VMInstr { ... }" , which is the set of 'primitive instructions'
  2. whatever higher level ops get 'compiled down' to VMInstr
  3. the problem is that when executing a Vec<VMInstr>, we have to do a match on every VMInstr, per instruction executed. I have not benchmarked this, but this seems horribly inefficient.
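
For illustration, a minimal sketch of the design in steps 1-3. Only the name `VMInstr` comes from the message above; the four-instruction stack machine and the `run` function are hypothetical, and the sketch makes no claim about how costly the per-instruction `match` really is.

```rust
/// Hypothetical primitive instruction set for a tiny stack machine.
#[derive(Debug, Clone, Copy)]
pub enum VMInstr {
    Push(f64),
    Add,
    Mul,
    Neg,
}

/// Executes a program and returns the top of the stack.
/// Returns `None` on stack underflow or if the stack ends up empty.
pub fn run(program: &[VMInstr]) -> Option<f64> {
    let mut stack: Vec<f64> = Vec::new();
    for instr in program {
        // This `match` is evaluated once per executed instruction.
        match *instr {
            VMInstr::Push(v) => stack.push(v),
            VMInstr::Add => {
                let b = stack.pop()?;
                let a = stack.pop()?;
                stack.push(a + b);
            }
            VMInstr::Mul => {
                let b = stack.pop()?;
                let a = stack.pop()?;
                stack.push(a * b);
            }
            VMInstr::Neg => {
                let a = stack.pop()?;
                stack.push(-a);
            }
        }
    }
    stack.pop()
}
```

With this sketch, `run(&[VMInstr::Push(2.0), VMInstr::Push(3.0), VMInstr::Add, VMInstr::Push(4.0), VMInstr::Mul])` evaluates (2 + 3) * 4 and returns `Some(20.0)`; a stack underflow returns `None`.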

Roland Kuhn (May 18 2020 at 06:10, on Zulip):

@zeroexcuses I should have been a bit clearer: I’m not yet actively pursuing this, in part because other things need to get done first and in part because I have not yet been able to think of a suitable design. A long time ago I wrote an expression evaluator in Pascal, which had the same issue you described; the solution back then was to include one assembler-written function that can just call a raw function pointer. With all the Rust infrastructure, maybe this can be simplified to compiling to a Vec<Box<dyn VMInstr>> (i.e. going from an enum to a trait), trading the huge switch for one memory indirection, which should improve performance in the “huge enough switch” case.
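
A minimal sketch of the enum-to-trait idea, assuming a small stack machine. Here `VMInstr` becomes a trait as suggested above; the `exec` method, the `Push`/`Add`/`Mul` types, and `run` are hypothetical names chosen for illustration.

```rust
/// Hypothetical trait replacing the `VMInstr` enum:
/// every instruction is its own type with an `exec` method.
pub trait VMInstr {
    /// Applies the instruction to the stack; `None` signals stack underflow.
    fn exec(&self, stack: &mut Vec<f64>) -> Option<()>;
}

pub struct Push(pub f64);
pub struct Add;
pub struct Mul;

impl VMInstr for Push {
    fn exec(&self, stack: &mut Vec<f64>) -> Option<()> {
        stack.push(self.0);
        Some(())
    }
}

impl VMInstr for Add {
    fn exec(&self, stack: &mut Vec<f64>) -> Option<()> {
        let b = stack.pop()?;
        let a = stack.pop()?;
        stack.push(a + b);
        Some(())
    }
}

impl VMInstr for Mul {
    fn exec(&self, stack: &mut Vec<f64>) -> Option<()> {
        let b = stack.pop()?;
        let a = stack.pop()?;
        stack.push(a * b);
        Some(())
    }
}

/// Runs the program with one dynamically dispatched call per instruction.
pub fn run(program: &[Box<dyn VMInstr>]) -> Option<f64> {
    let mut stack: Vec<f64> = Vec::new();
    for instr in program {
        // Dispatch goes through the trait object's vtable, not a `match`.
        instr.exec(&mut stack)?;
    }
    stack.pop()
}
```

A program is then built as `let program: Vec<Box<dyn VMInstr>> = vec![Box::new(Push(2.0)), Box::new(Push(3.0)), Box::new(Add)];`. Whether this beats the enum version is something to benchmark rather than assume: a `match` over a dense enum is commonly compiled to a jump table, which is also a single indirect branch, and boxing every instruction adds a pointer hop and a heap allocation.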

My question was aimed at whether there is some other tooling we could use to not only get rid of either overhead, but also apply optimizations suitable for the hardware that executes all this — where running rustc is a (very heavy) solution AFAICS.

Laurențiu Nicola (May 18 2020 at 06:21, on Zulip):

"Computed goto" can help with that, but I don't think it's possible in Rust. Anyway, a big switch is what most interpreted languages do and it works fine for them. If you want better performance, you can add more primitives to it (e.g. array/vectorized operations).

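One way to read the "more primitives" suggestion above, as a hedged sketch: let each instruction operate on a whole array, so the `match` runs once per instruction rather than once per element. The `ArrayInstr` enum, its variants, and `run` are hypothetical names; `MulAddScalar` stands in for a fused primitive that does two operations in one pass over memory.

```rust
/// Hypothetical array-level instruction set:
/// each instruction processes a whole slice.
#[derive(Debug, Clone, Copy)]
pub enum ArrayInstr {
    /// x + a for every element.
    AddScalar(f64),
    /// x * m for every element.
    MulScalar(f64),
    /// Fused primitive: x * m + a in a single pass over the data.
    MulAddScalar(f64, f64),
}

/// Applies each instruction in order to `data`, in place.
pub fn run(program: &[ArrayInstr], data: &mut [f64]) {
    for instr in program {
        // Dispatch happens once per instruction; the inner loops are
        // plain element-wise loops with no dispatch inside them.
        match *instr {
            ArrayInstr::AddScalar(a) => data.iter_mut().for_each(|x| *x += a),
            ArrayInstr::MulScalar(m) => data.iter_mut().for_each(|x| *x *= m),
            ArrayInstr::MulAddScalar(m, a) => {
                data.iter_mut().for_each(|x| *x = *x * m + a)
            }
        }
    }
}
```

Running `[MulScalar(2.0), AddScalar(1.0)]` and running `[MulAddScalar(2.0, 1.0)]` over the same input give the same values, but the fused form reads and writes the array once instead of twice. That is the memory-bandwidth saving the stream-fusion discussion earlier in the thread is after, at the cost of having to define each fused primitive ahead of time.
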
Last update: Jun 05 2020 at 22:40UTC