Is there any Rust crate that makes it possible, inside a Rust application compiled to wasm, to generate wasm and execute it on the fly?
WebAssembly doesn't allow dynamic code generation like that. There has occasionally been discussion of the ability to do so, to support things like JITs, but currently the standard doesn't allow it.
What goal do you have, that doing so would have helped with? Perhaps there's another way to accomplish it?
@Josh Triplett : Thanks, this is really helpful. Can you point me to the wasm standard discussions about allowing dynamic code generation and execution versus disallowing it?
My particular use case involves an APL/J/K interpreter. For some operations, 'stream fusion'-style techniques can reduce the amount of memory bandwidth one uses, at the cost of more complicated 'kernels' running per 'cell'. Right now these 'kernels' contain a giant switch statement that is executed 'per instruction', whereas I would prefer to run through the giant switch once, convert the result to wasm/wast, and then execute that without the overhead of "one switch statement per instruction".
@zeroexcuses I don't know, offhand, of a reference to point you to. However, if you're interested in adding such capabilities, you might talk to the Bytecode Alliance (disclaimer: I'm a member), which works on software and interfaces for WebAssembly in various environments (including non-browser environments). I could imagine pursuing, through there, an interface for providing "dynamically generated WASM modules" and being able to link them in on the fly, the equivalent of `dlopen`. That might help.
This is a very interesting discussion! I’m also interested in exploring my own small language for data transformations, which in the end shall be executed from native Rust code (no WASM) with good efficiency. Bundling rustc and using `dlopen` is a possibility, but this thread made me wonder whether there are more lightweight alternatives.
@Roland Kuhn : Are you able to get around the issue of "one match per instruction executed"? This is the problem I am running into with a pure Rust solution: with a `Vec<VMInstr>`, we have to do a match on every `VMInstr`, per instruction executed. I have not benchmarked this, but it seems horribly inefficient.
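To make the concern concrete, here is a minimal sketch of that dispatch pattern; the `VMInstr` instruction set and the stack machine are hypothetical, just small enough to show where the per-instruction match sits:

```rust
// Hypothetical bytecode for a tiny stack machine.
#[derive(Clone, Copy)]
enum VMInstr {
    Push(i64),
    Add,
    Mul,
}

fn run(program: &[VMInstr]) -> Option<i64> {
    let mut stack: Vec<i64> = Vec::new();
    for instr in program {
        // This match is executed once per instruction, every time
        // the program runs -- the overhead described above.
        match instr {
            VMInstr::Push(v) => stack.push(*v),
            VMInstr::Add => {
                let (b, a) = (stack.pop()?, stack.pop()?);
                stack.push(a + b);
            }
            VMInstr::Mul => {
                let (b, a) = (stack.pop()?, stack.pop()?);
                stack.push(a * b);
            }
        }
    }
    stack.pop()
}

fn main() {
    // (2 + 3) * 4
    let prog = [
        VMInstr::Push(2),
        VMInstr::Push(3),
        VMInstr::Add,
        VMInstr::Push(4),
        VMInstr::Mul,
    ];
    println!("{:?}", run(&prog)); // Some(20)
}
```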
@zeroexcuses I should have been a bit clearer: I’m not yet actively pursuing this, in part because other things need to get done first and in part because I have not yet been able to think of a suitable design. A long time ago I wrote an expression evaluator in Pascal, which had the same issue you described; the solution back then was to include one assembler-written function that could just call a raw function pointer. With all the Rust infrastructure, maybe this can be simplified to compiling to a `Vec<Box<dyn VMInstr>>` (i.e. going from an enum to a trait), trading the huge switch for one memory indirection, which should improve performance in the “huge enough switch” case.
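A rough sketch of what that enum-to-trait trade could look like (the `Instr` trait and instruction types here are hypothetical): each instruction becomes a boxed trait object, so dispatch is one indirect call through a vtable instead of a branch over the whole enum.

```rust
// Hypothetical trait-object encoding of the instruction stream.
trait Instr {
    fn exec(&self, stack: &mut Vec<i64>);
}

struct Push(i64);
struct Add;

impl Instr for Push {
    fn exec(&self, stack: &mut Vec<i64>) {
        stack.push(self.0);
    }
}

impl Instr for Add {
    fn exec(&self, stack: &mut Vec<i64>) {
        let b = stack.pop().unwrap();
        let a = stack.pop().unwrap();
        stack.push(a + b);
    }
}

fn run(program: &[Box<dyn Instr>]) -> Option<i64> {
    let mut stack = Vec::new();
    for instr in program {
        instr.exec(&mut stack); // one vtable indirection per instruction
    }
    stack.pop()
}

fn main() {
    let prog: Vec<Box<dyn Instr>> =
        vec![Box::new(Push(2)), Box::new(Push(3)), Box::new(Add)];
    println!("{:?}", run(&prog)); // Some(5)
}
```

Whether this actually beats the match depends on the branch predictor and the size of the enum, so it would need benchmarking either way.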
My question was aimed at whether there is some other tooling we could use to not only get rid of both kinds of overhead, but also apply optimizations suited to the hardware that executes all this; running rustc is a (very heavy) solution, AFAICS.
"Computed goto" can help with that, but I don't think it's possible in Rust. Anyway, a big switch is what most interpreted languages do and it works fine for them. If you want better performance, you can add more primitives to it (e.g. array/vectorized operations).
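The "more primitives" suggestion can be sketched like this (the `ArrayInstr` set is hypothetical): if each instruction operates on a whole array instead of a single cell, the match runs once per array operation rather than once per element, which amortizes the dispatch cost.

```rust
// Hypothetical vectorized instruction set: each op processes a whole slice.
enum ArrayInstr {
    AddScalar(f64),
    MulScalar(f64),
}

fn run(program: &[ArrayInstr], data: &mut [f64]) {
    for instr in program {
        // One dispatch per *array* operation; the inner loops are
        // branch-free and easy for the compiler to auto-vectorize.
        match instr {
            ArrayInstr::AddScalar(s) => data.iter_mut().for_each(|x| *x += s),
            ArrayInstr::MulScalar(s) => data.iter_mut().for_each(|x| *x *= s),
        }
    }
}

fn main() {
    let mut data = vec![1.0, 2.0, 3.0];
    run(
        &[ArrayInstr::AddScalar(1.0), ArrayInstr::MulScalar(2.0)],
        &mut data,
    );
    println!("{:?}", data); // [4.0, 6.0, 8.0]
}
```

This is essentially what APL-family implementations already do for whole-array primitives; the stream-fusion question above is about the cases where the per-cell kernel itself has to be composed dynamically.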