@WG-learning was hoping you could do a lecture about LLVM/codegen so we could add it to the rustc-guide
It is one of the biggest long-standing gaps in the guide
Alternately, we could do some sort of discussion here on zulip if you prefer that
like we did with macros/hygiene with petrochenkov:
I don’t have the means to record a lecture, sadly.
Also, LLVM and MIR→LLVM-IR codegen are two very distinct things and would each deserve a separate lecture.
that's good to know
Would you be able to do something like the macros discussion here on zulip? (like the link above)
We also understand if you are not able to
I can write down a broad outline and point out where to look for more specifics. But also, again, codegen is deceptively prone to looking small. Do you have lectures on the backend (linking, etc.)?
That would be an excellent start!
Unfortunately, we have nothing past the MIR opt
nothing on monomorphization, LLVM IR generation, and very little on LLVM itself
I have to go now. We would appreciate any information you can pass along in whatever format is most convenient for you. Please feel free to DM me
Not gonna get anything earlier than some weekend.
I've been talking with @Alex Crichton (can do some LLVM) and @oli who can do some of monomorphization and type layout
could be good to coordinate, what would you cover @nagisa ?
also, @mw do you have time and can cover some part?
there's also https://github.com/rust-lang/rust/pull/54012 some people from there may know and be able to do something
So, first, where do I see what we already have? I was planning to make a list of things that we should probably split into separate categories first.
starting with "monomorphization collector" and ending with "backend/linker invocation/lto".
There are a few things I cannot cover in there, the collector being one of them.
This is all we have: https://rust-lang.github.io/rustc-guide/codegen.html
so any info you have would be wonderful
I guess we should create a placeholder chapter for each of the categories you come up with
https://gist.github.com/nagisa/a311a0dab09851397f266076130eefb6 some braindumping
I had a thread where I was pretty much asking about the same kind of lecture:
I think it'd be good to explain why monomorphisation occurs in the codegen, and not at the MIR level. This surprised me.
@Edd Barrett My understanding is that it sort of does happen at the MIR level. But first, we have to find out what to codegen (i.e. how do you codegen a `Vec<T>` without first knowing what `T` is?). So after the MIR is produced/checked/optimized, we monomorphize, then we generate code for the monomorphized versions.
@mark-i-m But you don't generate a new MIR body for each monomorphisation, do you?
Not sure, but I don't think so. My understanding is that we only collect a list of concrete instantiations of each generic MIR body. We then iterate through the list and generate code with that concrete instance. Not 100% sure though
That's in line with my understanding.
We figure out the necessary concrete instantiations as part of collector and codegen unit distribution (occurs right before we lower MIR into LLVM) and then for each such instantiation we run a separate lowering pass. Lowering then monomorphises various generic statements etc. within MIR as it goes.
So yes, the descriptions above seem fairly on-point :slight_smile:
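To make the point above concrete, here is a minimal sketch in plain Rust (not compiler internals). The function name `describe` is hypothetical and only for illustration. It has a single generic body, but each distinct type argument it is called with is a separate concrete instantiation that needs its own generated code, which is why the result differs per type:

```rust
use std::mem::size_of;

// One generic body in the source. Nothing can be codegen'd for it
// until we know what `T` is.
fn describe<T>() -> (&'static str, usize) {
    (std::any::type_name::<T>(), size_of::<T>())
}

fn main() {
    // Two uses with different type arguments: two concrete
    // instantiations of the same generic body.
    let small = describe::<u8>();
    let large = describe::<u64>();

    println!("{} takes {} byte(s)", small.0, small.1);
    println!("{} takes {} byte(s)", large.0, large.1);
}
```

Only the instantiations that are actually used (`describe::<u8>` and `describe::<u64>` here) need code; something like `describe::<String>` is never requested, so nothing has to be generated for it.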
Perhaps it is also good to mention why this decision was made, because it is a tradeoff: we expect MIR to be optimised first, as far as its being generic allows, so that we spend less time lowering monomorphic forms and reduce the workload LLVM has to handle. That being said, there’s nothing preventing us from having what @eddyb calls LIR, which is essentially MIR but monomorphic.
Turns out the collector already has some nice documentation: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir/monomorphize/collector/index.html
It might also be useful to state the difference between "declaring" and "defining".