Does this sound reasonable? I'm still a little fuzzy about when query implementations are allowed to access the HIR and other info that is only available locally. I think I need to implement something in `rmeta` and ensure that `associated_items` gets called for each item before we finish compiling the local crate. Is this correct?
cc @Matthew Jasper who has answered my questions about the query system in the past.
As an aside, I noticed this while inspecting perf results for the example in #68957. After making name lookup, which used to take 60% of the CPU, O(log n) in #69072, `associated_item` became hot, accounting for 25% of CPU time.
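For anyone following along, here is a toy sketch of the difference between a linear and an O(log n) name lookup. It is not rustc's actual code: `Item`, `find_linear`, `SortedItems` and `lookups_agree` are hypothetical names made up for illustration, and the real change in #69072 may use a different data structure entirely.

```rust
/// Hypothetical stand-in for an associated item; NOT rustc's real type.
#[derive(Debug, Clone, PartialEq)]
pub struct Item {
    pub name: String,
    pub index: usize,
}

/// Linear scan: O(n) per lookup, so resolving every one of n names
/// this way costs O(n^2) in total.
pub fn find_linear<'a>(items: &'a [Item], name: &str) -> Option<&'a Item> {
    items.iter().find(|item| item.name == name)
}

/// Items kept sorted by name so that each lookup can binary search.
pub struct SortedItems {
    items: Vec<Item>,
}

impl SortedItems {
    /// Sorts once up front: O(n log n).
    pub fn new(mut items: Vec<Item>) -> Self {
        items.sort_by(|a, b| a.name.cmp(&b.name));
        SortedItems { items }
    }

    /// Binary search: O(log n) per lookup.
    pub fn find(&self, name: &str) -> Option<&Item> {
        self.items
            .binary_search_by(|item| item.name.as_str().cmp(name))
            .ok()
            .map(|i| &self.items[i])
    }
}

/// Checks that both strategies return the same item for `n` generated names.
pub fn lookups_agree(n: usize) -> bool {
    let items: Vec<Item> = (0..n)
        .map(|i| Item { name: format!("method_{}", i), index: i })
        .collect();
    let sorted = SortedItems::new(items.clone());
    items
        .iter()
        .all(|item| find_linear(&items, &item.name) == sorted.find(&item.name))
}
```

With unique names both lookups return the same item; the sorted version just pays the sorting cost once instead of rescanning the whole list on every call.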
@simulacrum What do you think about adding a benchmark to rustc-perf that has a lot of associated items? I would understand if the answer is no considering the current backlog on the perf server coupled with the time it took before someone realized this was slow.
Sounds like a good idea! I've added `associated_items` as a stopgap cache for some even more pathological behavior, but didn't realize that the work it was doing was still quadratic.
FWIW I was looking into rustc performance on generated crates used in the embedded ecosystem and plan to add one to perf as well. That's what inspired the `associated_items` query addition and my coherence perf improvements.
@ecstatic-morse I think it's a great idea! The benchmark should take no longer than roughly 2 seconds on current master.
@Jonas Schievink Did you do that relatively recently? Like the last month or so? If so, you probably fixed the majority of #68957, which has gotten a lot faster of late.
Yep, I think so too :)
@simulacrum Is that for a check build or an opt build?
I imagine it won't matter much, as the benchmark likely won't generate that much LLVM IR.