So @tmandry and I were talking about how to improve the rustc-chalk integration
And we decided to continue here :)
Currently rustc only uses chalk-engine
and it duplicates all the lowering and other logic
I think what we want instead is that rustc uses chalk-solve
(Like rust-analyzer does)
This will mean that we have to implement the RustIrDatabase
trait
We're not going to be be able to do that yet without some work :)
I was hoping to kind of brainstorm out a bit what this work would be
just made a hackmd document to keep notes
anyway the real challenge then is going to be created little bits of chalk-ir and chalk-rust-ir from Rust's HIR/ty values
I expect to use the TypeFamily
traits (extended with some more stuff...) to try and make that convenient and efficient
to start, I think we need to add the "ids"
e.g., instead of TypeId
being hard-coded to be an integer (wrapped around RawId
)
this should probably indirect through TypeFamily
that would allow us to use DefId
in the compiler
(it's interesting to note that I think Chalk can have a richer notion of ids, all of which map to DefId
under the hood)
not sure if that'll cause any difficulty, depends if we currently match on the TypeKindId
enum for anything besides debug
(answer: we do, a bit, so we should add some methods around that)
for example, we have code like this
// If this is a `Foo: Send` (or any auto-trait), then add // the automatic impls for `Foo`. let trait_datum = db.trait_datum(trait_id); if trait_datum.is_auto_trait() { match trait_ref.parameters[0].assert_ty_ref().data() { TyData::Apply(apply) => { if let TypeName::TypeKindId(TypeKindId::StructId(struct_id)) = apply.name { push_auto_trait_impls(builder, trait_id, struct_id); } } TyData::InferenceVar(_) => { panic!("auto-traits should flounder if nothing is known") } _ => {} } }
(@tmandry, still here? :)
yep
apart from the above, I think there are some other mappings we want
e.g.
Parameter<TF>
=> Kind<'tcx>
in rustc
similarly, anything with an embedded Vec
or Box
seems like trouble
I am debating just a bit now -- I guess I take that (mildly) back, it might be that it's ok to make vectors for now, and then it's just perf optimization to improve it
well I do think we really want to try and map Parameter<'tcx>
to Kind<'tcx>
,
and see if we can (therefore) map a vector of substitutions to Substs<'tcx>
the reason is
this is what chalk gives back as answers to queries
I'd like it if we (basically) had some (potentially expensive) function that generates "data" from a Ty<'tcx>
for those times that we call ty.data()
this is basically doing the translation from rustc's representation to chalk's, and we can make that gradually cheaper
but ideally we wouldn't need such a translation on the way back
(along those lines, I see one potential problem with the existing type signatures...)
impl Ty<TF> { pub fn data(&self) -> &TyData<TF> { ... } }
if, in rustc, we are generating that TyData
on the fly...
it's going to be hard to return borrowed data
well
I guess we can allocate it from the arena
i.e., generate it lazilly (and only once)
I think I remember now that is what I had in mind
ok, looking a bit deeper
obviously the types in rustc have a lot more variants than their chalk counterparts
but most of them are effectively type names of various kinds
but ideally we wouldn't need such a translation on the way back
just making sure I follow.. in the converted type, we can just store a Ty<'tcx>
that points back to the rustc type, right?
assuming everything's stored in an arena with the same lifetime, it should work
Sorry, mm,
what do you mean by the "converted type"?
well, not sure =) Ty
is what I was thinking of
but we just have to implement things like ty_data()
and we can pass Ty<'tcx>
straight through chalk?
yes so
if you implemented ty_data
you convert one type to chalk's representation -- "up to" embedded types
i.e., if you have Vec<u32>
, that is two Ty<'tcx>
values
that is, the Vec
has a list (in rustc) of substitutions Substs<'tcx>
in chalk (today) we have a Vec<Parameter<TF>>
, and if you dig in a bit more, you'll see that Parameter<TF>
embeds a Ty<TF>
which is a newtype of TF::InternedTy
so the thing that rustc would have to create (if we made no further changes from what we have today) when you invoke ty_data
on the Vec<u32>
type would be these structures, and it would embed the Ty<'tcx>
representing u32
not sure if that is making sense
basically a Ty<TF>
maps directly to an interned type in rustc (a Ty<'tcx>
)
yeah, makes sense now
we don't really have to make any further changes I don't thnk
right now chalk gives back (as its answers) a Vec<Parameter<TF>>
, and we could construct a Substs<'tcx>
from that, with some work
but it seems like it'd be nicer if pushed that Vec
into the type family embedding
that said, looking a bit more, I think something we do have to do is to make TypeName
generic
I'm actually a bit torn on this
how this should be handled
the other is that TyKind
in rustc has a lot of variants
ah, interesting
yeah it's worth just looking at those and seeing how we would convert them to chalk's notion of TyData
seems like many of them woul dend up as ApplicationTy
yes the vast majority
the interesting one I hadn't thought about is GeneratorWitness
it is a kind of quantified type
seems like many of them would end up as
ApplicationTy
anyway this is exactly why I was saying that we probably want TypeName
to be generic or something
except that, eventually, I'd like that "list" of types to move from rustc into a (shared) type library
that is used by chalk and a (shared) type-checker and (hence) rustc + rust-analyzer
that type-checker probably cares more about these distinctions
in other words, we might want to sort of hard-code the list
into chalk's type-name
but I'm not sure yet
in terms of other interesting things...
it's worth thinking about refactorings we can do in rusc to make life a touch easier
@csmoe already did some by changing how Closure/Generator work (they take a plain SubstsRef<'tcx>
)
the GeneratorWitness
is sort of a pain
although it's really just an exists<'a> { .. .}
sort of type
we maybe want to add this to chalk
the other thing i'm pondering is -- types like fn()
in rustc are one type
but in chalk they are two
a ForAll
paired with an application
How does that matter?
well
right now every Ty<TF>
(in chalk) and Ty<'tcx>
(in rustc) have to have a 1-to-1 correspondence
that said
I think we should just modify chalk a bit here
instead of having QuantifiedTy
wrap exactly one type
it should be more like
struct QuantifiedType { name: QuantifiedTypeName, binders: usize, // number of bound regions parameters: Vec<Parameter<TF>> }
really name
and binders
kind of overlap and could be merged
the name
here would encode the various bits of information from the function signature
e.g., ABI etc
and the parameters
listing would be the input/output types
it's a sort of genearlization of chalk's type today but same basic idea -- from POV of chalk, two such types are equal if (a) their names are equal and (b) all the parameters are equal, after instantiating the binders appropriately
I'm a bit lost on the fn()
stuff -- what the two types represent in chalk, for example
huh yeah, why is there an application in fn()
?
(sorry, multiplexing a few threads)
so today in chalk we effectively have
T = forall<'a, ..> { T }
and I am (first) proposing we genearlize this to
T = forall<'a, ...> { T... }
i.e., you quantify over multiple types
and second, then to add just a bit of a name
why? well, basically because it's not hard to do so, and it lets us map directly to rust's types
rust basically doesn't have a "generic forall
" type
it has binders that are "connected" to specifiy sorts of types, such as fn
types
plausibly we could just change chalk's QuantifiedTy
to FnPointer
since that's really the only thing that uses it
that's effectively what I'm proposing but with more confusing names :)
I guess that's just better
one other thing: chalk uses debruijn indices somewhat differently from rustc
in rustc, a bound variable is identified by a debruijn index (that indentifies the binder) and an internal index (within that binder); in chalk I think we just have the latter. Not sure how much of a pain this will be to reconcile. Probably we should try to adapt one or the other.
one place the difference shows up is exactly fn
types --
in rustc they don't have to indicate how many lifetimes they introduce
e.g., for<'a, 'b> fn(&'a &'b u32)
in chalk is a QuantifiedTy
with num_binders: 2
but in rustc we don't know how many such regions are bound unless we walk the types within
not sure if that's good, just true
plausibly we could just change chalk's
QuantifiedTy
toFnPointer
does this make sense, @tmandry / @detrumi ?
I think so
hmm
ah, they're both needed to uniquely identify the type, right?
the debruijn index and the internal index
yes, both are needed
..but not in chalk?
well in chalk we have only one set of indices
let me give an example
for<'a, 'b> fn (&'a u32, &'b u32)
in chalk is like fn(&^1 u32, &^0 u32)
but in rustc it would be fn(&^0.1 u32, &0.0 u32)
where the ^x
in chalk means "bound value with index x
"
and in rustc ^x.y
means "the y
th type bound in binder with index x
"
right, and the num_binders
in chalk helps you keep track which index belongs to which binder?
in chalk yes
I think there is .. maybe one or two places we take advantage of this? but with the newer clause builder it's hopefully not all that hard to convert
I guess I don't know, I'd have to go look
I do think they should agree, it will make our lives easier
and it seems easier to change chalk than rustc :)
won't it be problematic to not know how many they bind though?
not sure
I mean you can always recover that information
by walking the type
I think what rustc tends to do
well, so I think a side-effect -- where chalk tends to have a signature like
fn substitute(parameters: &[ Parameter ])
in rustc, you instead have to have a closure that (lazilly) computes the value for a given bound thing
since you usually don't know how many there are in advance, and so you kind of generate the substitution as you go
(there is a helper for this)
it may be that we can interconvert, but it seems like it will be a bit tricky, and require a bit of state to do
basically to create a chalk type from a rustc type, you'd have to know how many items were in each enclosing binder
and rustc doesn't know
I guess the obvious final thing that chalk needs to handle to truly handle rustc is constants
(for array types)
rustc's Const
you mean?
yes, I mean we have this type in rustc:
Array(Ty<'tcx>, &'tcx Const<'tcx>),
this is an application type in chalk
but one of the arguments is a Const
this is clearly something we just have to add to chalk, a third kind ("constants") -- I dont' think it should be that difficult, but we'll have to experiment a bit to figure out how the interface around that should work (this is also a WIP in rustc)
see also this topic :) -- where I owe a reply or two...
ok, but I think that's it
it's .. not trivial, but nothing seems that hard to me
I can probably turn the notes in the hackmd into concrete issues
though I'm a bit torn on the debruijn indices one (though some part of me wants to rip them out from everywhere in favor of "names that never alias")
we didn't look at the chalk-rust-ir adaptations that are needed but that's ok
I think that will be much easier because
there isn't the whole "nesting" side of things
i.e., you generate (say) a chalk struct datum from the rustc queries
but we never have to go back etc
and they don't embed other things except via def-ids
Well, I have to run to something else, but that was kind of useful, hope it made sense to y'all -- any parting thoughts?
I guess the question is "when/how will this work happen"
I'm happy to try and create some issues, I think a lot of these refactors will be relatively simple, if grungy
(e.g., changing QuantifiedTy
to FnPointer
shouldn't be especially difficult..?)
was there anything else that needed to be done for fn pointers?
I think that suffices, though we'll have to include some extra info for stuff like ABI
you said they're one type in rustc and two in chalk, not sure how changing the name of QuantifiedTy
is going to change that, or if it matters
it's not just changing the name of the type, in other words
it's also making it have a few extra fields and (in particular) take a Vec<Parameter<TF>>
instead of a single Ty<TF>
ah
I'm still a bit torn on whether it should be called ForAllApply
or something or FnPointer
the latter seems... a lot more obvious :)
the only argument for the former is that GeneratorWitness
is basically ExistsApply
hmm that is kind of a decent-ish argument
but it's not a very strong one
sorry, I'm not being very clear
what I mean is
we have Apply
types that do not introduce binders
we could have a QuantifiedApply
or something that does
and (depending on the name
field) that might be a fn()
type or it might be a generator witness
chalk doesn't really care either way
and neither do most part of the code
anyway it's a (relatively) minor detail
that is clearer, thank you :)
Changing all Apply
's to QuantifiedApply
probably is a bad option, right?
yes, bad option
for one thing unifiying applied types is a lot more complex
basically in general the code does care if there are binders
so I think it should be a different variant
It feels like we're suddenly a lot closer to rustc-chalk integration, this was a very productive discussion :slight_smile:
(though I suppose it was in niko's head all along)
good to get it out
I'll try to take notes on the next few steps that I've been pondering after this too ...:)