I've been looking at https://github.com/rust-lang/rfcs/issues/2035 and trying to work out the simplest bit: trait-object upcast (single inheritance). I'm not familiar with the compiler, so that's the first hurdle...
hi @dhardy. have you gotten anywhere with this yet?
Hi @Alexander Regueiro. Unfortunately not; I spent a couple of hours trying to understand the relevant bits of the compiler but didn't get anywhere.
Yeah, it's pretty complex. I believe it's been discussed, however, so perhaps @nikomatsakis (or @scalexm) has some ideas.
I'd like to help you on it if possible.
Haven't looked into it properly yet though.
@Alexander Regueiro if we could at least make "simple cases" work here it seems like it'd be a huge win. I was thinking about it this morning, maybe I can leave some notes somewhere
Yeah, that would be cool. Leave some notes in the RFC repo you created a while ago (I have access to it too)?
Simple upcast would be a huge improvement. No more need to have `as_any(&self) -> &Any` methods (and `mut` versions), so it's useful for downcasting as well as up.
@dhardy with downcast... how so?
trait T: Any, implementing the
&T → &Any cast allows
obj: &T. To do this now, we need to add methods like
fn as_any(&self) -> &Any to trait
T, then call
But this is just a potential application, nothing to do with the implementation. Would love to have this feature.
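For readers who haven't met the `as_any` workaround mentioned above, here is a minimal sketch of what it looks like today. The `Shape`, `Circle`, `Square` and `radius_of` names are made up for illustration, and the code uses the newer `dyn` syntax rather than the bare `&Any` written in this thread.

```rust
use std::any::Any;

// Hypothetical trait standing in for `T` from the discussion.
trait Shape: Any {
    fn name(&self) -> &'static str;
    // Workaround method: hands back the same object as `&dyn Any`.
    fn as_any(&self) -> &dyn Any;
}

struct Circle {
    radius: f64,
}

struct Square;

impl Shape for Circle {
    fn name(&self) -> &'static str {
        "circle"
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl Shape for Square {
    fn name(&self) -> &'static str {
        "square"
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// Downcast a trait object to a concrete type by going through `as_any`.
fn radius_of(shape: &dyn Shape) -> Option<f64> {
    shape.as_any().downcast_ref::<Circle>().map(|c| c.radius)
}

fn main() {
    let shapes: Vec<Box<dyn Shape>> =
        vec![Box::new(Circle { radius: 2.0 }), Box::new(Square)];
    for s in &shapes {
        println!("{}: {:?}", s.name(), radius_of(&**s));
    }
}
```

Every implementor has to repeat the same one-line `as_any` body, which is the boilerplate a built-in upcast would remove.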
@dhardy I don't fully get how `Any` works internally (some sort of compiler built-in to get the `TypeId`?). I haven't encountered `as_any` methods before either. How would it look with upcasting?
Yes, it's just a shim around `TypeId` and reference casting.
so how is the information about the type stored at runtime?
You know you can click on the `[src]` link by implementations, right? Very convenient way to find out how things work.
@dhardy Err no. Haha. Since when has that existed? I feel like a fool now.
Don't know; I think it's fairly new.
Regarding your other question: I don't know; it's the magic of `TypeId`... probably some data stored in the vtable.
It's an intrinsic so the API docs don't tell me anything: https://doc.rust-lang.org/std/intrinsics/fn.type_id.html
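To make the "shim around `TypeId` and reference casting" description concrete, here is a rough sketch of an `Any`-like trait written by hand. `MyAny` and `type_id_of` are invented names, not the standard library's. The point is only that the type identity can be reached through an ordinary trait method, which for a trait object is dispatched through the vtable.

```rust
use std::any::TypeId;

// Hypothetical stand-in for `std::any::Any`.
trait MyAny: 'static {
    // One vtable slot per concrete type; it returns that type's `TypeId`.
    fn type_id_of(&self) -> TypeId;
}

// Blanket impl: every sized `'static` type reports its own `TypeId`.
impl<T: 'static> MyAny for T {
    fn type_id_of(&self) -> TypeId {
        TypeId::of::<T>()
    }
}

impl dyn MyAny {
    // Compare the runtime id (via the vtable) with the requested type's id.
    fn is<T: 'static>(&self) -> bool {
        self.type_id_of() == TypeId::of::<T>()
    }

    // The "reference casting" half: drop the vtable, reinterpret the data pointer.
    fn downcast_ref<T: 'static>(&self) -> Option<&T> {
        if self.is::<T>() {
            // SAFETY: the ids matched, so the data pointer really points at a `T`.
            Some(unsafe { &*(self as *const dyn MyAny as *const T) })
        } else {
            None
        }
    }
}

fn main() {
    let value: u32 = 5;
    let obj: &dyn MyAny = &value;
    println!("is u32: {}", obj.is::<u32>());
    println!("as u32: {:?}", obj.downcast_ref::<u32>());
    println!("as String: {:?}", obj.downcast_ref::<String>());
}
```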
The first step here is to be able to convert a
@dhardy Anyway, this is just a nice bonus, as you say. I'd really like to get upcasting implemented in the compiler (and after that multi-trait objects). Let's have a chat with @nikomatsakis about it soon, I think.
I don't know the best way of doing that — maybe the vtable pointer can simply be adjusted, or maybe a pointer can be added to the appropriate
Something I'd love to see, but I really don't know where to start in the compiler and have several other things to work on, so don't really want to put in a lot of effort there myself
Well, we should discuss it first. Then draft up an RFC, since that still needs to be written. Maybe you could just help with that side of things and leave implementation to someone else.
Can you point me to whichever part of the compiler defines the vtables currently (or the current specification)?
How do you mean "define"? Codegen is in `src/librustc_codegen_llvm/abi.rs` (and possibly `meth.rs`?). They originate during trait selection of course... and there's a lot of MIR-related code that handles them.
I'm not an expert however.
I tried having a look at this yesterday but didn't get anywhere. Can you perhaps summarise what the current vtables for trait objects look like?
@dhardy I think @nikomatsakis will do a better job than me at this. Will you be around in a few hours? There's a Traits WG meeting, so you can ask him then. :-)
Well, the question stands :-)
The first part is, for traits `A: B`, what is the minimum we have to do to convert a trait object for `A` to one for `B`? Can the vtable for `A` be a valid vtable for
The second bit is which mechanism do we use to do the conversion? I think @nikomatsakis mentioned this could be a coercion.
The third part is how does this generalise to multi-trait objects? I don't see why we couldn't start only supporting the easier cases (e.g. allowing `A+B` to reduce to `A+B` via compatible prefixes). There are some ideas for more general support in the issue linked at the top of this thread.
@dhardy Fair questions. I wish I could help more... sadly (for us) it seems like @nikomatsakis has a busy week planned, but maybe he'll be able to give this a little time.
Also, what do you mean "compatible prefixes"?
As it happens, I also have a busy week. I mean, is the first part of the vtable for `A+B+C` also a valid vtable for `A+B`? Hopefully we can construct them that way, which makes many upcasts trivial.
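A sketch of the "compatible prefix" idea, using hand-rolled vtables rather than anything rustc actually emits. All names here (`BaseVtable`, `DerivedVtable`, `Thing`, `upcast`) are hypothetical, and the layout is an assumption made purely for illustration: if the table for the larger trait set begins with the table for the smaller one, the upcast can reuse the same table address.

```rust
// Hand-rolled vtable for a hypothetical supertrait.
#[repr(C)]
struct BaseVtable {
    size: usize,
    name: fn(*const ()) -> &'static str,
}

// Hand-rolled vtable for a hypothetical subtrait. With `#[repr(C)]` the
// `base` field sits at offset 0, so this table starts with a `BaseVtable`.
#[repr(C)]
struct DerivedVtable {
    base: BaseVtable,
    value: fn(*const ()) -> u32,
}

struct Thing(u32);

fn thing_name(_data: *const ()) -> &'static str {
    "Thing"
}

fn thing_value(data: *const ()) -> u32 {
    // SAFETY (sketch only): callers pass a pointer to a live `Thing`.
    unsafe { (*(data as *const Thing)).0 }
}

static THING_VTABLE: DerivedVtable = DerivedVtable {
    base: BaseVtable { size: std::mem::size_of::<Thing>(), name: thing_name },
    value: thing_value,
};

// Fat pointers: data pointer plus vtable pointer.
#[derive(Clone, Copy)]
struct DerivedRef {
    data: *const (),
    vtable: &'static DerivedVtable,
}

#[derive(Clone, Copy)]
struct BaseRef {
    data: *const (),
    vtable: &'static BaseVtable,
}

// The upcast keeps the data pointer and points at the prefix of the table.
fn upcast(d: DerivedRef) -> BaseRef {
    BaseRef { data: d.data, vtable: &d.vtable.base }
}

// True when both fat pointers refer to the same vtable address.
fn same_table_address(d: DerivedRef, b: BaseRef) -> bool {
    std::ptr::eq(
        d.vtable as *const DerivedVtable as *const u8,
        b.vtable as *const BaseVtable as *const u8,
    )
}

fn main() {
    let thing = Thing(7);
    let d = DerivedRef { data: &thing as *const Thing as *const (), vtable: &THING_VTABLE };
    let b = upcast(d);
    println!("{} (size {})", (b.vtable.name)(b.data), b.vtable.size);
    println!("value {}", (d.vtable.value)(d.data));
    println!("same address: {}", same_table_address(d, b));
}
```

In this layout the upcast costs nothing at runtime, which is why constructing tables with compatible prefixes is attractive for the simple cases.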
@dhardy Well, in general you can't guarantee ordering, whatever that ordering is. Since you may want to go from `A + B + C -> A + C` (unless you're considering that a "non-simple" case?)
it's akin to marginalisation (in statistics)
I suspect you can reuse such tables even in the general case.
I see... if we can't guarantee ordering of traits within vtables then we can't offer support for `A+B+C → A+B` without also supporting `A+B+C → A+C`. And I guess we can't guarantee ordering because we want `Box<A+B>` to be equivalent to
Can we support
A+B+C → B by simply adjusting the vtable pointer? I guess since we don't have multi-trait-objects yet this is more a design pointer for that.
What I would like to know is exactly what information is included in vtables: obviously function pointers and presumably also some unique identifier to make `TypeId` work. Maybe also the data size?
If the vtable for `A+B+C` is simply the three vtables concatenated together, repeating any common data like the typeid, then `A+B+C → A` conversions and some of the `A+B+C → A+B` type conversions (generalising) are simply pointer offsets, known statically. The other conversions need some other mechanism, e.g. a pointer to the `A+C` vtable or a statically-compiled map from each `A+B+C` vtable to the corresponding
By marginalisation are you talking about optimising the layout of `A+B+C` depending on usage? I guess that's possible but presumably would have to be a link-time optimisation.
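The concatenation idea above can be sketched with a plain array of function pointers. This is a toy model under heavy assumptions: each trait contributes exactly one method slot, shared data such as the typeid or size is left out, and all names are invented. It shows which conversions are just an offset into the `A+B+C` table and why `A+C` is the odd one out.

```rust
// One method slot per trait, to keep the toy model small.
type Method = fn() -> &'static str;

fn a_method() -> &'static str {
    "A::method"
}
fn b_method() -> &'static str {
    "B::method"
}
fn c_method() -> &'static str {
    "C::method"
}

// Hypothetical concatenated vtable for `A+B+C`.
static ABC: [Method; 3] = [a_method, b_method, c_method];

// Conversions that are only an offset (and length) into the big table.
fn as_a(table: &'static [Method; 3]) -> &'static [Method] {
    &table[0..1]
}
fn as_b(table: &'static [Method; 3]) -> &'static [Method] {
    &table[1..2]
}
fn as_ab(table: &'static [Method; 3]) -> &'static [Method] {
    &table[0..2]
}
fn as_bc(table: &'static [Method; 3]) -> &'static [Method] {
    &table[1..3]
}

// `A+C` is not a contiguous run of `A+B+C`, so it needs its own table
// (or some other lookup mechanism).
static AC: [Method; 2] = [a_method, c_method];

// Call every slot in a table and collect the results.
fn call_all(table: &[Method]) -> Vec<&'static str> {
    table.iter().map(|m| m()).collect()
}

fn main() {
    println!("A:   {:?}", call_all(as_a(&ABC)));
    println!("B:   {:?}", call_all(as_b(&ABC)));
    println!("A+B: {:?}", call_all(as_ab(&ABC)));
    println!("B+C: {:?}", call_all(as_bc(&ABC)));
    println!("A+C: {:?}", call_all(&AC));
}
```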
@dhardy well, there's this annoying concept of a "principal trait" in trait objects right now, which gets in the way
I'm not sure what to do about that.
What is it?
it's the "main" trait for a trait object
the first one, basically
it must be non-auto unless all the traits are auto
Because right now secondary ones are only bounds like `Send` with no functions?
yes, they don't alter the vtable whatsoever
because they are empty
all auto traits are, of course
So what would e.g.
Nothing I guess since you cannot do anything with a
A + B + C isn't a sum of the individual vtables, it's more akin to a product.
or a "joint distribution", if we're to retain the statistical analogy
Box<Send> is still permitted...
The cast does not appear to be permitted currently: https://play.rust-lang.org/?version=stable&mode=debug&edition=2015&gist=fec38804c19fd39354393f33aab7c200
indeed, it's an upcast, and those aren't permitted at all right now
that would just mean not caring about the `Debug` "column" in the vtable
&Debug is already permitted
sure, but I mean of objects themselves, not references.
I guess it's just a special case of upcast which drops only auto traits
A trait object is always a reference
it's a fat pointer, you mean?
Oh, it can be a
that's just an owning reference
But yes, always a fat pointer I think?
yes. but when one says "reference", it evokes
& mut in Rust :-)
yes, always a fat pointer
You remember the `@` references we used to have?
only vaguely. I didn't properly take up Rust until 1.0.
anyway, I think we could get away with just having a vtable with all potential columns included... e.g. we don't need a separate vtable for `A + B`, `A + C`, `B + C` if we have `A + B + C`.
Good thing we lost them I think
yep, for sure
Do we not need separate vtables for e.g.
How would that work?
Very fat pointers (i.e. a separate pointer to each vtable) maybe?
I think we can get away without them. That prevents an exponential explosion of vtables too.
(in terms of number of tables)
Do we really get an exponential explosion though? Obviously we need `A+B+C` if this is used, but I don't think we need to generate `A+C` unless it's actually used.
Well, this might require generating vtables and conversions at link time, so may not be simple to avoid.
Anyway, how do you propose to avoid them?
we get an exponential explosion if they're all used
That also requires the user to write a lot of code. So I don't see the problem.
it's not necessarily
just an observation
and actually, I've been thinking about this wrong
we do need to generate combinations
so yeah, I don't think there's avoiding it without some really fancy lookup scheme
which probably adds runtime overhead
undesirable, of course
I don't think it's hard actually.
in theory, no
in practice... may be some gotchas
I'm not sure
The compiler can represent `A+B+C+D` as a collection of traits, and not generate the vtables until link time
That way it only needs to generate `A+B+D` vtables if used
that makes sense to me
If `A+B+C+D` objects are made, then obviously that vtable is needed
I think you need to wait until link-time regardless, because of cross-crate scenarios
The compiler doesn't know yet whether a cast to `B+D` will be needed
So what you do is wait until link time, then if a `B+D` cast is needed, generate a function taking the vtable for the former and returning the latter
i.e. static table lookup
Since the vtable for `A+B+C+D` is only created (for each object) at link-time, the `B+D` vtable and conversion can be created at the same time
This is assuming each vtable is at a fixed address. I guess the address may not be known until the program starts, so this might mean populating the conversion tables at program start; less than ideal but not a real issue, since we only have conversions for the types used at this point
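A very rough sketch of the "static table lookup" idea, modelled in ordinary Rust rather than in a linker. Everything here is hypothetical: `VtableId` stands in for a vtable address, the trait sets are plain strings, and `build_table` plays the role of whatever would populate the conversions at link time or program start. Only casts that are actually used get an entry.

```rust
use std::collections::HashMap;

// Stand-in for the address of a concrete type's vtable for some trait set.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct VtableId(usize);

// Maps (source vtable, target trait set) to the target vtable.
struct ConversionTable {
    entries: HashMap<(VtableId, &'static str), VtableId>,
}

impl ConversionTable {
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    // Record one conversion that the program is known to need.
    fn register(&mut self, from: VtableId, target: &'static str, to: VtableId) {
        self.entries.insert((from, target), to);
    }

    // The lookup performed by a cast; `None` means the cast was never generated.
    fn convert(&self, from: VtableId, target: &'static str) -> Option<VtableId> {
        self.entries.get(&(from, target)).copied()
    }
}

// Assumed setup step: VtableId(1) is some type's `A+B+C+D` table, and only
// the `B+D` and `A+B+D` casts are used anywhere in the program.
fn build_table() -> ConversionTable {
    let mut table = ConversionTable::new();
    table.register(VtableId(1), "B+D", VtableId(2));
    table.register(VtableId(1), "A+B+D", VtableId(3));
    table
}

fn main() {
    let table = build_table();
    println!("A+B+C+D -> B+D: {:?}", table.convert(VtableId(1), "B+D"));
    println!("A+B+C+D -> A+C: {:?}", table.convert(VtableId(1), "A+C"));
}
```

A real scheme would presumably use a fixed array or generated function per source table rather than a hash map; the map is only here to keep the sketch short.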
BTW I'd still like to know exactly what data we need in the vtable if you can find out?
sure, let me speak to people
have you read that yet?
CC @nikomatsakis, this is what I was talking about earlier. :-)
@dhardy and I could definitely use your input/advice on some of the above.