Stream: wg-traits

Topic: Trait object upcast


dhardy (Oct 08 2018 at 18:09, on Zulip):

Hello people!

I've been looking at https://github.com/rust-lang/rfcs/issues/2035 and trying to work out the simplest bit: trait-object upcast (single inheritance). I'm not familiar with the compiler, so that's the first hurdle...

Alexander Regueiro (Nov 13 2018 at 18:01, on Zulip):

hi @dhardy. have you gotten anywhere with this yet?

dhardy (Nov 14 2018 at 08:06, on Zulip):

Hi @Alexander Regueiro. Unfortunately not; I spent a couple of hours trying to understand the relevant bits of the compiler but didn't get anywhere.

Alexander Regueiro (Nov 14 2018 at 15:24, on Zulip):

Yeah, it's pretty complex. I believe it's been discussed however, so perhaps @nikomatsakis (or @scalexm) have some ideas.

Alexander Regueiro (Nov 14 2018 at 15:24, on Zulip):

I'd like to help you on it if possible.

Alexander Regueiro (Nov 14 2018 at 15:24, on Zulip):

Haven't looked into it properly yet though.

nikomatsakis (Nov 19 2018 at 19:11, on Zulip):

@Alexander Regueiro if we could at least make "simple cases" work here it seems like it'd be a huge win. I was thinking about it this morning, maybe I can leave some notes somewhere

Alexander Regueiro (Nov 19 2018 at 20:06, on Zulip):

Yeah, that would be cool. Leave some notes in the RFC repo you created a while ago (I have access to it too)?

dhardy (Nov 20 2018 at 08:44, on Zulip):

Simple upcast would be a huge improvement. No more need to have as_any(&self) -> &Any methods (and mut versions), so it's useful for downcasting as well as up.

Alexander Regueiro (Nov 20 2018 at 16:03, on Zulip):

@dhardy with downcast... how so?

dhardy (Nov 20 2018 at 17:26, on Zulip):

For trait T: Any, implementing the &T → &Any cast allows obj.downcast_ref::<X>() for obj: &T. To do this now, we need to add methods like fn as_any(&self) -> &Any to trait T, then call obj.as_any().downcast_ref::<X>().

But this is just a potential application, nothing to do with the implementation. Would love to have this feature.
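A minimal sketch of the `as_any` workaround described above, written with the later `dyn` syntax. The `Shape`, `Circle` and `Square` names are hypothetical and only serve the illustration; with a built-in upcast, the `as_any` method would no longer be needed.

```rust
use std::any::Any;

// Hypothetical trait standing in for `T: Any` from the discussion.
trait Shape: Any {
    fn name(&self) -> String;
    // Workaround method: returns the same object viewed as `&dyn Any`.
    fn as_any(&self) -> &dyn Any;
}

struct Circle {
    radius: f64,
}

struct Square;

impl Shape for Circle {
    fn name(&self) -> String {
        "circle".to_string()
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl Shape for Square {
    fn name(&self) -> String {
        "square".to_string()
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// Downcast through the workaround: yields the radius only for a `Circle`.
fn radius_of(shape: &dyn Shape) -> Option<f64> {
    shape.as_any().downcast_ref::<Circle>().map(|c| c.radius)
}
```

Calling `radius_of(&Circle { radius: 2.0 })` goes through `as_any()` and then `downcast_ref::<Circle>()`, which is exactly the two-step call mentioned in the message.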

Alexander Regueiro (Nov 20 2018 at 19:08, on Zulip):

@dhardy I don't fully get how Any works internally (some sort of compiler built-in to get the TypeId?). I haven't encountered as_any methods before either. How would it look with upcasting?

dhardy (Nov 20 2018 at 19:17, on Zulip):

Yes, it's just a shim around TypeId and reference casting.
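To make "a shim around TypeId and reference casting" concrete, here is a simplified sketch of the idea. `MyAny`, `type_id_of` and the free function `downcast_ref` are hypothetical stand-ins, not the standard library's actual definitions, though `std::any::Any` follows the same general shape.

```rust
use std::any::TypeId;

// Hypothetical stand-in for `std::any::Any`.
trait MyAny: 'static {
    fn type_id_of(&self) -> TypeId;
}

// Every `'static` type reports its own TypeId.
impl<T: 'static> MyAny for T {
    fn type_id_of(&self) -> TypeId {
        TypeId::of::<T>()
    }
}

// Compare TypeIds, then reinterpret the data pointer as `&T`.
fn downcast_ref<T: 'static>(obj: &dyn MyAny) -> Option<&T> {
    if obj.type_id_of() == TypeId::of::<T>() {
        // SAFETY: the TypeIds match, so the data pointer refers to a `T`.
        Some(unsafe { &*(obj as *const dyn MyAny as *const T) })
    } else {
        None
    }
}
```

The only per-type information needed at runtime is the result of the `type_id_of` method, which is reached through the trait object's vtable like any other method.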

Alexander Regueiro (Nov 20 2018 at 19:33, on Zulip):

ah right.

Alexander Regueiro (Nov 20 2018 at 19:34, on Zulip):

so how is the information about the type stored at runtime?

dhardy (Nov 20 2018 at 19:34, on Zulip):

You know you can click on the [src] link by implementations right? Very convenient way to find out how things work.

Alexander Regueiro (Nov 20 2018 at 19:34, on Zulip):

@dhardy Err no. Haha. Since when has that existed? I feel like a fool now.

dhardy (Nov 20 2018 at 19:35, on Zulip):

Don't know; I think it's fairly new.

dhardy (Nov 20 2018 at 19:35, on Zulip):

Regarding your other question: I don't know; it's the magic of TypeId... probably some data stored in the vtable.

Alexander Regueiro (Nov 20 2018 at 19:36, on Zulip):

makes sense

dhardy (Nov 20 2018 at 19:36, on Zulip):

It's an intrinsic so the API docs don't tell me anything: https://doc.rust-lang.org/std/intrinsics/fn.type_id.html

Alexander Regueiro (Nov 20 2018 at 19:36, on Zulip):

indeed

dhardy (Nov 20 2018 at 19:37, on Zulip):

The first step here is to be able to convert a &A to &B where A: B.
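Until such a conversion is built in, it can be emulated in library code. The sketch below uses a blanket-implemented helper trait so that implementors do not have to write the conversion by hand; `Base`, `Derived`, `AsBase` and `Widget` are hypothetical names chosen for the example.

```rust
trait Base {
    fn id(&self) -> u32;
}

// Helper trait providing the upcast; implemented once for every sized `T: Base`.
trait AsBase {
    fn as_base(&self) -> &dyn Base;
}

impl<T: Base> AsBase for T {
    fn as_base(&self) -> &dyn Base {
        self
    }
}

// `Derived` plays the role of `A` and `Base` the role of `B` in `A: B`.
trait Derived: Base + AsBase {
    fn label(&self) -> String;
}

struct Widget;

impl Base for Widget {
    fn id(&self) -> u32 {
        7
    }
}

impl Derived for Widget {
    fn label(&self) -> String {
        "widget".to_string()
    }
}

// A function that only knows about the supertrait.
fn describe(base: &dyn Base) -> u32 {
    base.id()
}

// Manual upcast from `&dyn Derived` to `&dyn Base`.
fn id_via_base(obj: &dyn Derived) -> u32 {
    describe(obj.as_base())
}
```

The cost of this emulation is one extra vtable entry and a virtual call per upcast, which is what a compiler-level coercion would aim to avoid.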

Alexander Regueiro (Nov 20 2018 at 19:38, on Zulip):

@dhardy Anyway, this is just a nice bonus, as you say. I'd really like to get upcasting implemented in the compiler (and after that multi-trait objects). Let's have a chat with @nikomatsakis about it soon, I think.

Alexander Regueiro (Nov 20 2018 at 19:38, on Zulip):

yes

dhardy (Nov 20 2018 at 19:38, on Zulip):

I don't know the best way of doing that — maybe the vtable pointer can simply be adjusted, or maybe a pointer can be added to the appropriate &A vtable

dhardy (Nov 20 2018 at 19:39, on Zulip):

Something I'd love to see, but I really don't know where to start in the compiler and have several other things to work on, so don't really want to put in a lot of effort there myself

Alexander Regueiro (Nov 20 2018 at 20:04, on Zulip):

Well, we should discuss it first. Then draft up an RFC, since that still needs to be written. Maybe you could just help with that side of things and leave implementation to someone else.

dhardy (Nov 24 2018 at 17:45, on Zulip):

Can you point me to whichever part of the compiler defines the vtables currently (or the current specification)?

Alexander Regueiro (Nov 24 2018 at 22:57, on Zulip):

How do you mean "define"? Codegen is in src/librustc_codegen_llvm/abi.rs (and possibly callee.rs and meth.rs?). they originate during trait selection of course... and there's a lot of MIR-related code that handles them.

Alexander Regueiro (Nov 24 2018 at 22:57, on Zulip):

I'm not an expert however.

dhardy (Dec 03 2018 at 10:59, on Zulip):

I tried having a look at this yesterday but didn't get anywhere. Can you perhaps summarise what the current vtables for trait objects look like?

Alexander Regueiro (Dec 03 2018 at 16:40, on Zulip):

@dhardy I think @nikomatsakis will do a better job than me at this. Will you be around in a few hours? There's a Traits WG meeting, so you can ask him then. :-)

dhardy (Dec 03 2018 at 17:45, on Zulip):

Well, the question stands :-)

The first part is, for traits A: B, what is the minimum we have to do to convert a trait object for A to one for B? Can the vtable for A be a valid vtable for B?

The second bit is which mechanism do we use to do the conversion? I think @nikomatsakis mentioned this could be a coercion.

The third part is how does this generalise to multi-trait objects? I don't see why we couldn't start only supporting the easier cases (e.g. allowing A+B to reduce to A and A+B+C to A+B via compatible prefixes). There are some ideas for more general support in the issue linked at the top of this thread.

Alexander Regueiro (Dec 03 2018 at 22:26, on Zulip):

@dhardy Fair questions. I wish I could help more... sadly (for us) it seems like @nikomatsakis has a busy week planned, but maybe he'll be able to give this a little time.

Alexander Regueiro (Dec 03 2018 at 22:26, on Zulip):

Also, what do you mean "compatible prefixes"?

dhardy (Dec 04 2018 at 10:56, on Zulip):

As it happens, I also have a busy week. I mean, is the first part of the vtable for A+B+C also a valid vtable for A+B? Hopefully we can construct them that way, which makes many upcasts trivial.

Alexander Regueiro (Dec 04 2018 at 16:43, on Zulip):

@dhardy Well, in general you can't guarantee ordering, whatever that ordering is. Since you may want to go from A + B + C -> A + C (unless you're considering that a "non-simple" case?)

Alexander Regueiro (Dec 04 2018 at 16:43, on Zulip):

it's akin to marginalisation (in statistics)

Alexander Regueiro (Dec 04 2018 at 16:44, on Zulip):

I suspect you can reuse such tables even in the general case.

dhardy (Dec 04 2018 at 17:20, on Zulip):

I see... if we can't guarantee ordering of traits within vtables then we can't offer support for A+B+C → A+B without also supporting A+B+C→A+C. And I guess we can't guarantee ordering because we want Box<A+B> to be equivalent to Box<B+A>.

Can we support A+B+C → B by simply adjusting the vtable pointer? I guess since we don't have multi-trait-objects yet this is more a design pointer for that.

What I would like to know is exactly what information is included in vtables: obviously function pointers and presumably also some unique identifier to make TypeId work. Maybe also the data size?

If the vtable for A+B+C is simply the three vtables concatenated together, repeating any common data like the typeid, then A+B+C → A conversions and some of the A+B+C → A+B type conversions (generalising) are simply pointer offsets, known statically. The other conversions need some other mechanism, e.g. a pointer to the A+C vtable or a statically-compiled map from each A+B+C vtable to the corresponding A+C vtable.

By marginalisation are you talking about optimising the layout of A+B+C depending on usage? I guess that's possible but presumably would have to be a link-time optimisation.
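The "concatenated vtables" idea can be modelled in plain Rust. This is a toy model under the assumption that each per-trait table is a `#[repr(C)]` struct of function pointers; it is not how rustc lays out vtables, and all names (`VtA`, `VtABC`, `SAMPLE`, and so on) are hypothetical.

```rust
// Toy per-trait vtables holding one method pointer each.
#[repr(C)]
struct VtA {
    a: fn() -> &'static str,
}

#[repr(C)]
struct VtB {
    b: fn() -> &'static str,
}

#[repr(C)]
struct VtC {
    c: fn() -> &'static str,
}

// The combined table is the three tables laid out back to back.
#[repr(C)]
struct VtABC {
    a: VtA,
    b: VtB,
    c: VtC,
}

fn name_a() -> &'static str { "a" }
fn name_b() -> &'static str { "b" }
fn name_c() -> &'static str { "c" }

static SAMPLE: VtABC = VtABC {
    a: VtA { a: name_a },
    b: VtB { b: name_b },
    c: VtC { c: name_c },
};

// A+B+C -> A: the sub-table sits at offset zero.
fn to_a(v: &VtABC) -> &VtA {
    &v.a
}

// A+B+C -> B: a fixed, statically known offset into the combined table.
fn to_b(v: &VtABC) -> &VtB {
    &v.b
}

// Byte offset of the B sub-table inside the combined table.
fn offset_of_b(v: &VtABC) -> usize {
    (&v.b as *const VtB as usize) - (v as *const VtABC as usize)
}
```

In this model A+B is also a contiguous prefix, but A+C is not a contiguous range of the combined table, which is why that conversion needs a separate table or lookup.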

Alexander Regueiro (Dec 04 2018 at 17:27, on Zulip):

@dhardy well, there's this annoying concept of a "principal trait" in trait objects right now, which gets in the way

Alexander Regueiro (Dec 04 2018 at 17:27, on Zulip):

I'm not sure what to do about that.

dhardy (Dec 04 2018 at 17:28, on Zulip):

What is it?

Alexander Regueiro (Dec 04 2018 at 17:28, on Zulip):

it's the "main" trait for a trait object

Alexander Regueiro (Dec 04 2018 at 17:28, on Zulip):

the first one, basically

Alexander Regueiro (Dec 04 2018 at 17:28, on Zulip):

it must be non-auto unless all the traits are auto

dhardy (Dec 04 2018 at 17:28, on Zulip):

Because right now secondary ones are only bounds like Send with no functions?

Alexander Regueiro (Dec 04 2018 at 17:28, on Zulip):

yes, they don't alter the vtable whatsoever

Alexander Regueiro (Dec 04 2018 at 17:28, on Zulip):

because they are empty

Alexander Regueiro (Dec 04 2018 at 17:29, on Zulip):

all auto traits are, of course

dhardy (Dec 04 2018 at 17:29, on Zulip):

So what would e.g. Debug+Send → Send mean?

dhardy (Dec 04 2018 at 17:29, on Zulip):

Nothing I guess since you cannot do anything with a Box<Send>

Alexander Regueiro (Dec 04 2018 at 17:29, on Zulip):

note that A + B + C isn't a sum of the individual vtables, it's more akin to a product.

Alexander Regueiro (Dec 04 2018 at 17:30, on Zulip):

or a "joint distribution", if we're to retain the statistical analogy

Alexander Regueiro (Dec 04 2018 at 17:31, on Zulip):

well, Box<Send> is still permitted...

dhardy (Dec 04 2018 at 17:32, on Zulip):

The cast does not appear to be permitted currently: https://play.rust-lang.org/?version=stable&mode=debug&edition=2015&gist=fec38804c19fd39354393f33aab7c200

Alexander Regueiro (Dec 04 2018 at 17:32, on Zulip):

indeed, it's an upcast, which aren't permitted at all right now

Alexander Regueiro (Dec 04 2018 at 17:32, on Zulip):

that would just mean not caring about the Debug "column" in the vtable

dhardy (Dec 04 2018 at 17:32, on Zulip):

Well, &(Debug+Send) → &Debug is already permitted

Alexander Regueiro (Dec 04 2018 at 17:33, on Zulip):

sure, but I mean of objects themselves, not references.

dhardy (Dec 04 2018 at 17:33, on Zulip):

I guess it's just a special case of upcast which drops only auto traits
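The special case mentioned here, dropping only an auto trait, is an ordinary coercion. A short sketch using the later `dyn` syntax; the function names are hypothetical.

```rust
use std::fmt::Debug;

// Dropping the auto trait `Send` from a borrowed trait object.
fn drop_send<'a>(obj: &'a (dyn Debug + Send)) -> &'a dyn Debug {
    obj
}

// The same coercion applies to an owning pointer.
fn drop_send_box(obj: Box<dyn Debug + Send>) -> Box<dyn Debug> {
    obj
}
```

Since `Send` has no methods, the result can reuse the same vtable, which matches the earlier remark that auto traits do not alter the vtable.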

dhardy (Dec 04 2018 at 17:33, on Zulip):

A trait object is always a reference

Alexander Regueiro (Dec 04 2018 at 17:33, on Zulip):

not really

Alexander Regueiro (Dec 04 2018 at 17:33, on Zulip):

it's a fat pointer, you mean?

dhardy (Dec 04 2018 at 17:34, on Zulip):

Oh, it can be a Box

dhardy (Dec 04 2018 at 17:34, on Zulip):

that's just an owning reference

dhardy (Dec 04 2018 at 17:34, on Zulip):

But yes, always a fat pointer I think?

Alexander Regueiro (Dec 04 2018 at 17:34, on Zulip):

yes. but when one says "reference", it evokes &/&mut in Rust :-)

Alexander Regueiro (Dec 04 2018 at 17:34, on Zulip):

yes, always a fat pointer
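The fat-pointer claim can be observed from safe code. The sketch below assumes the layout used by current rustc (a data pointer plus a vtable pointer), which is not a documented guarantee; the function names are hypothetical. It also shows that the concrete type's size is recoverable at runtime through the trait object.

```rust
use std::fmt::Debug;
use std::mem::{size_of, size_of_val};

// Width of a trait-object reference, measured in pointer-sized words.
fn fat_pointer_words() -> usize {
    size_of::<&dyn Debug>() / size_of::<usize>()
}

// Width of an ordinary reference to a sized type, for comparison.
fn thin_pointer_words() -> usize {
    size_of::<&u64>() / size_of::<usize>()
}

// Size of the concrete value behind the trait object, looked up at runtime.
fn dynamic_size(obj: &dyn Debug) -> usize {
    size_of_val(obj)
}
```

On current compilers `fat_pointer_words()` is 2 and `thin_pointer_words()` is 1, while `dynamic_size` reports the size of whatever concrete type was erased.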

dhardy (Dec 04 2018 at 17:34, on Zulip):

You remember the @ references we used to have?

Alexander Regueiro (Dec 04 2018 at 17:35, on Zulip):

only vaguely. I didn't properly take up Rust until 1.0.

Alexander Regueiro (Dec 04 2018 at 17:35, on Zulip):

anyway, I think we could get away with just having a vtable with all potential columns included... e.g. we don't need a separate vtable for A + B, A + C, B + C if we have A + B + C.

dhardy (Dec 04 2018 at 17:35, on Zulip):

Good thing we lost them I think

Alexander Regueiro (Dec 04 2018 at 17:35, on Zulip):

yep, for sure

dhardy (Dec 04 2018 at 17:35, on Zulip):

Do we not need separate vtables for e.g. A+C?

dhardy (Dec 04 2018 at 17:36, on Zulip):

How would that work?

dhardy (Dec 04 2018 at 17:36, on Zulip):

Very fat pointers (i.e. a separate pointer to each vtable) maybe?

Alexander Regueiro (Dec 04 2018 at 17:43, on Zulip):

I think we can get away without them. That prevents an exponential explosion of vtables too.

Alexander Regueiro (Dec 04 2018 at 17:43, on Zulip):

(in terms of number of tables)

dhardy (Dec 04 2018 at 17:44, on Zulip):

Do we really get an exponential explosion though? Obviously we need A+B+C if this is used, but I don't think we need to generate A+C unless it's actually used.

dhardy (Dec 04 2018 at 17:45, on Zulip):

Well, this might require generating vtables and conversions at link time, so may not be simple to avoid.

dhardy (Dec 04 2018 at 17:46, on Zulip):

Anyway, how do you propose to avoid them?

Alexander Regueiro (Dec 04 2018 at 17:49, on Zulip):

we get an exponential explosion if they're all used

dhardy (Dec 04 2018 at 17:49, on Zulip):

That also requires the user to write a lot of code. So I don't see the problem.

Alexander Regueiro (Dec 04 2018 at 17:49, on Zulip):

it's not necessarily

Alexander Regueiro (Dec 04 2018 at 17:49, on Zulip):

just an observation

Alexander Regueiro (Dec 04 2018 at 17:50, on Zulip):

and actually, I've been thinking about this wrong

Alexander Regueiro (Dec 04 2018 at 17:50, on Zulip):

we do need to generate combinations

Alexander Regueiro (Dec 04 2018 at 17:50, on Zulip):

so yeah, I don't think there's avoiding it without some really fancy lookup scheme

Alexander Regueiro (Dec 04 2018 at 17:51, on Zulip):

which probably adds runtime overhead

Alexander Regueiro (Dec 04 2018 at 17:51, on Zulip):

undesirable, of course

dhardy (Dec 04 2018 at 17:51, on Zulip):

I don't think it's hard actually.

Alexander Regueiro (Dec 04 2018 at 17:51, on Zulip):

in theory, no

Alexander Regueiro (Dec 04 2018 at 17:51, on Zulip):

in practice... may be some gotchas

Alexander Regueiro (Dec 04 2018 at 17:51, on Zulip):

I'm not sure

dhardy (Dec 04 2018 at 17:51, on Zulip):

The compiler can represent A+B+C+D as a collection of traits, and not generate the vtables until link time

dhardy (Dec 04 2018 at 17:51, on Zulip):

That way it only needs to generate A+B+D vtables if used

Alexander Regueiro (Dec 04 2018 at 17:52, on Zulip):

yes

Alexander Regueiro (Dec 04 2018 at 17:52, on Zulip):

that makes sense to me

dhardy (Dec 04 2018 at 17:52, on Zulip):

If A+B+C+D objects are made, then obviously that vtable is needed

Alexander Regueiro (Dec 04 2018 at 17:52, on Zulip):

I think you need to wait until link-time regardless, because of cross-crate scenarios

dhardy (Dec 04 2018 at 17:52, on Zulip):

The compiler doesn't know yet whether a cast to B+D will be needed

dhardy (Dec 04 2018 at 17:53, on Zulip):

So what you do is wait until link time, then if an A+B+C+D → B+D cast is needed, generate a function taking the vtable for the former and returning the latter

dhardy (Dec 04 2018 at 17:53, on Zulip):

i.e. static table lookup

dhardy (Dec 04 2018 at 17:54, on Zulip):

Since the vtable for A+B+C+D is only created (for each object) at link-time, the B+D vtable and conversion can be created at the same time

dhardy (Dec 04 2018 at 17:56, on Zulip):

This is assuming each vtable is at a fixed address. I guess the address may not be known until the program starts, so this might mean populating the conversion tables at program start; less than ideal but not a real issue since we only have conversions for the types used at this point
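The "static table lookup" proposed in the last few messages can be sketched as a toy model. Everything here is hypothetical: the `VtABCD` and `VtBD` structs stand in for vtables, `Foo` and `Bar` for two concrete types, and `CONVERSIONS` for the table that would be emitted once the set of required casts is known. It assumes each vtable lives at a fixed address, as the message says.

```rust
// Toy vtables: one function pointer per trait.
struct VtABCD {
    a: fn() -> u32,
    b: fn() -> u32,
    c: fn() -> u32,
    d: fn() -> u32,
}

struct VtBD {
    b: fn() -> u32,
    d: fn() -> u32,
}

// Method implementations for two hypothetical concrete types.
fn foo_a() -> u32 { 11 }
fn foo_b() -> u32 { 12 }
fn foo_c() -> u32 { 13 }
fn foo_d() -> u32 { 14 }
fn bar_a() -> u32 { 21 }
fn bar_b() -> u32 { 22 }
fn bar_c() -> u32 { 23 }
fn bar_d() -> u32 { 24 }

static FOO_ABCD: VtABCD = VtABCD { a: foo_a, b: foo_b, c: foo_c, d: foo_d };
static FOO_BD: VtBD = VtBD { b: foo_b, d: foo_d };
static BAR_ABCD: VtABCD = VtABCD { a: bar_a, b: bar_b, c: bar_c, d: bar_d };
static BAR_BD: VtBD = VtBD { b: bar_b, d: bar_d };

// One entry per concrete type: source vtable paired with target vtable.
static CONVERSIONS: [(&VtABCD, &VtBD); 2] =
    [(&FOO_ABCD, &FOO_BD), (&BAR_ABCD, &BAR_BD)];

// A+B+C+D -> B+D: find the entry whose source vtable has the same address.
fn convert(src: &VtABCD) -> Option<&'static VtBD> {
    CONVERSIONS
        .iter()
        .find(|(from, _)| std::ptr::eq(*from, src))
        .map(|(_, to)| *to)
}
```

The linear search keeps the sketch short; the alternative raised earlier in the thread, storing a pointer to the target vtable inside the source vtable, would avoid the lookup at the cost of a larger table.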

dhardy (Dec 04 2018 at 18:00, on Zulip):

BTW I'd still like to know exactly what data we need in the vtable if you can find out?

Alexander Regueiro (Dec 04 2018 at 18:04, on Zulip):

sure, let me speak to people

Alexander Regueiro (Dec 04 2018 at 18:12, on Zulip):

https://internals.rust-lang.org/t/wheres-the-catch-with-box-read-write/6617

Alexander Regueiro (Dec 04 2018 at 18:12, on Zulip):

have you read that yet?

Alexander Regueiro (Dec 13 2018 at 21:14, on Zulip):

CC @nikomatsakis, this is what I was talking about earlier. :-)

Alexander Regueiro (Dec 13 2018 at 21:14, on Zulip):

@dhardy and I could definitely use your input/advice on some of the above.

Last update: Nov 18 2019 at 01:25UTC