Stream: t-compiler

Topic: dylib linking #64872


pnkfelix (Oct 23 2019 at 09:38, on Zulip):

Hey @mw maybe we can chat here rather than via github

mw (Oct 23 2019 at 09:40, on Zulip):

sure

mw (Oct 23 2019 at 09:43, on Zulip):

to confirm: there is no rlib version of libstd anywhere here? (e.g. no left-over file from a previous compilation or something)

pnkfelix (Oct 23 2019 at 09:45, on Zulip):

hmm I thought I did cargo clean first but I'll check

pnkfelix (Oct 23 2019 at 09:48, on Zulip):

after cargo clean && cargo build, we have: .d files for compiler_builtins, core, getopts, std, and test. We also have rlib and rmeta files for libcompiler_builtins, libcore, libgetopts. And we have libstd.dylib.

mw (Oct 23 2019 at 09:48, on Zulip):

ok

pnkfelix (Oct 23 2019 at 09:48, on Zulip):

the minification builds very fast, by the way

mw (Oct 23 2019 at 09:48, on Zulip):

https://dxr.mozilla.org/rust/source/src/librustc_mir/monomorphize/collector.rs#833

pnkfelix (Oct 23 2019 at 09:48, on Zulip):

so if you have a computer handy, you may want to go ahead and try it locally

mw (Oct 23 2019 at 09:49, on Zulip):

this is where rustc checks if there is an upstream impl available

pnkfelix (Oct 23 2019 at 09:49, on Zulip):

(by "builds", of course mean "gets to the final linker failure)

pnkfelix (Oct 23 2019 at 09:49, on Zulip):

this is where rustc checks if there is an upstream impl available

Okay I'll investigate that

mw (Oct 23 2019 at 09:51, on Zulip):

and this is where stuff from dylibs should get filtered out: https://dxr.mozilla.org/rust/source/src/librustc_metadata/cstore_impl.rs#247

pnkfelix (Oct 23 2019 at 09:53, on Zulip):

too bad there's not more debug! instrumentation in those files. :)

pnkfelix (Oct 23 2019 at 09:53, on Zulip):

but I can add it readily enough

mw (Oct 23 2019 at 09:55, on Zulip):

__ZN4core6Object6method17h3c307b2b614e132dE probably comes from libcore?

mw (Oct 23 2019 at 09:55, on Zulip):

i.e. it is already present in libcore.rlib which then gets linked into libstd.dylib?

pnkfelix (Oct 23 2019 at 09:57, on Zulip):

let me see

pnkfelix (Oct 23 2019 at 09:58, on Zulip):

nm -m libcore.rlib says:

...
0000000000000000 (__TEXT,__text) external __ZN4core6Object6method17hacde9ba387d5faafE
...
pnkfelix (Oct 23 2019 at 09:58, on Zulip):

so yes that sounds right

pnkfelix (Oct 23 2019 at 09:58, on Zulip):

is compiling the downstream crate going to transitively look at libcore.rlib?

pnkfelix (Oct 23 2019 at 09:59, on Zulip):

even if the downstream crate only references libstd, not libcore?

mw (Oct 23 2019 at 10:00, on Zulip):

(also, try compiling with -Z symbol_mangling_version=v0 -- then generic args will show up in symbol names)

mw (Oct 23 2019 at 10:01, on Zulip):

is the downstream crate referencing anything from libcore?

pnkfelix (Oct 23 2019 at 10:02, on Zulip):

well, there's kind of a diamond situation

pnkfelix (Oct 23 2019 at 10:02, on Zulip):

libtest has two upstream deps, getopts and std2

pnkfelix (Oct 23 2019 at 10:03, on Zulip):

and then just does extern crate getopts; (and nothing else) in its lib.rs

pnkfelix (Oct 23 2019 at 10:04, on Zulip):

getopts references std::Object

pnkfelix (Oct 23 2019 at 10:04, on Zulip):

(sorry for the potential confusion; at some point I renamed the local std to std2)

mw (Oct 23 2019 at 10:05, on Zulip):

and it fails when trying to link libtest, right?

pnkfelix (Oct 23 2019 at 10:05, on Zulip):

anyway, the point is, libtest does not reference anything from libcore itself. getopts is indirectly referencing trait Object, which is defined in libcore but re-exported from libstd

pnkfelix (Oct 23 2019 at 10:05, on Zulip):

that's right, it fails when trying to link libtest.

mw (Oct 23 2019 at 10:07, on Zulip):

/me has to go get lunch now

pnkfelix (Oct 23 2019 at 10:08, on Zulip):

k. talk later.

mw (Oct 23 2019 at 10:08, on Zulip):

I'm still not entirely clear on what the situation is exactly. I'll take another look later, actually trying to compile the repro locally

pnkfelix (Oct 23 2019 at 10:08, on Zulip):

did it not work out of the box for you?

pnkfelix (Oct 23 2019 at 10:08, on Zulip):

(it might not repro on linux)

mw (Oct 23 2019 at 10:08, on Zulip):

I did not try

pnkfelix (Oct 23 2019 at 10:08, on Zulip):

oh okay

mw (Oct 23 2019 at 10:09, on Zulip):

I only have linux available. we'll see

pnkfelix (Oct 23 2019 at 10:12, on Zulip):

whoa! removing -C debuginfo=2 made the link succeed (?!)

pnkfelix (Oct 23 2019 at 10:12, on Zulip):

I did not expect that...

pnkfelix (Oct 23 2019 at 10:13, on Zulip):

namely, removing it from the compilation of getopts make the link succeed.

pnkfelix (Oct 23 2019 at 11:08, on Zulip):

hmm. I guess we must look at the libcore.rlib and/or libcore.rmeta even after we've compiled libstd to a dylib...?

mw (Oct 23 2019 at 11:17, on Zulip):

Yes, Rust dylibs don't duplicate the crate metadata of rlibs they include (I think)

pnkfelix (Oct 23 2019 at 11:19, on Zulip):

hmm

pnkfelix (Oct 23 2019 at 11:19, on Zulip):

so reviewing things, I don't know how to "fix" this as is

pnkfelix (Oct 23 2019 at 11:20, on Zulip):

We compile libcore as an rlib (only)

pnkfelix (Oct 23 2019 at 11:20, on Zulip):

and then later we compile libstd as a dylib

pnkfelix (Oct 23 2019 at 11:20, on Zulip):

but we've already a decision to export the monomorphization from libcore

pnkfelix (Oct 23 2019 at 11:20, on Zulip):

and the downstream crates see that export in the metadata

pnkfelix (Oct 23 2019 at 11:21, on Zulip):

But its possible the "right" fix here is to change the user's code, so that they start compiling libcore with crate-type = ["dylib", "rlib"]

pnkfelix (Oct 23 2019 at 11:21, on Zulip):

is that what they should have been doing from the outset anyway?

pnkfelix (Oct 23 2019 at 11:22, on Zulip):

Or are you supposed to be able to create crate A as an rlib, and then have crate B wrap it up as a dylib, and downstream crates C, D, etc that only see B should all be unaware that crate A was compiled as an rlib?

pnkfelix (Oct 23 2019 at 11:24, on Zulip):

because if you're supposed to be able to have that setup (with A:rlib, B:dylib, and no one cares), then it seems like we have to have B revise the crate metadata for what it took from A so that it doesn't re-export the generics.

mw (Oct 23 2019 at 11:25, on Zulip):

it should work

mw (Oct 23 2019 at 11:26, on Zulip):

https://dxr.mozilla.org/rust/source/src/librustc_metadata/cstore_impl.rs#247 this piece of code is supposed to filter monomorphizations out in the crate that would use them

mw (Oct 23 2019 at 11:26, on Zulip):

and symbol name disambiguation should take care of there being no symbol conflicts

pnkfelix (Oct 23 2019 at 11:27, on Zulip):

ah sorry, you linked that before and I didn't include it in my instrumentation

mw (Oct 23 2019 at 11:28, on Zulip):

the error does not seem to occur on linux, btw :/

pnkfelix (Oct 23 2019 at 11:29, on Zulip):

right, I think the original bug filer noted the same

pnkfelix (Oct 23 2019 at 11:33, on Zulip):

ugh rustc_metadata doesn't have debug! logging. :(

pnkfelix (Oct 23 2019 at 11:34, on Zulip):

... or... I can't use in the provide! macro...?

pnkfelix (Oct 23 2019 at 11:36, on Zulip):

or, heh, i have to import it.

pnkfelix (Oct 23 2019 at 11:37, on Zulip):

/me knows how to program in Rust, really

mw (Oct 23 2019 at 11:39, on Zulip):

on linux, the symbol is question is listed as undefined by nm for libtest.so

mw (Oct 23 2019 at 11:40, on Zulip):

but the linker just doesn't seem to care there?

mw (Oct 23 2019 at 11:40, on Zulip):

because it's also unused?

pnkfelix (Oct 23 2019 at 11:40, on Zulip):

hmm

pnkfelix (Oct 23 2019 at 11:43, on Zulip):

https://dxr.mozilla.org/rust/source/src/librustc_metadata/cstore_impl.rs#247 this piece of code is supposed to filter monomorphizations out in the crate that would use them

in the above statement, which crate do you expect to do the filtering here? getopts?

pnkfelix (Oct 23 2019 at 11:45, on Zulip):

in my instrumentation, remove_generics always ends up false (for both getopts and for test)

pnkfelix (Oct 23 2019 at 11:46, on Zulip):

for getopts, the linkage is always None. For test, the linkage is always Some(Static). (and for std, the linkage is always Some(Static))

mw (Oct 23 2019 at 11:46, on Zulip):

libgetopts

mw (Oct 23 2019 at 11:49, on Zulip):

so, when compiling getopts then remove_generics should be true for libcore and libstd

pnkfelix (Oct 23 2019 at 11:49, on Zulip):

okay. I can dig more into why its not

pnkfelix (Oct 23 2019 at 11:59, on Zulip):

hmm. we are passing -C prefer-dynamic to compile libstd...

pnkfelix (Oct 23 2019 at 11:59, on Zulip):

and that affects the dependency_formats computation

pnkfelix (Oct 23 2019 at 12:00, on Zulip):

namely here: https://dxr.mozilla.org/rust/source/src/librustc_metadata/dependency_format.rs#85

pnkfelix (Oct 23 2019 at 12:01, on Zulip):

but for getopts we are not passing -C prefer-dynamic

pnkfelix (Oct 23 2019 at 12:02, on Zulip):

Nope I'm not going to be able to just walk through this code

mw (Oct 23 2019 at 12:02, on Zulip):

you see why I want to get rid of Rust dylibs :P

pnkfelix (Oct 23 2019 at 12:02, on Zulip):

I'll have to instrument depedency_format::calculate_type to figure out what's going on there

mw (Oct 23 2019 at 12:02, on Zulip):

I think only @Alex Crichton really understands dependency formats

mw (Oct 23 2019 at 12:05, on Zulip):

but for getopts we are not passing -C prefer-dynamic

you might be on to something here. @Alex Crichton might be better able to help you.

pnkfelix (Oct 23 2019 at 12:05, on Zulip):

yeah. I figure I'll get a little more info about what's happening whtin dependency_format

pnkfelix (Oct 23 2019 at 12:05, on Zulip):

(just adding -C prefer-dynamic to the getopts compilation didn't fix anything, though)

pnkfelix (Oct 23 2019 at 12:36, on Zulip):

hey @mw , in https://dxr.mozilla.org/rust/source/src/librustc_metadata/cstore_impl.rs#247 , it currently sets remove_generics to true only if it sees linkage = Some(IncludedFromDylib) or linkage = Some(Dynamic). But what is linkage = None supposed to denote there?

pnkfelix (Oct 23 2019 at 12:37, on Zulip):

I guess you already pointed out that only @Alex Crichton understands dependency formats

pnkfelix (Oct 23 2019 at 12:39, on Zulip):

(anyway I would think that if its None, we shouldn't be assuming we can load an upstream dep from it)

mw (Oct 23 2019 at 12:40, on Zulip):

yes, I think this is a matrix of crate-nums

mw (Oct 23 2019 at 12:40, on Zulip):

and None means that that crates just does not dependent on that other crate

pnkfelix (Oct 23 2019 at 12:41, on Zulip):

and if that's the case, then there shouldn't be anything to remove in the first place?

pnkfelix (Oct 23 2019 at 12:42, on Zulip):

(or rather, there shouldn't be any imports, and therefore it shouldn't matter if we remove the generics or no)

mw (Oct 23 2019 at 12:42, on Zulip):

yeah, the matrix thing actually doesn;'t make sense, does it?

mw (Oct 23 2019 at 12:42, on Zulip):

maybe crate nums would have to be translated?

mw (Oct 23 2019 at 12:44, on Zulip):

I'm starting to think more and more that that code does not do what I thought it did

mw (Oct 23 2019 at 12:45, on Zulip):

what I thought it did was: if the upstream crate in question is a dylib or an rlib that comes packaged as part of a dylib, then remove the generics

pnkfelix (Oct 23 2019 at 12:49, on Zulip):

and None means that that crates just does not dependent on that other crate

in the above, do you mean "not immediately dependent" ?

pnkfelix (Oct 23 2019 at 12:49, on Zulip):

e.g. getopts says it depends just on libstd

pnkfelix (Oct 23 2019 at 12:50, on Zulip):

and so it only has a single crate in its tcx.sess.crate_types (which I believe corresponds to libstd)

pnkfelix (Oct 23 2019 at 12:50, on Zulip):

but you still have DefId's running around for other crates like core

pnkfelix (Oct 23 2019 at 12:50, on Zulip):

So there can be an indirect dependency

pnkfelix (Oct 23 2019 at 12:50, on Zulip):

right?

mw (Oct 23 2019 at 12:51, on Zulip):

yeah, and shouldn't it also depend on std2 in the example?

pnkfelix (Oct 23 2019 at 12:51, on Zulip):

oh, sorry, std2 is the std

pnkfelix (Oct 23 2019 at 12:51, on Zulip):

I think that supposed alpha-rename in my code is causing more harm than good in our discussion. :)

pnkfelix (Oct 23 2019 at 12:52, on Zulip):
[dependencies.std2]
path = "../libstd"
pnkfelix (Oct 23 2019 at 13:01, on Zulip):

Okay, here's an idea: collector::should_monomorphize_locally should, in its determination of whether it can link to something upstream, incorporate a similar computation to what cstore_impl::exported_symbols is doing. Right?

pnkfelix (Oct 23 2019 at 13:02, on Zulip):

that is, we should be trying to use the same logic when we tag a DefId as being something we can reliably link to from upstream

pnkfelix (Oct 23 2019 at 13:04, on Zulip):

that is, from the conversation so far, it sounds like a mismatch in those two computations within the compilation of getopts

pnkfelix (Oct 23 2019 at 13:04, on Zulip):

(as opposed to a mismatch across compiles of two (or more) distinct crates)

pnkfelix (Oct 23 2019 at 13:12, on Zulip):

(and now, reading over the comment thread on PR #64324, I'm relieved to note that it seems like everyone, including @Alex Crichton , finds the code here confusing...)

mw (Oct 23 2019 at 13:39, on Zulip):

should_monomorphize_locally is based on upstream_monomorphization_for which in turn is based on exported_symbols, so the check is already done

pnkfelix (Oct 23 2019 at 13:40, on Zulip):

hmm

mw (Oct 23 2019 at 13:41, on Zulip):

so those are actually already factored in a way as not to duplicate the logic

pnkfelix (Oct 23 2019 at 13:41, on Zulip):

The result is nonetheless inconsistent

mw (Oct 23 2019 at 13:42, on Zulip):

because the filtering in cstore_impl::exported_symbols seems to be broken, somehow

pnkfelix (Oct 23 2019 at 13:42, on Zulip):

Right

pnkfelix (Oct 23 2019 at 13:43, on Zulip):

What do you think of this:

@@ -243,17 +245,32 @@ provide! { <'tcx> tcx, def_id, other, cdata,
         // from this crate.
         let formats = tcx.dependency_formats(LOCAL_CRATE);
         let remove_generics = formats.iter().any(|(_ty, list)| {
-            match list.get(def_id.krate.as_usize() - 1) {
+            let linkage = list.get(def_id.krate.as_usize() - 1);
+            match linkage {
                 Some(Linkage::IncludedFromDylib) | Some(Linkage::Dynamic) => true,
+
+                // rust-lang/rust#64872 if we got `None`, then this
+                // must be an indirect dependency (e.g. a symbol in
+                // `core` that we're accessing through `std`). Best to
+                // assume we cannot load such monomorphization either.
+                None => true,
+
                 _ => false,
              }
         });
pnkfelix (Oct 23 2019 at 13:44, on Zulip):

I'm not thrilled with the fact that I don't really understand the linkage computation.

pnkfelix (Oct 23 2019 at 13:45, on Zulip):

i.e., should dependency_formats instead be returning a list that includes the crates that we depend on transitively?

mw (Oct 23 2019 at 13:47, on Zulip):

yeah, looks like None and Some(NotLinked) are the same

pnkfelix (Oct 23 2019 at 13:48, on Zulip):

... but this doesn't set remove_generics to true for Some(NotLinked) either ...

mw (Oct 23 2019 at 13:48, on Zulip):

which seems wrong too

mw (Oct 23 2019 at 13:48, on Zulip):

but maybe in that case the list is always empty to begin with?

pnkfelix (Oct 23 2019 at 13:48, on Zulip):

you mean how Some(NotLinked) is not handled?

mw (Oct 23 2019 at 13:49, on Zulip):

you mean how Some(NotLinked) is not handled?

yes

pnkfelix (Oct 23 2019 at 13:49, on Zulip):

Maybe.

pnkfelix (Oct 23 2019 at 13:49, on Zulip):

Certainly the list in this case is non-empty.

pnkfelix (Oct 23 2019 at 13:49, on Zulip):

Okay well the diff I showed you above does "fix" this

pnkfelix (Oct 23 2019 at 13:49, on Zulip):

Not sure if its a good fix

mw (Oct 23 2019 at 13:50, on Zulip):

I''ve added some debug output that gives me the following for getopts:

mw (Oct 23 2019 at 13:50, on Zulip):
crate: "core"
 - (NonGeneric(DefId(2:16 ~ core[ecd7]::unused[0])), Rust)
 - (Generic(DefId(2:11 ~ core[ecd7]::Object[0]::method[0]), [&u32]), Rust)
crate: "std"
 - (NonGeneric(DefId(1:23 ~ std[c479]::__rdl_alloc[0])), Rust)
 - (NonGeneric(DefId(1:26 ~ std[c479]::__rdl_alloc_zeroed[0])), Rust)
 - (NonGeneric(DefId(1:25 ~ std[c479]::__rdl_realloc[0])), Rust)
 - (NonGeneric(DefId(1:24 ~ std[c479]::__rdl_dealloc[0])), Rust)
 - (NonGeneric(DefId(1:20 ~ std[c479]::rust_eh_personality[0])), Rust)
crate: "compiler_builtins"
 - (NonGeneric(DefId(3:13 ~ compiler_builtins[6753]::probestack[0]::__rust_probestack[0])), Rust)
mw (Oct 23 2019 at 13:51, on Zulip):

Not sure if its a good fix

Maybe it should be:

            match linkage {
                 Some(Linkage::Static) => false,
                 _ => true,
            }
pnkfelix (Oct 23 2019 at 13:52, on Zulip):

heh

pnkfelix (Oct 23 2019 at 13:52, on Zulip):

I like it.

pnkfelix (Oct 23 2019 at 13:53, on Zulip):

Every other case does indeed sound like something where you shoudn't be attempting to load monomorphizations

mw (Oct 23 2019 at 13:54, on Zulip):

(#64324 already is kind of a work-around)

pnkfelix (Oct 23 2019 at 13:54, on Zulip):

Okay I'll see about putting up a PR with that.

pnkfelix (Oct 23 2019 at 13:54, on Zulip):

hardest part will be making unit test(s) that accurately capture the original problem.

mw (Oct 23 2019 at 13:54, on Zulip):

Yeah, r? Alex on it to be sure :)

pnkfelix (Oct 23 2019 at 13:55, on Zulip):

I guess I could just turn my current reduction into a run-make test

pnkfelix (Oct 23 2019 at 13:56, on Zulip):

or do we already have tests (outside of run-make) that exercise no_core+no_std? I guess I should look for that first.

Alex Crichton (Oct 23 2019 at 14:25, on Zulip):

:wave:

Alex Crichton (Oct 23 2019 at 14:25, on Zulip):

reading some backscroll now @pnkfelix

Alex Crichton (Oct 23 2019 at 14:26, on Zulip):

sounds like y'all got a fix in the meantime though?

Alex Crichton (Oct 23 2019 at 14:26, on Zulip):

happy to help clear things up if anything remains

Alex Crichton (Oct 23 2019 at 14:26, on Zulip):

dependency_format is the bane of my existence

Alex Crichton (Oct 23 2019 at 14:26, on Zulip):

I'm very sad I had to write that code

pnkfelix (Oct 23 2019 at 15:29, on Zulip):

@Alex Crichton what do you think of the hypotheses up above? Especially with respect to the meaning of linkage == None ?

Alex Crichton (Oct 23 2019 at 15:30, on Zulip):

@pnkfelix oh hm I didn't read closely enough to see the possible hypotheses

Alex Crichton (Oct 23 2019 at 15:30, on Zulip):

do you have a link to the message?

pnkfelix (Oct 24 2019 at 08:01, on Zulip):

@Alex Crichton sorry, i missed your follow up Q yesterday. This is the specific thing I was asking about: https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/dylib.20linking.20.2364872/near/178848790

Alex Crichton (Oct 24 2019 at 14:54, on Zulip):

@pnkfelix oh ok that sounds reasonable to me as a possible solution

Alex Crichton (Oct 24 2019 at 14:54, on Zulip):

if it passes tests sounds good to me :)

pnkfelix (Oct 30 2019 at 15:36, on Zulip):

hey @Alex Crichton , regarding your comment on the -Z flag: I'm not sure if I'm communicating the situation clearly here

pnkfelix (Oct 30 2019 at 15:36, on Zulip):

My goal at this point is to address the semantic regression

pnkfelix (Oct 30 2019 at 15:37, on Zulip):

If there is a performance regression in terms of lack of sharing, that is a evidence of further work that does indeed need to be done

pnkfelix (Oct 30 2019 at 15:37, on Zulip):

But I do not think we should be blocking PR #65781 based on the question of whether we witness a reduction in shared generics

pnkfelix (Oct 30 2019 at 15:41, on Zulip):

Or maybe let me put it this way: Can you, in your feedback, please address the scenario I described in my github comment on the PR ?

pnkfelix (Oct 30 2019 at 15:47, on Zulip):

(Of course one possible response to all of this is to say "we should not be prioritizing a fix to the linkage problem of rlib <- dylib <- rlib <- dylib over the (as of yet unmeasured) cost of less sharing of generics for rlibs where are currently observing None for the dependency_format calculation")

pnkfelix (Oct 30 2019 at 15:49, on Zulip):

I'm going to nominate this for discussion at the team meeting tomorrow; I think I need more input (unless I find a better fix in the meantime that is also readily backportable)

Alex Crichton (Oct 30 2019 at 16:13, on Zulip):

@pnkfelix oh sorry yeah sure, I'll leave comments on the PR

pnkfelix (Oct 30 2019 at 17:32, on Zulip):

Thanks for the feedback and background info @Alex Crichton !!! I think the steps you outlined sound good

Last update: Nov 22 2019 at 05:00UTC