Stream: t-compiler/wg-incr-comp

Topic: #79560 Internal Compiler Error while compiling diesel


view this post on Zulip Wesley Wiser (Dec 01 2020 at 12:21):

@Santiago Pastorino https://github.com/rust-lang/rust/blob/ffe52882ed79be67344dd6085559e308241e7f60/compiler/rustc_query_system/src/query/plumbing.rs#L581-L593

view this post on Zulip Wesley Wiser (Dec 01 2020 at 12:21):

#48923 looks strongly related to me

view this post on Zulip Wesley Wiser (Dec 01 2020 at 12:22):

https://github.com/rust-lang/rust/issues/48923#issuecomment-374681537

view this post on Zulip Wesley Wiser (Dec 01 2020 at 12:23):

Ident uses Symbol internally so this could be a very similar issue

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 12:32):

Wesley Wiser said:

Santiago Pastorino https://github.com/rust-lang/rust/blob/ffe52882ed79be67344dd6085559e308241e7f60/compiler/rustc_query_system/src/query/plumbing.rs#L581-L593

yeah, have seen this and the problem seems to be because of 2.

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 12:32):

Wesley Wiser said:

#48923 looks strongly related to me

taking a look at this ...

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 12:34):

opening a new thread for this ...

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 12:35):

on the issue it says ...

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 12:35):

" It can be compiled once (after "cargo clean"), with CARGO_INCREMENTAL=1"

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 12:35):

in my case it can't even be compiled once

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 12:35):

still reading ...

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 12:44):

@Wesley Wiser seems like a similar problem, yeah

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 12:44):

would need to debug this

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 12:47):

any debugging tip for this? other than just placing debug! calls here and there?

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 12:48):

unsure if we have some way to run and dump some incr compilation information

view this post on Zulip nikomatsakis (Dec 01 2020 at 13:14):

I don't know of any way to dump better info

view this post on Zulip nikomatsakis (Dec 01 2020 at 13:14):

but it does seem plausible that this could lead to the same problem

view this post on Zulip nikomatsakis (Dec 01 2020 at 13:15):

though I'm not sure

view this post on Zulip nikomatsakis (Dec 01 2020 at 13:15):

maybe not specifically a problem with gensym

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 20:00):

DEBUG rustc_middle::dep_graph::dep_node to_fingerprint: (test[317d]::Deserialize::deserialize, test[317d]::Deserialize::deserialize::D, Error)
DEBUG rustc_middle::dep_graph::dep_node to_fingerprint#def_path_hash1: 46388f61ff57330d-8ef732f8c03e2f10
DEBUG rustc_middle::dep_graph::dep_node to_fingerprint#def_path_hash2: e2b3ab1d0dd1b0c9-85e71e2a2f063f35
DEBUG rustc_middle::dep_graph::dep_node to_fingerprint#ident: a78981997a1f2967-8341e972ee8212d0
DEBUG rustc_middle::dep_graph::dep_node to_fingerprint: (test[317d]::{impl#0}::deserialize, test[317d]::{impl#0}::deserialize::D, Error)
DEBUG rustc_middle::dep_graph::dep_node to_fingerprint#def_path_hash1: 2e00cbd8dce5e78b-e67cc8dccb8280b6
DEBUG rustc_middle::dep_graph::dep_node to_fingerprint#def_path_hash2: ebf144ecf4c8c8e7-aa4a2d4229f5bfa1
DEBUG rustc_middle::dep_graph::dep_node to_fingerprint#ident: f5d376ec0e1d1d43-8c120db5448a65bc
DEBUG rustc_middle::dep_graph::dep_node to_fingerprint: (test[317d]::{impl#1}::deserialize, test[317d]::{impl#1}::deserialize::D, Error)
DEBUG rustc_middle::dep_graph::dep_node to_fingerprint#def_path_hash1: 2828145dc72efaea-74cbee32e3cb672e
DEBUG rustc_middle::dep_graph::dep_node to_fingerprint#def_path_hash2: 63ffbbeee9a2026e-59bd904979d916d
DEBUG rustc_middle::dep_graph::dep_node to_fingerprint#ident: f5d376ec0e1d1d43-8c120db5448a65bc

---

DEBUG rustc_middle::dep_graph::dep_node to_fingerprint: (test[317d]::Deserializer, Some(Error#0))
DEBUG rustc_middle::dep_graph::dep_node to_fingerprint#def_path_hash: 2440077b7e61b04a-f8b5b58c1f0bb9f0
DEBUG rustc_middle::dep_graph::dep_node to_fingerprint#ident: a78981997a1f2967-8341e972ee8212d0
DEBUG rustc_middle::dep_graph::dep_node to_fingerprint: (test[317d]::Deserializer, Some(Error#7))
DEBUG rustc_middle::dep_graph::dep_node to_fingerprint#def_path_hash: 2440077b7e61b04a-f8b5b58c1f0bb9f0
DEBUG rustc_middle::dep_graph::dep_node to_fingerprint#ident: f5d376ec0e1d1d43-8c120db5448a65bc
DEBUG rustc_middle::dep_graph::dep_node to_fingerprint: (test[317d]::Deserializer, Some(Error#8))
DEBUG rustc_middle::dep_graph::dep_node to_fingerprint#def_path_hash: 2440077b7e61b04a-f8b5b58c1f0bb9f0
DEBUG rustc_middle::dep_graph::dep_node to_fingerprint#ident: f5d376ec0e1d1d43-8c120db5448a65bc

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 20:00):

@Wesley Wiser @nikomatsakis :point_up:

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 20:01):

the first part is ok

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 20:02):

in the second part as you can see for the Ident Error#7 and Error#8 I'm getting the same fingerprint

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 20:06):

the code was ...

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 20:06):

trait Deserializer {
    type Error;
}

trait Deserialize {
    fn deserialize<D>(_: D) -> D::Error
    where
        D: Deserializer;
}

macro_rules! impl_deserialize {
    ($name:ident) => {
        impl Deserialize for $name {
            fn deserialize<D>(_: D) -> D::Error
            where
                D: Deserializer,
            {
                loop {}
            }
        }
    };
}

macro_rules! formats {
    {
        $($name:ident,)*
    } => {
        $(
            pub struct $name;

            impl_deserialize!($name);
        )*
    }
}
formats! { Foo, Bar, }

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 20:07):

I think I shouldn't be getting the same fingerprint because one's D::Error inside Foo's declaration and the other one is on Bar's declaration

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 20:31):

also ...

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 20:31):

DEBUG rustc_middle::dep_graph::dep_node to_fingerprint: (test[317d]::Deserializer, Some(Error#7))
DEBUG rustc_middle::dep_graph::dep_node to_fingerprint#def_path_hash: 2440077b7e61b04a-f8b5b58c1f0bb9f0
DEBUG rustc_middle::dep_graph::dep_node to_fingerprint#ident: f5d376ec0e1d1d43-8c120db5448a65bc
DEBUG rustc_middle::dep_graph::dep_node to_fingerprint#ident#hash: 2b4b3c2f82ba3d31-e4fadc0c321fe9c5
DEBUG rustc_middle::dep_graph::dep_node to_fingerprint: (test[317d]::Deserializer, Some(Error#8))
DEBUG rustc_middle::dep_graph::dep_node to_fingerprint#def_path_hash: 2440077b7e61b04a-f8b5b58c1f0bb9f0
DEBUG rustc_middle::dep_graph::dep_node to_fingerprint#ident: f5d376ec0e1d1d43-8c120db5448a65bc
DEBUG rustc_middle::dep_graph::dep_node to_fingerprint#ident#hash: b32f1c09bee5a8ec-924268d9b1f8e736

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 20:32):

as I was guessing hash_stable is the same but hash differs

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 20:33):

impl<CTX> HashStable<CTX> for Symbol {
    #[inline]
    fn hash_stable(&self, hcx: &mut CTX, hasher: &mut StableHasher) {
        self.as_str().hash_stable(hcx, hasher);
    }
}

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 20:34):

it's clear why is this happening, what's not clear to me is how to fix it :)

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 20:35):

would I need to use InternedString

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 20:35):

like #49695

view this post on Zulip Wesley Wiser (Dec 01 2020 at 20:44):

I could be wrong, but I think this is sort of the opposite situation from #49695. In that one, we had two different Symbols that logically should have produced the same value but did not because their hygiene was different so using InternedString which just cares about the string value itself worked.

As you say, in this case, these actually refer to different things (the expanded Error for Foo and the expanded Error for Bar). But it looks like what's happening is that we're getting the same hash values (because it uses self.as_str()) when they really should be different. This leads to colliding fingerprints for different DepNodes

view this post on Zulip Wesley Wiser (Dec 01 2020 at 20:46):

It kind of seems to me that the bug is that <Symbol as HashStable>::hash_stable does not take hygiene or whatever into account.

view this post on Zulip Wesley Wiser (Dec 01 2020 at 20:46):

But perhaps fixing that will break lots of other parts of the compiler. IDK.

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 20:47):

yeah, what you're saying makes sense and it's right that InternetString won't be a solution

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 20:47):

I wouldn't change that, shouldn't we just wrap Ident in a different struct for this case or something like that?

view this post on Zulip Wesley Wiser (Dec 01 2020 at 20:48):

That seems like a pragmatic solution to me :smile:

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 20:48):

well, maybe we should pass a type that has some kind of context that makes these things different

view this post on Zulip Wesley Wiser (Dec 01 2020 at 20:48):

At the very least, it's worth a test to see if it resolves the issue. That would pretty much confirm what's happening.

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 20:48):

because these two things as Idents are the same if I'm not wrong

view this post on Zulip Wesley Wiser (Dec 01 2020 at 20:49):

Er yeah

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 20:49):

they involve Symbol and Span which are ==

view this post on Zulip Wesley Wiser (Dec 01 2020 at 20:49):

Wrap Symbol and implement HashStable correctly

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 20:49):

hmm what would that mean?

view this post on Zulip Wesley Wiser (Dec 01 2020 at 20:54):

Huh, I'd assumed the #8 part of Error#8 was some kind of scoping identifier or something but I don't see anything like that inside Symbol

view this post on Zulip Wesley Wiser (Dec 01 2020 at 20:54):

Is that actually just part of the string?

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 20:58):

I was confused by the same

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 20:58):

but I guess that's something related to the fact that there are two instances in memory of that type

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 20:58):

and actually the hash is different

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 20:58):

because the values in memory are different

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 20:59):

but when you hash_stable you get the same value as expected

view this post on Zulip Wesley Wiser (Dec 01 2020 at 20:59):

It can't be part of the Symbol text itself then because Error#7 and Error#8 get the same hash

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 20:59):

I'm not sure about Symbol but the Ident's hash are different

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 21:00):

what's the same is the hash_stable not the hash

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 21:00):

let me check the code again

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 21:01):

impl Hash for Ident {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.name.hash(state);
        self.span.ctxt().hash(state);
    }
}

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 21:02):

Symbol derives Hash

view this post on Zulip Santiago Pastorino (Dec 01 2020 at 21:02):

so yeah, Symbol hash is also different

view this post on Zulip Wesley Wiser (Dec 01 2020 at 21:04):

I believe Symbol::hash is implemented by hashing the pointer/index to the string in the intern table. <Symbol as HashStable>::hash_stable hashes the text content itself.

view this post on Zulip Wesley Wiser (Dec 01 2020 at 21:06):

If you can figure out where the #8 thing is coming from, that will probably tell you what data needs to be hashed as well as the string content.

view this post on Zulip Wesley Wiser (Dec 01 2020 at 21:09):

@nikomatsakis Might know off hand or be able to point you in the right direction as well as confirm if my analysis here is off base or not.

view this post on Zulip nikomatsakis (Dec 01 2020 at 23:04):

So

view this post on Zulip nikomatsakis (Dec 01 2020 at 23:04):

It seems like the problem is related to gensym, eh? Or at least hygiene

view this post on Zulip nikomatsakis (Dec 01 2020 at 23:05):

I guess that @Vadim Petrochenkov has important context here

view this post on Zulip nikomatsakis (Dec 01 2020 at 23:05):

one other note @Santiago Pastorino is that we maybe didn't want Ident in the first place, as I guess that includes a Span

view this post on Zulip nikomatsakis (Dec 01 2020 at 23:06):

or...maybe I'm confused, sorry

view this post on Zulip nikomatsakis (Dec 01 2020 at 23:06):

actually, I don't think I am

view this post on Zulip nikomatsakis (Dec 01 2020 at 23:07):

Santiago Pastorino said:

would I need to use InternedString

I think using InternedString could actually work

view this post on Zulip nikomatsakis (Dec 01 2020 at 23:08):

it is maybe not ideal

view this post on Zulip nikomatsakis (Dec 01 2020 at 23:08):

the ideal would be probably to use Symbol, since hygiene info is relevant

view this post on Zulip nikomatsakis (Dec 01 2020 at 23:09):

but if we passed in just the "plain text" and had a less precise comparison

view this post on Zulip nikomatsakis (Dec 01 2020 at 23:09):

that would still solve the bug

view this post on Zulip Vadim Petrochenkov (Dec 01 2020 at 23:17):

We no longer have gensym Symbols, all the hygiene bits are in Span now.

view this post on Zulip Vadim Petrochenkov (Dec 01 2020 at 23:17):

Symbol is just a string now.

view this post on Zulip Vadim Petrochenkov (Dec 01 2020 at 23:18):

InternedString is also gone.

view this post on Zulip Vadim Petrochenkov (Dec 01 2020 at 23:20):

Looks like it was renamed to SymbolStr, I don't quite remember when it happened.

view this post on Zulip mw (Dec 02 2020 at 13:35):

I can confirm that in the past these kinds of bugs were always related to StableHash impls ignoring information, often related to things like hygiene information.

view this post on Zulip Santiago Pastorino (Dec 02 2020 at 14:30):

ok, thanks everyone for sharing this info

view this post on Zulip Santiago Pastorino (Dec 02 2020 at 14:30):

I don't know what kind of hygiene info is being ignored in this case

view this post on Zulip Santiago Pastorino (Dec 02 2020 at 14:31):

well as Niko have said, I may need to not use Ident and change to something else

view this post on Zulip Santiago Pastorino (Dec 02 2020 at 14:31):

but I have no idea what to use instead to fix this problem and be able to provide all the needed information

view this post on Zulip Wesley Wiser (Dec 02 2020 at 14:36):

@Santiago Pastorino As petrochenkov said, I think you want to use SymbolStr

view this post on Zulip Santiago Pastorino (Dec 02 2020 at 14:37):

can try that but don't understand why that would solve the issue

view this post on Zulip Santiago Pastorino (Dec 02 2020 at 14:38):

I meant, I guess those hash stables would differ but unsure why/how

view this post on Zulip Santiago Pastorino (Dec 02 2020 at 14:40):

impl<CTX> HashStable<CTX> for SymbolStr {
    #[inline]
    fn hash_stable(&self, hcx: &mut CTX, hasher: &mut StableHasher) {
        self.string.hash_stable(hcx, hasher)
    }
}

view this post on Zulip Santiago Pastorino (Dec 02 2020 at 14:40):

seems like unless I encode some information inside the string it will return the same hash_stable

view this post on Zulip mw (Dec 03 2020 at 09:18):

The hashing mismatch might creep in through the special HashStable impl for Span (which is a field of Ident): https://github.com/rust-lang/rust/blob/c7cff213e937c1bb301be807ce04fcf6092b9163/compiler/rustc_span/src/lib.rs#L1853-L1931

view this post on Zulip mw (Dec 03 2020 at 09:19):

All "invalid" spans hash to the same value in HashStable, but Hash and Eq are based on the spans actual bits.

view this post on Zulip mw (Dec 03 2020 at 09:19):

So using Symbol or SymbolStr instead of Ident might solve the problem.

view this post on Zulip Santiago Pastorino (Dec 03 2020 at 14:59):

mw said:

The hashing mismatch might creep in through the special HashStable impl for Span (which is a field of Ident): https://github.com/rust-lang/rust/blob/c7cff213e937c1bb301be807ce04fcf6092b9163/compiler/rustc_span/src/lib.rs#L1853-L1931

I'm clearly not following something here or in general

view this post on Zulip Santiago Pastorino (Dec 03 2020 at 15:00):

I'm getting two "different" Idents to give the same hash_stable

view this post on Zulip Santiago Pastorino (Dec 03 2020 at 15:00):

@lcnr was telling me the same thing as you

view this post on Zulip Santiago Pastorino (Dec 03 2020 at 15:01):

Ident is Symbol + Span and if I'm getting the same hash_stable I don't understand how getting rid of Span would now give different hash_stable values

view this post on Zulip bjorn3 (Dec 03 2020 at 15:01):

Error#7 and Error#8 are the same Ident (Error), but the disambiguator, which should also be included in hash_stable is different.

view this post on Zulip bjorn3 (Dec 03 2020 at 15:02):

https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/definitions/struct.DisambiguatedDefPathData.html

view this post on Zulip Santiago Pastorino (Dec 03 2020 at 15:38):

bjorn3 said:

Error#7 and Error#8 are the same Ident (Error), but the disambiguator, which should also be included in hash_stable is different.

so you meant something different to what @mw and @lcnr are saying, what you're suggesting is that hash_stable implementation for Ident should include the disambiguator information?

view this post on Zulip bjorn3 (Dec 03 2020 at 15:40):

No, the Ident should be identical (except for maybe the Span), but the DisambiguatedDefPathData, which contains the disambiguator should be different and thus result in a different hash_stable.

view this post on Zulip Esteban Küber (Dec 03 2020 at 15:50):

Summary of my understanding quickly reading which I think explains the confusion: Ident has a span and the error is understood by some to be that two different errors have the same ident (because their name and span are equivalent, but are supposed to be different), while others understand that two idents that should Eq, don't purely because of the span.
If the later is the case, then using SymbolStr instead does fix the issue.

view this post on Zulip Santiago Pastorino (Dec 03 2020 at 19:06):

in cases like this is where I again confirm that I have severe english understanding issues :'(

view this post on Zulip Santiago Pastorino (Dec 03 2020 at 19:07):

bjorn3 said:

No, the Ident should be identical (except for maybe the Span), but the DisambiguatedDefPathData, which contains the disambiguator should be different and thus result in a different hash_stable.

ok, I got what you meant, but I'm not sure what are you suggesting. Are you suggesting to use the disambiguator?

view this post on Zulip Santiago Pastorino (Dec 03 2020 at 19:10):

Esteban Küber said:

Summary of my understanding quickly reading which I think explains the confusion: Ident has a span and the error is understood by some to be that two different errors have the same ident (because their name and span are equivalent, but are supposed to be different), while others understand that two idents that should Eq, don't purely because of the span.
If the later is the case, then using SymbolStr instead does fix the issue.

I'm in the group that understand what you've said first.

view this post on Zulip Santiago Pastorino (Dec 03 2020 at 19:11):

my problem is with " then using SymbolStr instead does fix the issue" I'm not sure how you all are concluding this

view this post on Zulip Santiago Pastorino (Dec 03 2020 at 19:11):

given that most of you suggest that I'd guess that's correct but I have no idea where this conclusion comes from

view this post on Zulip Wesley Wiser (Dec 03 2020 at 19:14):

nikomatsakis said:

but if we passed in just the "plain text" and had a less precise comparison
that would sill solve the bug

I think this is is the part that got missed. Niko is saying that just comparing the plain text should solve the bug and that's what SymbolStr does.

view this post on Zulip Santiago Pastorino (Dec 03 2020 at 19:15):

I think the key is ... do we want these two idents to be different and be interpreted as different things by incr comp? or do we want them to be the same and be interpreted as the same thing by incr comp?

view this post on Zulip Santiago Pastorino (Dec 03 2020 at 19:15):

what's currently going on seems to me that both things are different but have the same hash

view this post on Zulip Santiago Pastorino (Dec 03 2020 at 19:15):

so it's neither of the things that would make incr comp happy

view this post on Zulip Santiago Pastorino (Dec 03 2020 at 19:16):

Santiago Pastorino said:

I think the key is ... do we want these two idents to be different and be interpreted as different things by incr comp? or do we want them to be the same and be interpreted as the same thing by incr comp?

still these two things are semantically different

view this post on Zulip Santiago Pastorino (Dec 03 2020 at 19:16):

and I'm not sure which one is right

view this post on Zulip Wesley Wiser (Dec 03 2020 at 19:16):

Yeah that's definitely the issue

view this post on Zulip Wesley Wiser (Dec 03 2020 at 19:16):

My intuition was that we should make sure they compare differently if they are different things but it sounds like that difference does not matter in this case so it's ok if we treat them as equal.

view this post on Zulip Santiago Pastorino (Dec 03 2020 at 19:18):

yeah that's right but ...

view this post on Zulip Santiago Pastorino (Dec 03 2020 at 19:18):

as I've said, should these two things be equal and have the same hash or should they be different and have different hashes?

view this post on Zulip Santiago Pastorino (Dec 03 2020 at 19:19):

what that would semantically mean in terms of incr comp would be different

view this post on Zulip Santiago Pastorino (Dec 03 2020 at 19:19):

in the sense that equals things would be reused

view this post on Zulip Santiago Pastorino (Dec 03 2020 at 19:19):

so do I want that what format!(Foo) generates be reused when format!(Bar) is called in that context?

view this post on Zulip Wesley Wiser (Dec 03 2020 at 19:26):

I think from what niko was saying, either option would work but making them equal with equal hashes via SymbolStr is probably going to be easier.

view this post on Zulip mw (Dec 04 2020 at 10:50):

I found out a bit more about how the hash mismatch happens: https://github.com/rust-lang/rust/issues/79560#issuecomment-738714717

view this post on Zulip mw (Dec 04 2020 at 10:51):

but I have too little insight into hygiene and macro expansion to propose a solution.


Last updated: Oct 21 2021 at 20:21 UTC