Stream: t-compiler/wg-mir-opt

Topic: double deallocation


Alexander Regueiro (Jul 29 2019 at 15:30, on Zulip):

Any idea why transferring over the memory contents (fields of Memory bar the tcx) could lead to a "deallocated twice" miri eval error?

Alexander Regueiro (Jul 29 2019 at 15:30, on Zulip):

transferring between different InterpCxs, that is

Alexander Regueiro (Jul 29 2019 at 15:31, on Zulip):

the second InterpCx would normally eval fine, with "empty" memory

Alexander Regueiro (Jul 29 2019 at 15:31, on Zulip):

@RalfJ maybe?

Alexander Regueiro (Jul 29 2019 at 15:38, on Zulip):

maybe I need to remap AllocIds?

RalfJ (Jul 29 2019 at 15:41, on Zulip):

uh... transferring things between InterpCx is not really supported^^

RalfJ (Jul 29 2019 at 15:42, on Zulip):

"deallocated twice" means it tried to deallocate something that wasnt there, which usually can only mean it has already been deallocated before. in your case it maybe means it hasnt been copied, or so.

Alexander Regueiro (Jul 29 2019 at 15:52, on Zulip):

@RalfJ yeah, I thought so. was trying to see if I could add support though. my solution now is very naive so... anyway, I'm just wondering why the MIR for the second InterpCx is even trying to access the old memory... maybe because the new AllocId accidentally point to it?

RalfJ (Jul 29 2019 at 15:57, on Zulip):

MIR always accesses the memory of the current InterpCx

Alexander Regueiro (Jul 29 2019 at 15:58, on Zulip):

@RalfJ yeah, but I have fed the old InterpCx' Memory into the new InterpCx's (tcx aside, which is fine because there are no dependencies on the tcx AFAICT)

Alexander Regueiro (Jul 29 2019 at 15:59, on Zulip):

i.e. copied over the fields alloc_map, extra_fn_ptr_map, dead_alloc_map, extra

Alexander Regueiro (Jul 29 2019 at 15:59, on Zulip):

which is evidently too naive.

Alexander Regueiro (Jul 29 2019 at 16:59, on Zulip):

@RalfJ okay... AllocIddoesn't seem like the problem actually. I think I need to copy over tcx.alloc_map too.

Alexander Regueiro (Jul 29 2019 at 17:22, on Zulip):

what role does alloc_map play exactly?is it separate from the Memory?

Alexander Regueiro (Jul 29 2019 at 18:03, on Zulip):

I guess it's only for consts and statics...?

RalfJ (Jul 30 2019 at 07:26, on Zulip):

RalfJ okay... AllocIddoesn't seem like the problem actually. I think I need to copy over tcx.alloc_map too.

wait you didnt just create a new InterpCx, you created a new tcx?

RalfJ (Jul 30 2019 at 07:26, on Zulip):

that's... a whole different deal.^^ would have been worth mentioning that...

RalfJ (Jul 30 2019 at 07:26, on Zulip):

AFAIK literally everything in the compiler assumes there is exactly one instance of it

RalfJ (Jul 30 2019 at 07:27, on Zulip):

the global alloc_map is where statics and consts go after they have been computed (that's "interning"). it's also where fn ptr allocations go (so that a Pointer can point to a function)

oli (Jul 30 2019 at 08:51, on Zulip):

Yea you can't just keep these in memory... you need to serialize them and "refill" them afterwards

oli (Jul 30 2019 at 08:51, on Zulip):

just like incremental does it

oli (Jul 30 2019 at 08:51, on Zulip):

otherwise all kinds of things break

oli (Jul 30 2019 at 08:52, on Zulip):

most notably the Allocation field of ByRef will be a dangling reference

Alexander Regueiro (Jul 30 2019 at 16:14, on Zulip):

RalfJ okay... AllocIddoesn't seem like the problem actually. I think I need to copy over tcx.alloc_map too.

wait you didnt just create a new InterpCx, you created a new tcx?

No no, I created a new InterpCx... but I just copied over all overMemory (minus the txc) into the new instance of InterpCxz.

Alexander Regueiro (Jul 30 2019 at 16:14, on Zulip):

the global alloc_map is where statics and consts go after they have been computed (that's "interning"). it's also where fn ptr allocations go (so that a Pointer can point to a function)

the tcx's alloc_map you mean, not Memory's, right? that would make sense.

Alexander Regueiro (Jul 30 2019 at 16:16, on Zulip):

Yea you can't just keep these in memory... you need to serialize them and "refill" them afterwards

@oli Yeah I kind of worried that would be the case... makes sense. Apparently incremental will already handle the restoration of statics/consts in tcx.alloc_map, but interpret::Memory I have to handle myself... is there an existing way to handle serialiasation and deserialisation of that?

oli (Jul 30 2019 at 16:32, on Zulip):

No

oli (Jul 30 2019 at 16:35, on Zulip):

Note that you can't just serialize the memory as deserialization will duplicate all allocations to the global alloc_map

Alexander Regueiro (Jul 30 2019 at 16:35, on Zulip):

Hmm

oli (Jul 30 2019 at 16:35, on Zulip):

You literally need to do it all manually

Alexander Regueiro (Jul 30 2019 at 16:35, on Zulip):

right

oli (Jul 30 2019 at 16:36, on Zulip):

Or use serde and hack something so alloc IDs get created anew during deserialization

oli (Jul 30 2019 at 16:36, on Zulip):

But do not use rustc's serialization scheme

Alexander Regueiro (Jul 30 2019 at 16:36, on Zulip):

right

Alexander Regueiro (Jul 30 2019 at 16:37, on Zulip):

@oli Not even thee serialisation infrastructure for crate metadata or whanot?

oli (Jul 30 2019 at 16:37, on Zulip):

Hmm actually, if you don't care about memory usage that would work

Alexander Regueiro (Jul 30 2019 at 16:37, on Zulip):

(or incremental)

oli (Jul 30 2019 at 16:37, on Zulip):

Iirc miri duplicates all static memory locally on write

Alexander Regueiro (Jul 30 2019 at 16:37, on Zulip):

the serialisation format is inefficient eh?

oli (Jul 30 2019 at 16:38, on Zulip):

Inefficient? No, just fine tuned to a specific use case

Alexander Regueiro (Jul 30 2019 at 16:38, on Zulip):

okay

oli (Jul 30 2019 at 16:39, on Zulip):

What I mean is you can just dump all allocations via rust serialization and after deserialization don't deserialize into Memory

oli (Jul 30 2019 at 16:39, on Zulip):

Once miri touches memory, it will fetch the allocations from the global memory when you try writing to it

oli (Jul 30 2019 at 16:40, on Zulip):

Not sure if that works for any memory, but you can check out the copy on write scheme to see if it does what you need

oli (Jul 30 2019 at 16:40, on Zulip):

That will duplicate any memory you modify in a single run

Alexander Regueiro (Jul 30 2019 at 16:40, on Zulip):

I see

oli (Jul 30 2019 at 16:40, on Zulip):

Should be fairly minimal for a repl though, right?

Alexander Regueiro (Jul 30 2019 at 16:41, on Zulip):

in theory yep

Alexander Regueiro (Jul 30 2019 at 16:42, on Zulip):

when you say "don't deserialise into Memory"... you mean only deserialise into the gcx alloc_map?

Alexander Regueiro (Jul 30 2019 at 16:42, on Zulip):

and Miri handles the rest?

oli (Jul 30 2019 at 16:47, on Zulip):

Just deserialize the alloc ids

oli (Jul 30 2019 at 16:47, on Zulip):

Rustc does the rest

Alexander Regueiro (Jul 30 2019 at 16:47, on Zulip):

Oh, interesting

oli (Jul 30 2019 at 16:48, on Zulip):

Actually

oli (Jul 30 2019 at 16:48, on Zulip):

Ignore memory

oli (Jul 30 2019 at 16:48, on Zulip):

Just serialize your stack

oli (Jul 30 2019 at 16:48, on Zulip):

It contains everythjng

Alexander Regueiro (Jul 30 2019 at 16:51, on Zulip):

@oli what does the stack contain that I need, that Memory doesn't contain?

oli (Jul 30 2019 at 16:55, on Zulip):

AllocIds

oli (Jul 30 2019 at 16:55, on Zulip):

And only live ones

oli (Jul 30 2019 at 16:55, on Zulip):

Leaked memory won't get serialized

Alexander Regueiro (Jul 30 2019 at 16:55, on Zulip):

as for serialising/deserialising AllocIds, isn't the issue that a) they can change between compilation sessions, b) the gcx alloc_map may not store the allocations for all miri allocs, only statics/consts, c) inc. comp. may not serialise all over the gcx alloc_map even if everything I need is there?

Alexander Regueiro (Jul 30 2019 at 16:55, on Zulip):

oh

oli (Jul 30 2019 at 16:55, on Zulip):

Don't worry, the serializer knows what it's doing

oli (Jul 30 2019 at 16:56, on Zulip):

It doesn't actually serialize alloc ids

oli (Jul 30 2019 at 16:56, on Zulip):

It just serializes the backing allocation or function reference

Alexander Regueiro (Jul 30 2019 at 16:56, on Zulip):

I see

Alexander Regueiro (Jul 30 2019 at 16:57, on Zulip):

Excuse my ignorance... which serialiser is this? the inc. comp. one?

Alexander Regueiro (Jul 30 2019 at 16:58, on Zulip):

(which I guess I'd have to use independently from inc. comp.)

Alexander Regueiro (Jul 30 2019 at 16:58, on Zulip):

I've done this already for types, so it should be okay

Alexander Regueiro (Jul 30 2019 at 18:10, on Zulip):

^ @oli (in case you didn't see, sorry)

oli (Jul 30 2019 at 18:52, on Zulip):

Metadata or incremental, doesn't matter

oli (Jul 30 2019 at 18:52, on Zulip):

Both use the same logic

Alexander Regueiro (Jul 30 2019 at 21:33, on Zulip):

Yeah, that was my impression too from previous high-level reading

Alexander Regueiro (Jul 30 2019 at 21:42, on Zulip):

@oli just serialising/deserialising the AllocIds will be a problem though, if gcx.alloc_map doesn't contain every miri allocation already, and that gets saved in the inc. comp. cache?

oli (Jul 31 2019 at 06:02, on Zulip):

Oh right

oli (Jul 31 2019 at 06:03, on Zulip):

Nevermind then

RalfJ (Jul 31 2019 at 10:15, on Zulip):

the global alloc_map is where statics and consts go after they have been computed (that's "interning"). it's also where fn ptr allocations go (so that a Pointer can point to a function)

the tcx's alloc_map you mean, not Memory's, right? that would make sense.

but what is the new InterpCx's tcx? If it is not the same as the old one's that means you got two tcx, which means you had to create a second one of those as well^^. And that's an extremely unsupported situation, to my knowledge.

oli (Jul 31 2019 at 10:59, on Zulip):

If you use rustc's serialization and deserialization then everything will work out

oli (Jul 31 2019 at 11:00, on Zulip):

It won't serialize the tcx, and during deserialization it'll refill it with the actual one

Alexander Regueiro (Jul 31 2019 at 15:12, on Zulip):

@RalfJ @oli Yeah, I thought it was unsupported too... but what Oli says seems reassuring. So, to summarise and make sure I understood this right:
1. Serialise every (relevant) Allocation in machine.memory().alloc_map, by transitively walking it (like dump_allocs). Use the encoding for inc. comp. / metadata.
2. Serialise the AllocIds for live locals (or even the entire stack) using the same method as above basically.
3. With the subsequent InterpCx & tcz, deserialise the above manually, making calls to allocate_with` (I guess).

Alexander Regueiro (Jul 31 2019 at 15:12, on Zulip):

hopefully I'm not too off-track.

oli (Jul 31 2019 at 15:29, on Zulip):

No

oli (Jul 31 2019 at 15:29, on Zulip):

Do not manually serialize any allocation

oli (Jul 31 2019 at 15:29, on Zulip):

Just serialize the stack

oli (Jul 31 2019 at 15:30, on Zulip):

That will serialize the live allocations

oli (Jul 31 2019 at 15:31, on Zulip):

But as I said earlier, check whether memory supports cow for any alloc

Alexander Regueiro (Jul 31 2019 at 19:20, on Zulip):

Just serialize the stack

Not everything is RustcEncodable in the stack, but I could maybe make it so...?

Alexander Regueiro (Jul 31 2019 at 19:21, on Zulip):

@oli I wouldn't know how to check for COW to be honest...

oli (Jul 31 2019 at 19:21, on Zulip):

Just serialize the stack

Not everything is RustcEncodable in the stack, but I could maybe make it so...?

yes you'd need to do that

oli (Jul 31 2019 at 19:22, on Zulip):

oli I wouldn't know how to check for COW to be honest...

check what happens when get_mut can't find an alloc

Alexander Regueiro (Jul 31 2019 at 19:23, on Zulip):

aha

Alexander Regueiro (Jul 31 2019 at 19:25, on Zulip):

@oli it looks for a static (i.e. global tcx) allocation in such a case.

oli (Jul 31 2019 at 19:27, on Zulip):

then it should all work out

Alexander Regueiro (Jul 31 2019 at 19:32, on Zulip):

@oli cool. because the global tcx stores all miri allocations and gets serialised to incremental already, I take it?

RalfJ (Jul 31 2019 at 19:34, on Zulip):

the global tcx only stores those allocations that end up in the final value of a static/const

RalfJ (Jul 31 2019 at 19:35, on Zulip):

temporary allocations that were created during CTFE (when computing said initial value) are not interned into the global tcx

oli (Jul 31 2019 at 19:37, on Zulip):

yea, but if you serialize the stack, everything ends up in the global tcx

Alexander Regueiro (Jul 31 2019 at 19:37, on Zulip):

ah, clever

Alexander Regueiro (Jul 31 2019 at 19:38, on Zulip):

@oli even if I'm serialising it to a separate store (not the inc. comp. cache)?

oli (Jul 31 2019 at 19:39, on Zulip):

yea, that doesn't change anything

Alexander Regueiro (Jul 31 2019 at 19:41, on Zulip):

super. will give that a try now. hopefully implementing RustcEncodable for those types will be as simple as adding derive's. thanks for the advice!

Alexander Regueiro (Aug 01 2019 at 03:09, on Zulip):

@oli Ugh, does the RustcEncodable proc macro not handle default types properly? it seems to be only implementing e.g. pub enum Operand<Tag = (), Id = AllocId> for Operand with its default type params, not generically.

Alexander Regueiro (Aug 01 2019 at 03:24, on Zulip):

I mention this to you because I saw you made at least one or two PRs to that proc macro in the past... but let me know if there's someone better to handle this. :-)

oli (Aug 01 2019 at 07:20, on Zulip):

I have no idea, sorry

Alexander Regueiro (Aug 01 2019 at 14:05, on Zulip):

no prob

Alexander Regueiro (Aug 03 2019 at 15:48, on Zulip):

@RalfJ I think you worked on this feature, so... is there any diff between mem::uninitialized and MaybeUninit::uninit().assume_init() presently?

RalfJ (Aug 03 2019 at 15:48, on Zulip):

no, not really. both are insta-UB or at least heavily unspecified for the vast majority of types.

RalfJ (Aug 03 2019 at 15:49, on Zulip):

see the assume_init docs

Alexander Regueiro (Aug 03 2019 at 15:50, on Zulip):

okay ta

Alexander Regueiro (Aug 03 2019 at 15:51, on Zulip):

@RalfJ would you tend to prefer one or the other?

centril (Aug 03 2019 at 15:52, on Zulip):

@Alexander Regueiro

pub unsafe fn uninitialized<T>() -> T {
    MaybeUninit::uninit().assume_init()
}
centril (Aug 03 2019 at 15:52, on Zulip):

literally

Alexander Regueiro (Aug 03 2019 at 15:52, on Zulip):

oh haha

centril (Aug 03 2019 at 15:52, on Zulip):

@Alexander Regueiro don't use uninitialized; it's getting deprecated

Alexander Regueiro (Aug 03 2019 at 15:52, on Zulip):

oh

Alexander Regueiro (Aug 03 2019 at 15:52, on Zulip):

I see

Alexander Regueiro (Aug 03 2019 at 15:52, on Zulip):

just because it's less explicit?

centril (Aug 03 2019 at 15:53, on Zulip):

in almost all cases you want to ::uninit() first, then do the initialization and finally .assume_init()

Alexander Regueiro (Aug 03 2019 at 15:54, on Zulip):

yeah, but this is for my REPL, where I'm doing crazy stuff internally ;-)

centril (Aug 03 2019 at 15:54, on Zulip):

MaybeUninit::uninit().assume_init() is sound and useful more or less for [MaybeUninit<T>; N]

centril (Aug 03 2019 at 15:54, on Zulip):

(except that case is also going away in favor of language improvements)

RalfJ (Aug 03 2019 at 18:43, on Zulip):

yeah, but this is for my REPL, where I'm doing crazy stuff internally ;-)

crazy stuff is okay. Undefined Behavior is not.

RalfJ (Aug 03 2019 at 18:43, on Zulip):

RalfJ would you tend to prefer one or the other?

both should not be used. again, see the assume_init docs. unless your type does not require initialziation, you might as well deref a NULL pointer -- same game.

Tom Phinney (Aug 03 2019 at 18:47, on Zulip):

To state it more bluntly, don't give the compiler free license to scramble your program. Their are better sources of random, unpredictable behavior.

Alexander Regueiro (Aug 03 2019 at 22:49, on Zulip):

@RalfJ yeah it won't be UB, because this is just to get the code to compile. when it gets interpreted, the machine will initialize the locals appropriately

RalfJ (Aug 04 2019 at 09:12, on Zulip):

@Alexander Regueiro I don't know what you mean... if that code is ever run, there is UB. or is this code you are, like, auto-generating and only running in the interpreter or so...?

Alexander Regueiro (Aug 04 2019 at 16:59, on Zulip):

Exactly. Generating this automatically, and it's only ever being run in a controlled interpreter environment.

Alexander Regueiro (Aug 04 2019 at 19:15, on Zulip):

@oli @RalfJ okay, so the error I get when deserialising the frame is on the line let pos = self.state.data_offsets[idx] as usize; in AllocDecodingSession::decode_alloc_id

Alexander Regueiro (Aug 04 2019 at 19:16, on Zulip):

kind of makes sense since data_offsets is initialised to an empty Vec

Alexander Regueiro (Aug 04 2019 at 22:46, on Zulip):

also, if I try to actually encode the AllocId like CacheEncoder does (using specialized_encode_alloc_id), then I get the panic 'no value for given alloc ID' since the ID isn't in the global tcx alloc map

Alexander Regueiro (Aug 04 2019 at 22:56, on Zulip):

maybe it needs to fall back to the InterpCx's Memory in that case somehow? hmm

Alexander Regueiro (Aug 06 2019 at 15:26, on Zulip):

no ideas guys? no worries if so, I'll keep experimenting. :-)

RalfJ (Aug 06 2019 at 16:01, on Zulip):

sorry this is far outside anything crazy I tried so far

Alexander Regueiro (Aug 06 2019 at 16:06, on Zulip):

hah fair enoiugh.

Alexander Regueiro (Aug 06 2019 at 18:51, on Zulip):

@RalfJ the issue seems to be that the global tcx alloc_map doesn't contain all the AllocIds for thee interpreter... is this right?

Alexander Regueiro (Aug 06 2019 at 18:51, on Zulip):

and if so, is there any way around this?

RalfJ (Aug 06 2019 at 18:53, on Zulip):

quoting myself from a few days ago:

the global tcx only stores those allocations that end up in the final value of a static/const

RalfJ (Aug 06 2019 at 18:54, on Zulip):

and if so, is there any way around this?

depends. for CTFE you could hack things to just always use the global allocator. it would kill rustc memory usage but do you care?^^
for Miri, the allocations in tcx have the wrong type. they cannot store Stacked Borrows state.

Alexander Regueiro (Aug 06 2019 at 22:37, on Zulip):

@RalfJ aha. yeah, that's what I thought... maybe @oli's suggested approach won't work then. let me try to hack around it though.

Alexander Regueiro (Aug 06 2019 at 22:37, on Zulip):

ta

Alexander Regueiro (Aug 06 2019 at 23:39, on Zulip):

yeah... the problem with the infrastructure around AllocId is that it's designed to only work with tcx allocs

Alexander Regueiro (Aug 06 2019 at 23:39, on Zulip):

thus only static stuff, no stacked borrows, etc.

Alexander Regueiro (Aug 06 2019 at 23:53, on Zulip):

perhaps if I extend AllocDecodingSession to work with interpret::Memory...

Last update: Nov 17 2019 at 07:00UTC