Stream: t-lang/wg-unsafe-code-guidelines

Topic: zero-sized dangling accesses/inbounds-offsets


RalfJ (Feb 23 2019 at 10:47, on Zulip):

Is the following code UB or not?

fn main() {
    let mut b = Box::new((((),), 4));
    let x: *mut ((),) = &mut b.0;
    drop(b);
    unsafe {
        // getelementptr inbounds with offset 0 of a dangling pointer
        let x_inner: *mut () = &mut (*x).0;
        // 0-sized access of a dangling pointer
        let _val = *x;
    }
}

https://github.com/rust-rfcs/unsafe-code-guidelines/issues/93

RalfJ (Feb 23 2019 at 10:48, on Zulip):

Notice that this better not be UB, we do this a lot in libstd:

fn main() {
    let x: *mut ((),) = 1usize as *mut _;
    unsafe {
        // getelementptr inbounds with offset 0 of a dangling pointer
        let x_inner: *mut () = &mut (*x).0;
        // 0-sized access of a dangling pointer
        let _val = *x;
    }
}

nikomatsakis (Feb 28 2019 at 16:12, on Zulip):

my sense is that a zero-sized load ought to be a "no-op" and hence never UB... is that too simplistic?

Nicole Mazzuca (Feb 28 2019 at 16:15, on Zulip):

that's kind of my understanding, except that maybe alignment should be checked

Nicole Mazzuca (Feb 28 2019 at 16:16, on Zulip):

(although now that I think about it... probably best to check it at reference creation?)

nikomatsakis (Feb 28 2019 at 16:21, on Zulip):

Hmm, yes, maybe, although don't we create a "dummy pointer" like 0x1 or something for these?

nikomatsakis (Feb 28 2019 at 16:21, on Zulip):

I forget

nikomatsakis (Feb 28 2019 at 16:21, on Zulip):

I guess maybe we should just change the dummy pointer to 0x1000 :)

nikomatsakis (Feb 28 2019 at 16:22, on Zulip):

or whatever the alignment is

rkruppe (Feb 28 2019 at 16:28, on Zulip):

I haven't heard anyone disagree that zero-sized accesses should be NOPs. The really tricky part is GEP-inbounds operations, which can happen not only in the presence of ZSTs but also when you have e.g. an empty slice of a non-ZST element type, so we can't simply side-step the LLVM questions surrounding those by not emitting LLVM IR for them

RalfJ (Feb 28 2019 at 16:30, on Zulip):

from my understanding, zero-sized accesses do have to be aligned and non-NULL. that's consistent with other 0-sized operations.
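
To make the "aligned and non-NULL" condition concrete, here is a minimal sketch of a zero-sized read through a pointer that points to no allocation at all. The helper name `read_zst_from_dangling` is made up for illustration; `NonNull::dangling()` is the standard library's way to get a non-null, well-aligned pointer without an allocation.

```rust
use std::ptr::{self, NonNull};

/// Hypothetical helper: reads a zero-sized value through a dangling pointer.
fn read_zst_from_dangling() -> ((),) {
    // Non-null and well-aligned for `((),)`, but backed by no allocation.
    let p: NonNull<((),)> = NonNull::dangling();
    // A read of size zero: no bytes are actually loaded from memory.
    unsafe { ptr::read(p.as_ptr()) }
}

fn main() {
    let v = read_zst_from_dangling();
    assert_eq!(v, ((),));
    println!("read {:?} through a dangling pointer", v);
}
```

This only covers the access itself; it says nothing about the field projection (the GEPi question), which is the harder part.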

RalfJ (Feb 28 2019 at 16:30, on Zulip):

but yes, the real question here is GEPi

rkruppe (Feb 28 2019 at 16:31, on Zulip):

Yes, using "NOP" loosely here. The dangling-ness shouldn't be an issue is what I meant

RalfJ (Feb 28 2019 at 16:31, on Zulip):

but @rkruppe I think an empty slice of non-ZST types is no problem, as you'll not GEPi into it

RalfJ (Feb 28 2019 at 16:31, on Zulip):

a non-empty slice of ZST types, however, is a problem
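
A small sketch of the case in question, assuming the usual `slice::from_raw_parts` requirements (non-null, aligned, total size in bytes within bounds): a slice of three `()` values built on a dangling pointer. Indexing it is a place projection whose byte offset is always 0, applied to a pointer that need not point into any allocation. The function names are hypothetical.

```rust
use std::ptr::NonNull;
use std::slice;

/// Hypothetical helper: a slice of `len` unit values backed by no allocation.
fn zst_slice(len: usize) -> &'static [()] {
    let p: NonNull<()> = NonNull::dangling();
    // `()` is zero-sized, so the slice spans 0 bytes regardless of `len`;
    // the pointer only has to be non-null and aligned.
    unsafe { slice::from_raw_parts(p.as_ptr(), len) }
}

/// Returns true if every element sits at the same address as the slice start,
/// i.e. every index projection has byte offset 0.
fn all_elements_share_address(s: &[()]) -> bool {
    let base = s.as_ptr();
    (0..s.len()).all(|i| std::ptr::eq(&s[i], base))
}

fn main() {
    let s = zst_slice(3);
    assert_eq!(s.len(), 3);
    assert!(all_elements_share_address(s));
}
```

By contrast, an empty slice of a non-ZST type has no valid index, so there is no element projection to worry about.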

rkruppe (Feb 28 2019 at 16:32, on Zulip):

hm, you're right, i think i mixed something up
but if ZSTs are involved, we know they're ZSTs and can modify our IR generation, can't we?

RalfJ (Feb 28 2019 at 16:33, on Zulip):

hm...

RalfJ (Feb 28 2019 at 16:34, on Zulip):

really I'd rather have LLVM change its GEPi semantics to delay the UB further: OOB arithmetic doesn't produce poison but "taints" the pointer, such that using it for an actual access is UB
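
As a rough analogy only (this is not the proposed LLVM change): Rust's existing `wrapping_add` / `wrapping_sub` on raw pointers already have a "delayed UB" flavor. The arithmetic itself is never UB, even far out of bounds; only an access through a pointer that is outside its original allocation would be. The sketch below moves a pointer out of bounds and back before reading. `round_trip` and `FAR` are names invented for this example.

```rust
/// Arbitrary large offset, chosen only to leave the 4-byte buffer.
const FAR: usize = 1 << 20;

/// Hypothetical helper: goes out of bounds and comes back using only
/// non-inbounds pointer arithmetic, then reads element 2.
fn round_trip(buf: &[u8; 4]) -> u8 {
    let p: *const u8 = buf.as_ptr();
    // Out of bounds of `buf`, but computing the pointer is allowed.
    let far = p.wrapping_add(FAR);
    // Back inside the original allocation: this is `p + 2`.
    let back = far.wrapping_sub(FAR - 2);
    // The access is in bounds of the allocation `p` was derived from.
    unsafe { *back }
}

fn main() {
    assert_eq!(round_trip(&[10, 20, 30, 40]), 30);
}
```

Dereferencing `far` directly would be UB; with `add` (the inbounds variant) computing `far` would already be UB.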

rkruppe (Feb 28 2019 at 16:34, on Zulip):

oh no I remember now, slicing an empty slice was the issue: https://github.com/rust-lang/rust/issues/54857

nagisa (Feb 28 2019 at 16:34, on Zulip):

We already generate different code in rustc in presence of ZSTs IIRC

RalfJ (Feb 28 2019 at 16:34, on Zulip):

that'd solve all problems, and to my knowledge preserves everything the alias analysis needs

RalfJ (Feb 28 2019 at 16:34, on Zulip):

but well...

RalfJ (Feb 28 2019 at 16:34, on Zulip):

> oh no I remember now, slicing an empty slice was the issue: https://github.com/rust-lang/rust/issues/54857

oh right we can always do that

nagisa (Feb 28 2019 at 16:34, on Zulip):

not because we’re avoiding UB or anything of the sort, but rather simply to avoid unnecessary IR.

RalfJ (Feb 28 2019 at 16:35, on Zulip):

well really what I'd like to know is what the rules on the MIR side are. and then we can think about how to compile that to LLVM.

RalfJ (Feb 28 2019 at 16:35, on Zulip):

and I see no good way to statically detect which MIR-level place projections might have offset 0, so that we could allow dangling pointers for just those

rkruppe (Feb 28 2019 at 16:49, on Zulip):

My unsubstantiated gut feeling is that we don't really need anything as tricky as GEPi for the optimizations that might reasonably happen at the MIR level; it's more something we want to preserve for the later stages of codegen. For those later stages, however, it's not just AA (alias analysis) that cares: LLVM passes mostly appear to care about the "no wraparound" aspect.

Nicole Mazzuca (Feb 28 2019 at 17:22, on Zulip):

@nikomatsakis we use heap::Empty<T>, which is align_of::<T>() as *mut T

Nicole Mazzuca (Feb 28 2019 at 17:22, on Zulip):

aiui
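
A minimal sketch of that "alignment as address" idea, assuming nothing about the actual libstd internals: `dummy_ptr` and `is_aligned_non_null` are hypothetical helpers written for this example. Using `align_of::<T>()` as the address gives a pointer that is non-null and aligned for any `T`, which is what the earlier remarks about alignment checks ask for.

```rust
use std::mem::align_of;
use std::ptr::NonNull;

/// Hypothetical helper: a dangling pointer whose address is `T`'s alignment.
fn dummy_ptr<T>() -> *mut T {
    align_of::<T>() as *mut T
}

/// Checks the two properties discussed in the thread: non-null and aligned.
fn is_aligned_non_null<T>(p: *mut T) -> bool {
    !p.is_null() && (p as usize) % align_of::<T>() == 0
}

fn main() {
    assert!(is_aligned_non_null(dummy_ptr::<u8>()));
    assert!(is_aligned_non_null(dummy_ptr::<u64>()));
    assert!(is_aligned_non_null(dummy_ptr::<[u32; 0]>()));
    // The standard library exposes the same kind of pointer publicly.
    assert!(is_aligned_non_null(NonNull::<u64>::dangling().as_ptr()));
}
```

A fixed address such as 0x1 would fail the alignment check for any type with alignment greater than 1, which is why the address is derived from the type.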

Last update: Nov 19 2019 at 18:35 UTC