Stream: t-lang/wg-unsafe-code-guidelines

Topic: mutating non-mutable statics: UB?


Gankra (Aug 14 2019 at 13:20, on Zulip):

static FOO: u32 = 0;
let ptr = &FOO as *const u32 as *mut u32;
unsafe { *ptr += 1; }

is this distinctly UB, or does this just fall out of the rules for references? Or do we guarantee a safe crash, relying on the static being stuffed in rodata so that mutations cause page faults (in the same way that we rely on the stack's guard page to guarantee stack overflows are safe)?

Gankra (Aug 14 2019 at 13:20, on Zulip):

@RalfJ ^

gnzlbg (Aug 14 2019 at 13:29, on Zulip):

you are probably invoking UB before actually writing to the read-only memory

gnzlbg (Aug 14 2019 at 13:29, on Zulip):

by doing a write through a pointer derived from a &T

Gankra (Aug 14 2019 at 13:30, on Zulip):

if we get &raw that won't be necessary anymore

Gankra (Aug 14 2019 at 13:31, on Zulip):

naively it should be easy/fine to just guarantee an abort, just like with stackoverflow, but I don't know enough about linkers and exotic platforms to know if there's nightmares. Or if llvm makes it too hard to say "no really this is defined it's fine"

gnzlbg (Aug 14 2019 at 13:32, on Zulip):

writing to read-only memory is UB in the abstract machine

Gankra (Aug 14 2019 at 13:32, on Zulip):

is that written anywhere?

Gankra (Aug 14 2019 at 13:33, on Zulip):

also what is our definition of read-only memory. I do not expect the abstract machine to have a notion of like, OS pages. Does it literally only arise from immutable statics?

gnzlbg (Aug 14 2019 at 13:33, on Zulip):

in the reference

gnzlbg (Aug 14 2019 at 13:33, on Zulip):

read-only memory is memory that can only be read from

gnzlbg (Aug 14 2019 at 13:33, on Zulip):

it doesn't really matter whether it is put on read-write memory on the actual hardware or not

gnzlbg (Aug 14 2019 at 13:33, on Zulip):

it is read-only in the abstract machine

gnzlbg (Aug 14 2019 at 13:34, on Zulip):

if you want to avoid that, you can use interior mutability

Gankra (Aug 14 2019 at 13:34, on Zulip):

right so how does one declare readonly memory?

Gankra (Aug 14 2019 at 13:34, on Zulip):

just statics?

Gankra (Aug 14 2019 at 13:34, on Zulip):

declare/acquire

gnzlbg (Aug 14 2019 at 13:34, on Zulip):

static FOO: T; where T does not use interior mutability

Gankra (Aug 14 2019 at 13:34, on Zulip):

can you answer my question directly yes/no?

gnzlbg (Aug 14 2019 at 13:35, on Zulip):

i did answer it, it depends on the value that you put in the static

gnzlbg (Aug 14 2019 at 13:35, on Zulip):

static FOO: AtomicU32; can be mutated, so it cannot be in read-only memory, but static BAR: u32 cannot be mutated, so it can be
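A minimal sketch of the distinction gnzlbg draws here, assuming nothing beyond the standard library. The helper names `bump_foo` and `read_bar` are hypothetical and exist only for illustration.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// No interior mutability: the compiler is allowed (not required) to place
// this in read-only memory, and it may never be written to.
static BAR: u32 = 0;

// AtomicU32 has interior mutability, so it can be mutated through a shared
// reference and therefore cannot live in read-only memory.
static FOO: AtomicU32 = AtomicU32::new(0);

/// Increments FOO and returns the new value. No `unsafe` is needed.
/// (Hypothetical helper, for illustration only.)
fn bump_foo() -> u32 {
    FOO.fetch_add(1, Ordering::Relaxed) + 1
}

/// BAR can only ever be read. (Hypothetical helper, for illustration only.)
fn read_bar() -> u32 {
    BAR
}
```

Calling `bump_foo()` repeatedly yields increasing values, while `read_bar()` always returns the initializer.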

Gankra (Aug 14 2019 at 13:35, on Zulip):

i am asking if there's any way other than statics to get immutable memory

gnzlbg (Aug 14 2019 at 13:35, on Zulip):

probably

gnzlbg (Aug 14 2019 at 13:36, on Zulip):

you can ask the OS for some pages, write to them, and make them read only

Gankra (Aug 14 2019 at 13:36, on Zulip):

so OS pages are part of the abstract machine?

gnzlbg (Aug 14 2019 at 13:36, on Zulip):

no

gnzlbg (Aug 14 2019 at 13:36, on Zulip):

or not yet

Gankra (Aug 14 2019 at 13:36, on Zulip):

how does the model know/care that pages are readonly, then?

gnzlbg (Aug 14 2019 at 13:36, on Zulip):

it doesn't know about that

gnzlbg (Aug 14 2019 at 13:37, on Zulip):

so the abstract machine assumes they are all writable

gnzlbg (Aug 14 2019 at 13:37, on Zulip):

only for certain statics and other things can it assume that you are never writing to the memory

gnzlbg (Aug 14 2019 at 13:37, on Zulip):

e.g. let x = 0_i32 puts an i32 in memory that cannot be written to

gnzlbg (Aug 14 2019 at 13:38, on Zulip):

in the abstract machine

gnzlbg (Aug 14 2019 at 13:39, on Zulip):

what happens in actual hardware doesn't really matter

gnzlbg (Aug 14 2019 at 13:39, on Zulip):

rust will optimize under the assumption that you don't mutate immutable memory

centril (Aug 14 2019 at 13:39, on Zulip):

@Gankra for statics there's a Freeze trait in the compiler

Gankra (Aug 14 2019 at 13:39, on Zulip):

ok so if i memmap in a readonly page and write to it, is that "safe" or...?

gnzlbg (Aug 14 2019 at 13:40, on Zulip):

depends on what mmap says about doing that

gnzlbg (Aug 14 2019 at 13:41, on Zulip):

it's an FFI function, with its own semantics, and it tells you what's safe or not

gnzlbg (Aug 14 2019 at 13:41, on Zulip):

if it tells you that you can do that, then you can do it

centril (Aug 14 2019 at 13:41, on Zulip):

@Gankra https://github.com/rust-lang/rust/blob/d19a359444295bab01de7ff44a9d72302e573bc9/src/libcore/marker.rs#L592-L604

gnzlbg (Aug 14 2019 at 13:42, on Zulip):

types that implement Freeze _can_ be put in read-only memory when used in statics

Gankra (Aug 14 2019 at 13:42, on Zulip):

yes that's not interesting and i understand

gnzlbg (Aug 14 2019 at 13:43, on Zulip):

whether they will actually be put in read-only memory is something that's not guaranteed, it would depend on the target (does it support that?)

gnzlbg (Aug 14 2019 at 13:43, on Zulip):

but from Rust's point of view, you cannot mutate them either way

centril (Aug 14 2019 at 13:44, on Zulip):

(I think we should consider exposing Freeze)

Gankra (Aug 14 2019 at 13:44, on Zulip):

in case it wasn't clear this is Gankro i just changed my name, i am working on the page in the rustonomicon enumerating all the distinct classes of UB

Gankra (Aug 14 2019 at 13:45, on Zulip):

(feeling like I'm being talked down to...)

gnzlbg (Aug 14 2019 at 13:45, on Zulip):

(I think we should consider exposing Freeze)

@centril what problem would that solve that can't be solved with UnsafeCell<T> ?

centril (Aug 14 2019 at 13:47, on Zulip):

@gnzlbg let's start a new thread about that

gnzlbg (Aug 14 2019 at 13:50, on Zulip):

@Gankra I'd just mention writing to immutable memory as the form of UB

gnzlbg (Aug 14 2019 at 13:51, on Zulip):

or maybe through an immutable binding

Gankra (Aug 14 2019 at 13:51, on Zulip):

i can't do that without a clear definition of what "immutable memory" means. specifically this can very easily cause misconceptions around memmap and what the language supports wrt faults

gnzlbg (Aug 14 2019 at 13:52, on Zulip):

memmap and faults are properties of a target/platform

gnzlbg (Aug 14 2019 at 13:52, on Zulip):

i don't think we guarantee anywhere that if you write to read-only memory you get a diagnostic

gnzlbg (Aug 14 2019 at 13:52, on Zulip):

in the same way that we can't guarantee that for, e.g., if you overflow the stack

centril (Aug 14 2019 at 13:52, on Zulip):

@Gankra I'd say it's better to be conservative and say that more things are not allowed than might be for now

gnzlbg (Aug 14 2019 at 13:53, on Zulip):

as far as i can tell, independently of the different ways that there are of creating immutable memory, all of them let you only access it from &T

Gankra (Aug 14 2019 at 13:53, on Zulip):

i agree and that's usually the philosophy of the nomicon, but this is a case where there are very real systems programming patterns at stake here, and not just weird trivia

gnzlbg (Aug 14 2019 at 13:53, on Zulip):

so the real UB is writing through an &T

Gankra (Aug 14 2019 at 13:53, on Zulip):

as a concrete example, someone was adding asserts that we don't mutate the empty singleton, and I said they shouldn't bother because it will fault

Gankra (Aug 14 2019 at 13:54, on Zulip):

(empty hashmap singleton)

gnzlbg (Aug 14 2019 at 13:54, on Zulip):

e.g. let mut x = 0; let x = x; unsafe { *(&x as *const i32 as *mut i32) = 3; } is UB, it doesn't really matter whether the x is in rw or ro memory

gnzlbg (Aug 14 2019 at 13:54, on Zulip):

the case of using raw is really worth addressing though

Gankra (Aug 14 2019 at 13:54, on Zulip):

yeah that falls out of the pointer aliasing rules, statics with &raw introduce a distinct class of UB

gnzlbg (Aug 14 2019 at 13:54, on Zulip):

yep

gnzlbg (Aug 14 2019 at 13:55, on Zulip):

I don't think anybody has mentioned this in the &raw RFC

gnzlbg (Aug 14 2019 at 13:55, on Zulip):

without &raw we don't really need to talk about ro or rw memory

gnzlbg (Aug 14 2019 at 13:55, on Zulip):

only say that writes through &T are not allowed unless T uses interior mutability

gnzlbg (Aug 14 2019 at 13:57, on Zulip):

(and as long as there aren't any data-races, etc.)
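As a rough illustration of the "unless T uses interior mutability" exception, here is a sketch using `std::cell::Cell`, which is built on `UnsafeCell`. The function names are hypothetical.

```rust
use std::cell::Cell;

/// Writes through a shared reference. This is allowed because Cell<T>
/// wraps UnsafeCell<T>. (Hypothetical helper, for illustration only.)
fn set_through_shared(slot: &Cell<i32>, value: i32) {
    slot.set(value);
}

/// Two shared references to the same Cell are alive at once; a write
/// through one is visible through the other.
fn shared_write_demo() -> i32 {
    let x = Cell::new(0);
    let a = &x;
    let b = &x;
    set_through_shared(a, 3);
    b.get()
}
```

The same write through a plain `&i32` would need an `unsafe` pointer cast and would be UB, which is the case discussed above.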

gnzlbg (Aug 14 2019 at 13:57, on Zulip):

with &raw you can create a raw pointer directly to a memory location without going through a &T and raw pointers don't really have these rules

gnzlbg (Aug 14 2019 at 13:57, on Zulip):

idk, feels like RFC material

centril (Aug 14 2019 at 13:58, on Zulip):

@gnzlbg it's probably best noted on the rfc

RalfJ (Aug 14 2019 at 16:08, on Zulip):

i am asking if there's any way other than statics to get immutable memory

you could count const promotion, but that's basically anonymous statics

RalfJ (Aug 14 2019 at 16:10, on Zulip):

as a concrete example, someone was adding asserts that we don't mutate the empty singleton, and I said they shouldn't bother because it will fault

we certainly don't guarantee a fault when writing to immutable memory

RalfJ (Aug 14 2019 at 16:10, on Zulip):

I'd say kernel allocations / mmap / ... are outside of what the abstract machine can talk about. but we can add a notion of "read-only allocation" to the abstract machine. Miri has that in fact.

RalfJ (Aug 14 2019 at 16:11, on Zulip):

that said, this notion is not needed with stacked borrows: the original code up there has UB because it writes through a (pointer derived from) a shared reference

RalfJ (Aug 14 2019 at 16:11, on Zulip):

&raw doesn't change that either: like as *const T, it too produces a read-only pointer (except for interior mutability of course)

RalfJ (Aug 14 2019 at 16:13, on Zulip):

Also see https://github.com/rust-lang/rust/issues/56604 for the fact that right now, as *mut T and as *const T as *mut T are not the same

RalfJ (Aug 14 2019 at 16:13, on Zulip):

@Gankra ^

Gankra (Aug 14 2019 at 16:45, on Zulip):

ah ok, so the distinction of &raw and &raw mut is actually important and not just a jank artifact of *mut and *const

Gankra (Aug 14 2019 at 16:46, on Zulip):

(haven't actually read through the &raw proposal properly, mostly just assumed its content is obvious, since it's just there to fill a weird semantic hole we have)

RalfJ (Aug 14 2019 at 16:46, on Zulip):

ah ok, so the distinction of &raw and &raw mut is actually important and not just a jank artifact of *mut and *const

for Stacked Borrows right now it is. some would call that a bug.

RalfJ (Aug 14 2019 at 16:47, on Zulip):

but maybe it's actually reasonable.

Gankra (Aug 14 2019 at 16:47, on Zulip):

that does certainly mess up the counter-proposal i was mulling

RalfJ (Aug 14 2019 at 16:47, on Zulip):

well, "right now" isn't fair as &[mut] raw is not implemented -- but there is a difference between x as *const T and x as *mut T, and I think the same would happen with the raw-ref operator

Gankra (Aug 14 2019 at 16:48, on Zulip):

huh? really?

Gankra (Aug 14 2019 at 16:48, on Zulip):

i agree that that feels like a bug at first blush :)

RalfJ (Aug 14 2019 at 16:48, on Zulip):

the borrow checker treats them differently

Gankra (Aug 14 2019 at 16:48, on Zulip):

x is &mut T here?

RalfJ (Aug 14 2019 at 16:48, on Zulip):

yes

RalfJ (Aug 14 2019 at 16:49, on Zulip):

so to make Stacked Borrows, as it was designed to be, a dynamic version of the borrow checker, I saw no choice but to also treat them differently in Stacked Borrows. it is rather nice in terms of model simplicity, but it looks strange when seen from the surface language.

Gankra (Aug 14 2019 at 16:49, on Zulip):

so basically as *const T acts as if it implicitly also contained as &T as a prefix?

RalfJ (Aug 14 2019 at 16:49, on Zulip):

see https://github.com/rust-lang/rust/issues/56604#issuecomment-477954315 for all the details, seems silly for me to type that all again^^

RalfJ (Aug 14 2019 at 16:49, on Zulip):

so basically as *const T acts as if it implicitly also contained as &T as a prefix?

yes

RalfJ (Aug 14 2019 at 16:50, on Zulip):

for borrow checking and Stacked Borrows alike

Gankra (Aug 14 2019 at 16:50, on Zulip):

hmm, so that definitely doesn't inherently motivate that behaviour for &raw

Gankra (Aug 14 2019 at 16:50, on Zulip):

it suggests it's worth considering though

Gankra (Aug 14 2019 at 16:51, on Zulip):

but i would probably need to spend several days familiarizing myself with the currently proposed rules for what you can do with a raw pointer :s

RalfJ (Aug 14 2019 at 16:52, on Zulip):

from a Stacked Borrows lens it does motivate that quite directly ;)

RalfJ (Aug 14 2019 at 16:52, on Zulip):

the action of a "raw const reborrow" already exists there, and is used for as *const T casts

RalfJ (Aug 14 2019 at 16:52, on Zulip):

I'd just use that same action for &raw

RalfJ (Aug 14 2019 at 16:52, on Zulip):

and, specifically, that action creates read-only pointers (modulo UnsafeCell)

Gankra (Aug 14 2019 at 16:52, on Zulip):

ugh, what an ugly language wart

RalfJ (Aug 14 2019 at 16:53, on Zulip):

it's actually less ugly in some sense than what I did before when I treated them the same^^

RalfJ (Aug 14 2019 at 16:53, on Zulip):

but i would probably need to spend several days familiarizing myself with the currently proposed rules for what you can do with a raw pointer :s

organizational question: that's mostly orthogonal to what my PR does, right?

RalfJ (Aug 14 2019 at 16:53, on Zulip):

like, &raw isn't going to be stable for quite a while

Gankra (Aug 14 2019 at 16:54, on Zulip):

yeah we can ignore the question for now

RalfJ (Aug 14 2019 at 16:54, on Zulip):

and given the current level of details on aliasing rules in there, I think we can fudge this together with that

Gankra (Aug 14 2019 at 16:54, on Zulip):

although, it does seem quite important to the raw RFC

RalfJ (Aug 14 2019 at 16:54, on Zulip):

we can always refine this later

RalfJ (Aug 14 2019 at 16:54, on Zulip):

although, it does seem quite important to the raw RFC

the problem being that the aliasing stuff is so far away from RFC-ready...

RalfJ (Aug 14 2019 at 16:55, on Zulip):

I think the question about &raw and about casts should have consistent answers anyway

RalfJ (Aug 14 2019 at 16:55, on Zulip):

(in fact I will lobby to remove ref-to-raw casts from MIR and desugar them to raw reborrows)

RalfJ (Aug 14 2019 at 16:55, on Zulip):

so really this was quite important when as *const T was introduced to the language, the question doesn't change much with &raw IMO

Gankra (Aug 14 2019 at 17:00, on Zulip):

@RalfJ i'm not sure I agree with your assessment in the linked comment that we need to appeal to the cast/coercion being performed. One can understand it as a lint, like the rest of *mut vs *const. Is there anything wrong with proposing a model where a pointer only has permissions equivalent to the reference it was derived from? In this case yes we are deriving from an &mut, but it's not a "real" &mut, it's one in "shared reference" mode. does that make sense?

Gankra (Aug 14 2019 at 17:04, on Zulip):

(full disclosure, I intend in the fullness of time to push for a third *T type (~equivalent to today's ptr::NonNull), with the intent that everyone's code should work fine using it, so it's a bit distressing to find a case where *const vs *mut is treated as meaningful)

RalfJ (Aug 14 2019 at 17:56, on Zulip):

Is there anything wrong with proposing a model where a pointer only has permissions equivalent to the reference it was derived from?

let x = &mut 0;
let shared = &*x;
let y: *const i32 = x;
let _val = *shared;
unsafe { *(y as *mut i32) = 1; }

is this UB?
this code stops compiling if you make y: *mut T. and making this code fine as-written would require significant re-architecting of stacked borrows. not saying that says we shouldn't do it, just saying that there is some way in which this is inherently more complicated than everything else stacked borrows does.

RalfJ (Aug 14 2019 at 17:57, on Zulip):

@Gankra "new permissions depend on old permission" is kind of complicated, in the sense that "old permission" can differ per-location. it also makes code harder to reason about, both for users and compilers.

RalfJ (Aug 14 2019 at 17:58, on Zulip):

but yes, this is what Stacked Borrows 1.0 did

RalfJ (Aug 14 2019 at 17:58, on Zulip):

but when Stacked Borrows 2 did more precise tracking for shared references, that didn't work any more

Gankra (Aug 14 2019 at 18:15, on Zulip):

I agree the const/mut distinction here is a useful way for you to write code that doesn't compile if you fuck it up, as I agree it can be difficult to reason about when shared loans expire. It is not however clear to me that we actually need the cast to be the thing that specifically acquires a particular permission, instead of the permission just getting "snapshot" at the point where the cast occurs. That said, I am starting to think that this doesn't make sense, as we now have non-lexical borrows, and so permission is explicitly driven by use. It could be that as *mut vs as *const specifically needs to be significant for purpose of extending borrow liveness. Not sure.

RalfJ (Aug 14 2019 at 18:17, on Zulip):

hm. interesting thought.

Gankra (Aug 14 2019 at 18:18, on Zulip):

This might also still actually jibe okayish with my pointer unification proposal, if I make very precise tweaks to it.

Gankra (Aug 14 2019 at 18:21, on Zulip):

hmm, no you can't reasonably have const/mut decay to a unified *T as a coercion (as if they only really exist for the purposes of casts), because that would be asserting non-nullness as a coercion

centril (Aug 14 2019 at 18:22, on Zulip):

would have to be unsafe { ... }

Gankra (Aug 14 2019 at 18:23, on Zulip):

too easy to miss in a swamp of unsafe code

centril (Aug 14 2019 at 18:24, on Zulip):

that's why I think unsafe blocks should be as narrowly scoped as possible :)

Gankra (Aug 14 2019 at 18:24, on Zulip):

(also just to be 100% clear on the example I gave about removing assertions on the assumption that mutating the static will crash: they were just intended as debug assertions to catch bugs more immediately)

Gankra (Aug 14 2019 at 18:31, on Zulip):

@RalfJ can anything go wrong if you accidentally have too much raw pointer permission? Does there exist a program under your model that is totally sound with as *const, but UB with as *mut, assuming literally every other line of code is the same? I should think not, right?

Gankra (Aug 14 2019 at 18:32, on Zulip):

unlike &mut vs &, it's just about a strict increase in permission, and not a trade of one permission for another? (shared ^ mut for references)

RalfJ (Aug 14 2019 at 18:46, on Zulip):

that's why I think unsafe blocks should be as narrowly scoped as possible :)

https://github.com/rust-lang/rfcs/pull/2585

RalfJ (Aug 14 2019 at 18:47, on Zulip):

RalfJ can anything go wrong if you accidentally have too much raw pointer permission? Does there exist a program under your model that is totally sound with as *const, but UB with as *mut, assuming literally every other line of code is the same? I should think not, right?

I would think not. this should be monotone.
but this is subtle enough that I won't say anything definite without ~~my lawyer~~ a proof

centril (Aug 14 2019 at 18:48, on Zulip):

https://github.com/rust-lang/rfcs/pull/2585

(Yeah, I'm pro, but I'd also like https://github.com/Centril/rfcs/pull/17 in that case)

RalfJ (Aug 14 2019 at 18:49, on Zulip):

unlike &mut vs &, it's just about a strict increase in permission, and not a trade of one permission for another? (shared ^ mut for references)

yeah. *mut is always SharedReadWrite; *const is either SharedReadWrite or SharedReadOnly depending on UnsafeCell
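For contrast with the UB example at the top of the thread, here is a sketch of the well-defined direction: a raw pointer derived from a unique borrow (or directly from a mutable place) may be written through. It assumes `std::ptr::addr_of_mut!`, the macro form of the raw-borrow operator that was stabilized after this conversation; the function names are hypothetical.

```rust
use std::ptr;

/// Derives a *mut from an &mut via a cast and writes through it.
fn write_via_cast() -> i32 {
    let mut x = 0;
    let p = &mut x as *mut i32; // raw pointer derived from a unique reference
    unsafe {
        *p = 1;
    }
    x
}

/// Creates a *mut directly from the place, without an intermediate reference.
fn write_via_addr_of_mut() -> i32 {
    let mut x = 0;
    let p = ptr::addr_of_mut!(x);
    unsafe {
        p.write(2);
    }
    x
}
```

Neither function goes through a shared reference, so neither runs into the read-only permission described above.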

gnzlbg (Aug 14 2019 at 19:39, on Zulip):

so IIUC a usize as *const T as *mut T that writes would be UB, but usize as *mut T might be ok ?

gnzlbg (Aug 14 2019 at 19:40, on Zulip):

what if the usize points to immutable memory, and I do usize as *mut T ?

RalfJ (Aug 14 2019 at 19:44, on Zulip):

what if the usize points to immutable memory, and I do usize as *mut T ?

as *mut is only allowed from an &mut. so how do you do that?

gnzlbg (Aug 14 2019 at 19:47, on Zulip):

transmute then

RalfJ (Aug 14 2019 at 19:54, on Zulip):

you will in the end have derived that pointer from an &

RalfJ (Aug 14 2019 at 19:54, on Zulip):

and that's UB

RalfJ (Aug 14 2019 at 19:54, on Zulip):

I don't think you can avoid that

gnzlbg (Aug 14 2019 at 19:55, on Zulip):

makes sense

gnzlbg (Aug 14 2019 at 19:55, on Zulip):

I thought one could do a usize as *mut without going through an &mut T

RalfJ (Aug 14 2019 at 19:55, on Zulip):

but how do you get the usize?

gnzlbg (Aug 14 2019 at 19:56, on Zulip):

i think that if &raw returns a pointer from which no pointers that can write can be obtained, then all is good

RalfJ (Aug 14 2019 at 19:56, on Zulip):

at this point the result heavily depends on lots of details -- without an executable example program, the answer will always be "it depends"^^

gnzlbg (Aug 14 2019 at 19:56, on Zulip):

if you have a static variable, the usize with its address can be another static

gnzlbg (Aug 14 2019 at 19:56, on Zulip):

the linker can fill those appropriately (a bit unnecessary, but bear with me)

RalfJ (Aug 14 2019 at 19:57, on Zulip):

heh, pointers created by CTFE ;) I thought about that. no idea what the rules should be.

Lokathor (Aug 15 2019 at 03:58, on Zulip):

how do you get that usize?

It could always be a fixed hardware address ;3

because you totally can write (0x0400_0000_usize as *mut u16).write_volatile(1);, and however else you decide the rest of pointers work, Rust needs to support that sort of expression working on many targets of varying obscurity.
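A hosted program cannot safely touch an address like 0x0400_0000, so the sketch below swaps in the address of a local variable as a stand-in for a hardware register. The names `poke_u16` and `fake_register` are hypothetical; only the integer-to-pointer cast followed by `write_volatile` mirrors the expression above.

```rust
/// Performs a volatile 16-bit write to a pointer made from an integer address.
/// (Hypothetical helper, for illustration only.)
///
/// # Safety
/// `addr` must be the address of a live, writable, properly aligned u16.
unsafe fn poke_u16(addr: usize, value: u16) {
    unsafe { (addr as *mut u16).write_volatile(value) }
}

/// Stand-in for a memory-mapped register: on real hardware the address
/// would be a fixed constant defined by the platform, not a local.
fn volatile_demo() -> u16 {
    let mut fake_register: u16 = 0;
    let addr = &mut fake_register as *mut u16 as usize;
    unsafe { poke_u16(addr, 1) };
    fake_register
}
```

Whether a write to a genuinely made-up address is permitted remains a property of the target, as the following messages say.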

Lokathor (Aug 15 2019 at 04:03, on Zulip):

of course, I think that having a 98% complete memory model and saying "also the hardware is allowed to do its own extra things" is acceptable

Lokathor (Aug 15 2019 at 04:03, on Zulip):

so "that's platform specific" is fine

RalfJ (Aug 15 2019 at 08:31, on Zulip):

you are mixing "made-up" integer addresses and volatile. two complex subjects. yeah it'll be a bit until we get there. ;)

RalfJ (Aug 15 2019 at 08:31, on Zulip):

for "made-up" addresses, I think we can deal with them by making the Abstract Machine "open" -- it does not assume total knowledge about which allocations exist in memory

Last update: Nov 19 2019 at 18:50UTC