Stream: t-lang/wg-unsafe-code-guidelines

Topic: arch volatile operations


gnzlbg (Jun 27 2019 at 09:31, on Zulip):

It was tangentially discussed that we could add volatile load/stores to core::arch with stronger guarantees.

This is a sketch of what that could look like: https://gist.github.com/gnzlbg/c8ef62c12e692a420face245c5df7123

RalfJ (Jun 27 2019 at 17:11, on Zulip):

that's the tearing thing, right?
the alternative is to do what C++ did: some kind of compiler-implemented trait VolatileAccess { const DOES_NOT_TEAR: bool; }

RalfJ (Jun 27 2019 at 17:15, on Zulip):

@gnzlbg what does "with side-effects that freezes its result" mean? in particular I am confused by the "side-effects" part

RalfJ (Jun 27 2019 at 17:16, on Zulip):

is that any different from just "that freezes its result"?

RalfJ (Jun 27 2019 at 17:16, on Zulip):

in particular, this should not have the side-effect of freezing whatever is in memory at that position

gnzlbg (Jun 27 2019 at 17:16, on Zulip):

side-effects is that it is an unknown function

gnzlbg (Jun 27 2019 at 17:16, on Zulip):

freeze is that the value that is loaded, is frozen

gnzlbg (Jun 27 2019 at 17:17, on Zulip):

like:

{
   // plain load; the result may be undef if the memory is uninitialized
   let r = *ptr;
   // freeze: fix every undef bit to some arbitrary concrete value
   freeze(r)
}
gnzlbg (Jun 27 2019 at 17:17, on Zulip):

so even if the memory is undef, these never return undef

RalfJ (Jun 27 2019 at 17:17, on Zulip):

oh, so that's:

16/32/64-bit atomic relaxed load from aligned x
- with side-effects ( -> volatile)
- that freezes its result.

that wasnt clear :D

gnzlbg (Jun 27 2019 at 17:18, on Zulip):

yeah, i should use "and" or commas :P

RalfJ (Jun 27 2019 at 17:18, on Zulip):

I read it as "with (side-effects that freezes its result)"

gnzlbg (Jun 27 2019 at 17:18, on Zulip):

maybe "with side-effects and its result is frozen" ?

RalfJ (Jun 27 2019 at 17:19, on Zulip):

I dont think many people will get the "with side-effects" part

RalfJ (Jun 27 2019 at 17:19, on Zulip):

In the one-sentence version I'd just use the adjective "volatile"

RalfJ (Jun 27 2019 at 17:20, on Zulip):

and then add a paragraph explaining that "volatile" basically means the read is specified to have side effects that are unknown to the compiler, which implies it cannot be duplicated or removed

RalfJ (Jun 27 2019 at 17:20, on Zulip):

-> MMIO
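
A minimal sketch of the MMIO-style use being alluded to, using today's core::ptr::read_volatile (the device register address is invented for illustration):

use core::ptr;

// Hypothetical memory-mapped status register; the address is made up.
const STATUS_REG: *const u32 = 0x4000_0000 as *const u32;

fn wait_until_ready() {
    // Every iteration must perform a real load: the device can change the
    // register at any time, so the compiler may neither hoist the load out
    // of the loop nor drop "redundant" reads.
    while (unsafe { ptr::read_volatile(STATUS_REG) } & 0x1) == 0 {
        // spin
    }
}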

gnzlbg (Jun 27 2019 at 17:20, on Zulip):

or reordered across other volatile operations

RalfJ (Jun 27 2019 at 17:21, on Zulip):

or reordered across syscalls or whatever

gnzlbg (Jun 27 2019 at 17:21, on Zulip):

when I started writing that, I used the term "synchronizes with volatile in the same thread of execution"

RalfJ (Jun 27 2019 at 17:21, on Zulip):

no that's "atomic" terminology

gnzlbg (Jun 27 2019 at 17:21, on Zulip):

that is, there is no "happens before" across threads, but within the same thread

RalfJ (Jun 27 2019 at 17:22, on Zulip):

which seems wrong here

gnzlbg (Jun 27 2019 at 17:22, on Zulip):

you can reorder some stuff around volatile loads

RalfJ (Jun 27 2019 at 17:22, on Zulip):

speaking of which:
"This operation is not atomic but it is data-race free."

here two meanings of the word "atomic" collide :/
in my world, being data-race free is the definition of "atomic"

gnzlbg (Jun 27 2019 at 17:22, on Zulip):

if you are accessing the memory using volatile and non volatile operations

RalfJ (Jun 27 2019 at 17:22, on Zulip):

sure, just like with other externally observable behavior

RalfJ (Jun 27 2019 at 17:23, on Zulip):

you dont need happens-before to specify that

RalfJ (Jun 27 2019 at 17:24, on Zulip):

for the "not atomic but data-race-free", I'd say something like "this operation consists of multiple atomic operations -- so it causes no data races, but the result might be a mix of values written at different times".
or maybe "this operation is not atomic in the sense that it does not behave transactional, but it is atomic in the sense that it does not cause data races"

gnzlbg (Jun 27 2019 at 17:25, on Zulip):

it kind of says that

This operation is not atomic but it is data-race free.

The load from x is performed by (multiple) smaller or equally-wide volatile atomic loads in an unspecified order.

gnzlbg (Jun 27 2019 at 17:26, on Zulip):

How about: The operation is not an atomic load, but it loads the memory in a data-race free way, by performing (potentially multiple) smaller or equally-wide volatile atomic loads in an unspecified order.

RalfJ (Jun 27 2019 at 17:27, on Zulip):

what about "is not a single atomic load"?^^

gnzlbg (Jun 27 2019 at 17:29, on Zulip):

The operation is not necessarily a single atomic load. The memory is read in a data-race free way by performing either a single volatile atomic load, or multiple smaller volatile atomic loads in an unspecified order.
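
For illustration, a rough sketch of that "multiple smaller volatile atomic loads" reading, as a hypothetical helper (little-endian assumed; whether a 32-bit volatile load is actually atomic is target-dependent):

use core::ptr;

// Hypothetical: read a u64 as two 32-bit volatile loads in some order.
// Each half is a single volatile access (atomic on the assumed target),
// so there is no data race, but the two halves may observe writes made
// at different times -- i.e. the result may tear.
unsafe fn volatile_load_u64_split(p: *const u64) -> u64 {
    let lo = ptr::read_volatile(p as *const u32) as u64;          // low half (little-endian)
    let hi = ptr::read_volatile((p as *const u32).add(1)) as u64; // high half
    lo | (hi << 32)
}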

gnzlbg (Jun 27 2019 at 17:33, on Zulip):

So @RalfJ i've updated the gist a bit.

gnzlbg (Jun 27 2019 at 17:33, on Zulip):

I think we could nail down the finer print, if this ever gets RFCed

RalfJ (Jun 27 2019 at 17:33, on Zulip):

sounds good

gnzlbg (Jun 27 2019 at 17:34, on Zulip):

I think we could retrofit read_volatile to have these semantics

RalfJ (Jun 27 2019 at 17:34, on Zulip):

what about this:

that's the tearing thing, right?
the alternative is to do what C++ did: some kind of compiler-implemented trait VolatileAccess { const DOES_NOT_TEAR: bool; }
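
A rough sketch of what such a compiler-implemented trait could look like on the Rust side (purely hypothetical; the names are taken from the message above, the helper function is invented):

// Hypothetical marker trait: the compiler would implement it for every T,
// with DOES_NOT_TEAR true exactly when a volatile access to T lowers to a
// single atomic memory access on the target.
pub unsafe trait VolatileAccess {
    const DOES_NOT_TEAR: bool;
}

// Callers could then assert non-tearing where they need it:
fn requires_untorn<T: VolatileAccess>() {
    assert!(T::DOES_NOT_TEAR, "volatile accesses to this type may tear");
}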

RalfJ (Jun 27 2019 at 17:34, on Zulip):

your approach shares the problem of the Atomic* types that e.g. doing it on a (u8, u16) requires some awful code

gnzlbg (Jun 27 2019 at 17:34, on Zulip):

what should implement the trait?

RalfJ (Jun 27 2019 at 17:35, on Zulip):

everything that's small enough to be lowered to a single volatile atomic access

gnzlbg (Jun 27 2019 at 17:35, on Zulip):

note that because it freezes the result, it should have less problems than the atomics

RalfJ (Jun 27 2019 at 17:35, on Zulip):

fair

gnzlbg (Jun 27 2019 at 17:35, on Zulip):

maybe we should do that for the atomics too..

gnzlbg (Jun 27 2019 at 17:35, on Zulip):

so only aligned pointers are supported right?

RalfJ (Jun 27 2019 at 17:35, on Zulip):

that precludes "de-atomization" optimizations

RalfJ (Jun 27 2019 at 17:35, on Zulip):

supported for what?

gnzlbg (Jun 27 2019 at 17:35, on Zulip):

like the current read_volatile is UB if the pointer is not aligned

RalfJ (Jun 27 2019 at 17:36, on Zulip):

yes

gnzlbg (Jun 27 2019 at 17:36, on Zulip):

note that the last set of intrinsics in the gist, supports unaligned loads

RalfJ (Jun 27 2019 at 17:36, on Zulip):

people asked for a read_volatile_unaligned but I am not sure if it was ever added

RalfJ (Jun 27 2019 at 17:36, on Zulip):

looks like it was added as intrinsic but not exposed? weird

RalfJ (Jun 27 2019 at 17:36, on Zulip):

https://doc.rust-lang.org/nightly/core/intrinsics/fn.unaligned_volatile_load.html

gnzlbg (Jun 27 2019 at 17:37, on Zulip):

so if your pointer is unaligned, the chances that you get tearing are super high

gnzlbg (Jun 27 2019 at 17:37, on Zulip):

even on x86_64

RalfJ (Jun 27 2019 at 17:37, on Zulip):

if an operation does not require alignment you should say that explicitly

RalfJ (Jun 27 2019 at 17:37, on Zulip):

it's kind of the default to require alignment ;)

gnzlbg (Jun 27 2019 at 17:38, on Zulip):

ok so I should change the names in the gist

RalfJ (Jun 27 2019 at 17:38, on Zulip):

anyway I am not saying I have a preference either way, just pointing out that C++ followed a different route

gnzlbg (Jun 27 2019 at 17:38, on Zulip):

i prefer to have aligned in the name, because that's something that the user should know about, unaligned means we accept everything

RalfJ (Jun 27 2019 at 17:38, on Zulip):

ok so I should change the names in the gist

yes, we usually only add unaligned explicitly but not aligned

RalfJ (Jun 27 2019 at 17:38, on Zulip):

I understand the sentiment but it's not how any other API works, so users just have to learn that

gnzlbg (Jun 27 2019 at 17:38, on Zulip):

that feels a bit like adding a _this_function_does_not_have_preconditions :D

gnzlbg (Jun 27 2019 at 17:39, on Zulip):

but I'll change that, so naming aside, the trait

gnzlbg (Jun 27 2019 at 17:39, on Zulip):

that can work

RalfJ (Jun 27 2019 at 17:39, on Zulip):

or you convince t-libs to deprecate ptr::read in favor of ptr::read_aligned :D

gnzlbg (Jun 27 2019 at 17:39, on Zulip):

but we need to implement it for types of certain layout

RalfJ (Jun 27 2019 at 17:39, on Zulip):

yes it would be a weird trait

gnzlbg (Jun 27 2019 at 17:39, on Zulip):

so I don't know how we could do that

RalfJ (Jun 27 2019 at 17:40, on Zulip):

an alternative might be a const fn volatile_access_does_not_tear<T>() -> bool
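
A sketch of that alternative (hypothetical; a real version would be answered by the compiler, e.g. via an intrinsic inspecting the type's layout, rather than by this placeholder logic):

use core::mem;

// Placeholder condition: "fits in one aligned machine word", standing in
// for whatever the target actually guarantees about single-copy atomicity.
pub const fn volatile_access_does_not_tear<T>() -> bool {
    mem::size_of::<T>() != 0
        && mem::size_of::<T>() <= mem::size_of::<usize>()
        && mem::size_of::<T>().is_power_of_two()
        && mem::align_of::<T>() >= mem::size_of::<T>()
}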

RalfJ (Jun 27 2019 at 17:40, on Zulip):

maybe that makes more sense

gnzlbg (Jun 27 2019 at 17:40, on Zulip):

I think that such an API could be built on top of the core::arch intrinsics

gnzlbg (Jun 27 2019 at 17:40, on Zulip):

in a library

RalfJ (Jun 27 2019 at 17:40, on Zulip):

and then it's easy to look at the layout

gnzlbg (Jun 27 2019 at 17:40, on Zulip):

depends if you want to support all Ts with the same layout as e.g. u8, or you are ok with u8

gnzlbg (Jun 27 2019 at 17:41, on Zulip):

like if you have a different type, you only need *mut T as _ in the function call

RalfJ (Jun 27 2019 at 17:41, on Zulip):

the implementation of a polymorphic const fn can just look at the TyLayout, check the Abi, and work with that

RalfJ (Jun 27 2019 at 17:41, on Zulip):

like if you have a different type, you only need *mut T as _ in the function call

what do you mean?

gnzlbg (Jun 27 2019 at 17:42, on Zulip):

I don't think we need const fn, we need something like what transmute did, before it was const fn (although these should be const fn)

RalfJ (Jun 27 2019 at 17:42, on Zulip):

...?

gnzlbg (Jun 27 2019 at 17:42, on Zulip):

i mean calling, e.g., volatile_load_u16

RalfJ (Jun 27 2019 at 17:42, on Zulip):

I have no idea what you mean

RalfJ (Jun 27 2019 at 17:42, on Zulip):

or what problem you are even talking about^^

gnzlbg (Jun 27 2019 at 17:43, on Zulip):

you mentioned that if we don't have a trait, user code would be weird for (u8, u32), but it looks like this:

let x: (u8, u32) = (0, 0);
let y: *const (u8, u32) = &x;
// hypothetical intrinsic from the gist; `as _` casts to the pointer type it expects
let r = unsafe { core::arch::volatile_load_u32(y as _) };
RalfJ (Jun 27 2019 at 17:43, on Zulip):

but now r has the wrong type

RalfJ (Jun 27 2019 at 17:44, on Zulip):

and also x as _ gives me shivers^^

RalfJ (Jun 27 2019 at 17:44, on Zulip):

raw ptr casts are already dangerous when they are fully spelled out...

gnzlbg (Jun 27 2019 at 17:44, on Zulip):

i mean, we could do a transmute

gnzlbg (Jun 27 2019 at 17:44, on Zulip):

internally

gnzlbg (Jun 27 2019 at 17:45, on Zulip):

core::arch::volatile_load_32(y); would cast internally y to a *const u32, do the load, and transmute back to T

RalfJ (Jun 27 2019 at 17:45, on Zulip):

volatile_load_u32<T>(x: *const T) -> T where mem::size_of::<T>() == 4?
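
Spelled out as a sketch (hypothetical function; the size bound becomes an assert since a `where mem::size_of::<T>() == 4` clause is not expressible today, and the pointer is assumed to be valid and 4-byte aligned):

use core::{mem, ptr};

// Hypothetical: perform the access as a single 32-bit volatile load and
// reinterpret the bits as T. Caller must ensure `x` is valid and 4-aligned.
pub unsafe fn volatile_load_32<T>(x: *const T) -> T {
    assert!(mem::size_of::<T>() == 4);
    let bits: u32 = ptr::read_volatile(x as *const u32);
    mem::transmute_copy(&bits)
}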

gnzlbg (Jun 27 2019 at 17:45, on Zulip):

yeah

RalfJ (Jun 27 2019 at 17:45, on Zulip):

that would make u32 a misnomer

RalfJ (Jun 27 2019 at 17:45, on Zulip):

but, sure, many things one can try :D

gnzlbg (Jun 27 2019 at 17:45, on Zulip):

(I changed it above)

gnzlbg (Jun 27 2019 at 17:45, on Zulip):

but yeah

RalfJ (Jun 27 2019 at 17:45, on Zulip):

I have another concern though: calling this a "relaxed" access

RalfJ (Jun 27 2019 at 17:45, on Zulip):

I am not sure if that's correct

gnzlbg (Jun 27 2019 at 17:46, on Zulip):

i used that to save words

RalfJ (Jun 27 2019 at 17:46, on Zulip):

but that has a very specific technical meaning

RalfJ (Jun 27 2019 at 17:46, on Zulip):

and I dont think we want that

gnzlbg (Jun 27 2019 at 17:46, on Zulip):

otherwise I need wording about data-race freedom in the other variants

gnzlbg (Jun 27 2019 at 17:46, on Zulip):

well the accesses are atomic, and the ordering is relaxed (they don't synchronize)

RalfJ (Jun 27 2019 at 17:46, on Zulip):

release_fence();relaxed_store(x, 4); is "almost" the same as release_store(x, 4);

RalfJ (Jun 27 2019 at 17:46, on Zulip):

I dont think we want to guarantee that for volatile stores

RalfJ (Jun 27 2019 at 17:47, on Zulip):

relaxed does synchronize when it is program-order-after a release fence

RalfJ (Jun 27 2019 at 17:47, on Zulip):

well in some sense

RalfJ (Jun 27 2019 at 17:48, on Zulip):

what I mean is that you can implement synchronization with relaxed accesses and fences

RalfJ (Jun 27 2019 at 17:48, on Zulip):

and I dont know if we can guarantee that you can implement synchronization with volatile accesses and fences.

RalfJ (Jun 27 2019 at 17:50, on Zulip):

ah I got the order wrong, dang^^ (fixed above)

gnzlbg (Jun 27 2019 at 17:50, on Zulip):

so I thought that a relaxed load / store could be reordered across a release fence

RalfJ (Jun 27 2019 at 17:50, on Zulip):

so the rule is something like: release-fence program-order-before relaxed store which is read from by a relaxed load which is program-order-before an acquire fence... then we have a happens-before between the fences

RalfJ (Jun 27 2019 at 17:51, on Zulip):

they can be reordered one way but not the other

RalfJ (Jun 27 2019 at 17:51, on Zulip):

you can NOT move a relaxed store UP to before a release-fence

RalfJ (Jun 27 2019 at 17:51, on Zulip):

because that would kill the synchronization in the rule I just mentioned

RalfJ (Jun 27 2019 at 17:52, on Zulip):

(I hope I am getting the details right, but I know for sure that some pattern like this is legal for programmers, and hence a restriction for compilers)

gnzlbg (Jun 27 2019 at 17:52, on Zulip):

so you are right

RalfJ (Jun 27 2019 at 17:53, on Zulip):

there's some explanation of this at http://plv.mpi-sws.org/fsl/base/paper.pdf... at least that's the best kind of explanation for me, YMMV ;)

gnzlbg (Jun 27 2019 at 17:53, on Zulip):

a release_fence followed by a relaxed store is a release_store, and a relaxed_load followed by an acquire_fence is an acquire_load

RalfJ (Jun 27 2019 at 17:53, on Zulip):

no that's not correct either

gnzlbg (Jun 27 2019 at 17:54, on Zulip):

acquire_load sorry

RalfJ (Jun 27 2019 at 17:54, on Zulip):

no thats not it

RalfJ (Jun 27 2019 at 17:54, on Zulip):

a release-fence followed by a relaxed-store will build up happens-before only with something followed by an acquire fence

RalfJ (Jun 27 2019 at 17:54, on Zulip):

whereas a release-store will build up happens-before only with an acquire load to the same location

gnzlbg (Jun 27 2019 at 17:55, on Zulip):

a relaxed_load followed by an acquire_fence is an acquire_load

RalfJ (Jun 27 2019 at 17:55, on Zulip):

no

RalfJ (Jun 27 2019 at 17:55, on Zulip):

let me type out the counterexample

RalfJ (Jun 27 2019 at 17:55, on Zulip):

(you are right in x86/ARM but not in C/C++/Rust)

RalfJ (Jun 27 2019 at 17:56, on Zulip):

(well for x86 it's moot because TSO but whatever^^)

gnzlbg (Jun 27 2019 at 17:56, on Zulip):

coming back to your point

gnzlbg (Jun 27 2019 at 17:57, on Zulip):

if we use the term atomic relaxed, then the fences would synchronize with volatile operations, and we don't want that

RalfJ (Jun 27 2019 at 17:57, on Zulip):

Thread 1:

data = 32; // non-atomic
fence_release();
store_relaxed(flag, 1);

Thread 2:

while load_relaxed(flag) == 0 { }
fence_acquire();
print(data); // non-atomic
RalfJ (Jun 27 2019 at 17:57, on Zulip):

this code is okay and data-race-free

RalfJ (Jun 27 2019 at 17:58, on Zulip):

now after replacing the release-fence-followed-by-relaxed-store with a single release store we have
Thread 1:

data = 32; // non-atomic
store_release(flag, 1);

Thread 2:

while load_relaxed(flag) == 0 { }
fence_acquire();
print(data); // non-atomic

This code has a data race and is UB.
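
For reference, the first (race-free) pattern above as runnable Rust, using std::sync::atomic fences and a relaxed flag (the static names are invented for the example):

use std::sync::atomic::{fence, AtomicU32, Ordering};
use std::thread;

static FLAG: AtomicU32 = AtomicU32::new(0);
static mut DATA: u32 = 0;

fn main() {
    let t1 = thread::spawn(|| {
        unsafe { DATA = 32 };             // non-atomic write
        fence(Ordering::Release);         // release fence, program-order-before the relaxed store
        FLAG.store(1, Ordering::Relaxed);
    });
    let t2 = thread::spawn(|| {
        while FLAG.load(Ordering::Relaxed) == 0 {} // spin on the relaxed load
        fence(Ordering::Acquire);         // acquire fence, program-order-after the relaxed load
        // the two fences now synchronize, so this non-atomic read is race-free
        println!("{}", unsafe { DATA });
    });
    t1.join().unwrap();
    t2.join().unwrap();
}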

RalfJ (Jun 27 2019 at 17:58, on Zulip):

and that's because fences only sync with fences

RalfJ (Jun 27 2019 at 17:58, on Zulip):

but there's just one fence here so nothing it can sync with

RalfJ (Jun 27 2019 at 17:59, on Zulip):

argh ordered my fences wrong again^^ fixing that

gnzlbg (Jun 27 2019 at 17:59, on Zulip):

hmm

gnzlbg (Jun 27 2019 at 18:00, on Zulip):

you would need to replace the code in thread 2 with an acquire_load to fix the data race

RalfJ (Jun 27 2019 at 18:00, on Zulip):

yes

gnzlbg (Jun 27 2019 at 18:00, on Zulip):

that's what I was trying to say

RalfJ (Jun 27 2019 at 18:00, on Zulip):

so a release-store/acquire-load pair has basically the same effect as a release-relaxed-relaxed-acquire fence-store-load-fence quadruple

gnzlbg (Jun 27 2019 at 18:00, on Zulip):

but I see your point, in that the relaxed load and stores do not synchronize with anything, its the fences

RalfJ (Jun 27 2019 at 18:01, on Zulip):

but you have to use the same thing "on both sides"

gnzlbg (Jun 27 2019 at 18:01, on Zulip):

so can we use "atomic relaxed" as wording for the atomic volatile operations ?

gnzlbg (Jun 27 2019 at 18:01, on Zulip):

they don't synchronize with anything

RalfJ (Jun 27 2019 at 18:01, on Zulip):

(and in terms of efficiency, the fences are more likely to "accidentally" synchronize with other things as they are less specific... but then you get better control, like in my example where you get an acquire fence only after the successful read)

gnzlbg (Jun 27 2019 at 18:01, on Zulip):

you would need two fences, and then its up to the fences

RalfJ (Jun 27 2019 at 18:01, on Zulip):

no, the fences on their own dont do anything

RalfJ (Jun 27 2019 at 18:02, on Zulip):

its the release-relaxed-relaxed-acquire fence-store-load-fence quadruple that does the synchronization

RalfJ (Jun 27 2019 at 18:02, on Zulip):

and we dont want to include volatile accesses in those

RalfJ (Jun 27 2019 at 18:02, on Zulip):

that's why we cannot call them "relaxed"

gnzlbg (Jun 27 2019 at 18:02, on Zulip):

i thought the fences synchronize, and this determines which stores are seen by which loads, etc.

RalfJ (Jun 27 2019 at 18:03, on Zulip):

there's no way to define what it means for two fences to synchronize, in the axiomatic model

RalfJ (Jun 27 2019 at 18:03, on Zulip):

synchronization is ultimately always seeded by a reads-from relationship -- some thread read a thing and thereby observed some other thread's write

RalfJ (Jun 27 2019 at 18:03, on Zulip):

once that happens, sometimes this induces a "synchronizes-with" (which implies happens-before)

gnzlbg (Jun 27 2019 at 18:04, on Zulip):

so I always thought of this as the fences synchronize with each other, and that determines which stores happen before which loads

RalfJ (Jun 27 2019 at 18:04, on Zulip):

and that "sometimes" is: either the write is a release store and the read is an acquire load, or (the 2nd clause) the write is program-order-after a release fence and the read is program-order-before an acquire fence

RalfJ (Jun 27 2019 at 18:05, on Zulip):

for the 2nd clause, the store and load still need to be at least relaxed

RalfJ (Jun 27 2019 at 18:05, on Zulip):

and I'd like to avoid including volatile there

gnzlbg (Jun 27 2019 at 18:05, on Zulip):

we have to, otherwise a volatile write can be observed by a relaxed read

RalfJ (Jun 27 2019 at 18:05, on Zulip):

what do you mean?

gnzlbg (Jun 27 2019 at 18:06, on Zulip):

for MMIO, it would mean that a non-volatile read was performed

RalfJ (Jun 27 2019 at 18:06, on Zulip):

so I always thought of this as the fences synchronize with each other, and that determines which stores happen before which loads

well doing the same thing with non-atomics is still a data race despite the fences, so... that's not quite it in C11 I'm afraid.

gnzlbg (Jun 27 2019 at 18:06, on Zulip):

as in, if you are doing MMIO, the reads and the writes must be volatile, or UB probably

RalfJ (Jun 27 2019 at 18:06, on Zulip):

no I mean what do we have to do, and who can observe what if we dont?^^

gnzlbg (Jun 27 2019 at 18:07, on Zulip):

i mean that if we say that a volatile store is atomic relaxed, then a non-volatile relaxed load in another thread would be ok, depending on fences

gnzlbg (Jun 27 2019 at 18:07, on Zulip):

but we probably want to make that "not ok", and require the load to be volatile as well

gnzlbg (Jun 27 2019 at 18:07, on Zulip):

and also prevent people from using volatile for intra-thread synchronization

RalfJ (Jun 27 2019 at 18:08, on Zulip):

yes. in fact I'd say even if the load is also volatile, it should still yield undef as this is a data race

gnzlbg (Jun 27 2019 at 18:08, on Zulip):

yes

RalfJ (Jun 27 2019 at 18:08, on Zulip):

the exemption from data races is just for UB-ness, not for whether they happen

RalfJ (Jun 27 2019 at 18:08, on Zulip):

the docs should probably say that^^

gnzlbg (Jun 27 2019 at 18:08, on Zulip):

well... if we use freeze, undef will never be returned

RalfJ (Jun 27 2019 at 18:08, on Zulip):

yes but that's still very different from the normal atomic relaxed

RalfJ (Jun 27 2019 at 18:08, on Zulip):

which returns the old or the new value

gnzlbg (Jun 27 2019 at 18:09, on Zulip):

well its not atomic

gnzlbg (Jun 27 2019 at 18:09, on Zulip):

so you are not guaranteed either the old or the new, you can get something in between

RalfJ (Jun 27 2019 at 18:09, on Zulip):

if its relaxed its atomic, otherwise the term makes no sense^^

RalfJ (Jun 27 2019 at 18:09, on Zulip):

so you are not guaranteed either the old or the new, you can get something in between

you can get anything. I've been told about weird compiler optimizations where you can even see values that never were there

RalfJ (Jun 27 2019 at 18:09, on Zulip):

this can occur due to reorderings

gnzlbg (Jun 27 2019 at 18:10, on Zulip):

well volatile cannot be reordered around volatile

RalfJ (Jun 27 2019 at 18:10, on Zulip):

they were never there in the source program but reorderings make them "be there" in the assembly

RalfJ (Jun 27 2019 at 18:10, on Zulip):

but the non-atomic write from which this volatile read reads is subject to all optimizations

RalfJ (Jun 27 2019 at 18:10, on Zulip):

so we cant guarantee that you will only read a combination of the values you have written

RalfJ (Jun 27 2019 at 18:11, on Zulip):

maybe if all accesses to this location are volatile... but why would we want a special case for that?

gnzlbg (Jun 27 2019 at 18:11, on Zulip):

well C kind of does

gnzlbg (Jun 27 2019 at 18:11, on Zulip):

the data is volatile, not the accesses

RalfJ (Jun 27 2019 at 18:11, on Zulip):

some of the proposals that are linked are suggesting to change that, IIRC

RalfJ (Jun 27 2019 at 18:11, on Zulip):

also LLVM doesnt work that way

RalfJ (Jun 27 2019 at 18:12, on Zulip):

and IMO LLVM's model is better

RalfJ (Jun 27 2019 at 18:12, on Zulip):

also "de-facto C" has a notion of volatile accesses and the Linux kernel and many more programs rely on that to work

RalfJ (Jun 27 2019 at 18:12, on Zulip):

the C standard basically just ignores reality here

gnzlbg (Jun 27 2019 at 18:13, on Zulip):

so are volatile reads/writes to the same memory from different threads without synchronization UB ?

gnzlbg (Jun 27 2019 at 18:13, on Zulip):

i feel I am back to square 1 :D

RalfJ (Jun 27 2019 at 18:13, on Zulip):

so, anyway, coming back to your proposal... (a) please avoid the term "relaxed", and (b) when you say there are no data races, maybe it's better to say that data races are not UB but return initialized data?

RalfJ (Jun 27 2019 at 18:14, on Zulip):

that's for volatile reads

RalfJ (Jun 27 2019 at 18:14, on Zulip):

what to do about stores... not sure. write-write races are UB in LLVM, so if we want two concurrent volatile writes to the same location NOT UB, we'd have to ask LLVM first to give us better guarantees.

RalfJ (Jun 27 2019 at 18:15, on Zulip):

so are volatile reads/writes to the same memory from different threads without synchronization UB ?

the read vs write thing makes a difference here ;) if both are writes, dunno (yes if you ask the LLVM LangRef). otherwise (if at least one is a read), no.

gnzlbg (Jun 27 2019 at 18:17, on Zulip):

is that true for volatile writes as well?

RalfJ (Jun 27 2019 at 18:17, on Zulip):

I have not seen an exception for volatile writes

RalfJ (Jun 27 2019 at 18:18, on Zulip):

nor for volatile reads, btw -- in LLVM, reads never cause UB due to data races

RalfJ (Jun 27 2019 at 18:18, on Zulip):

they just return undef

gnzlbg (Jun 28 2019 at 11:06, on Zulip):

so I was wondering if we should say that volatile load / stores have side-effects / are unknown functions at all

gnzlbg (Jun 28 2019 at 11:08, on Zulip):

let a: *const u32 = /* ... */;
let b: *const u32 = /* ... */;
assert_ne!(a as usize, b as usize);
let x = unsafe { a.read_volatile() };
let y = unsafe { b.read() };

LLVM is allowed to re-order the b.read() before the a.read_volatile() AFAICT

gnzlbg (Jun 28 2019 at 11:08, on Zulip):

if read_volatile was an unknown function or had unknown side-effects, this isn't possible, because those side-effects could modify the memory at b (and the re-ordering would change program semantics).

gnzlbg (Jun 28 2019 at 11:09, on Zulip):

so what we really want to say is that the compiler must emit volatile loads and stores, and that these cannot be re-ordered across other volatile loads/stores

gnzlbg (Jun 28 2019 at 11:10, on Zulip):

and just leave it as that
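
A small illustration of that reading with volatile stores (device registers invented for the example): both stores must actually be emitted, and in this order.

use core::ptr;

// Invented MMIO registers, for illustration only.
const DATA_REG: *mut u32 = 0x4000_0004 as *mut u32;
const CMD_REG: *mut u32 = 0x4000_0008 as *mut u32;

unsafe fn send(cmd: u32, payload: u32) {
    // Neither store may be merged, dropped, or reordered past the other:
    // each volatile access must appear in the emitted code, in program order.
    ptr::write_volatile(DATA_REG, payload);
    ptr::write_volatile(CMD_REG, cmd);
}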

gnzlbg (Jun 28 2019 at 11:45, on Zulip):

I've updated the gist with write operations: https://gist.github.com/gnzlbg/c8ef62c12e692a420face245c5df7123

RalfJ (Jun 28 2019 at 16:27, on Zulip):

if read_volatile was an unknown function or had unknown side-effects, this isn't possible, because those side-effects could modify the memory at b (and the re-ordering would change program semantics).

that's why I wrote, when I specified volatile, that they are unknown function calls that LLVM may make assumptions about. namely... <finding my own post to copy-paste>

RalfJ (Jun 28 2019 at 16:27, on Zulip):

"doesn't mutate any memory I know about that is not aliased with x..x+size"

("I" = the compiler)

RalfJ (Jun 28 2019 at 16:28, on Zulip):

probably even stronger, doesn't access that memory

RalfJ (Jun 28 2019 at 16:28, on Zulip):

but if you want to avoid that, please still talk about them being externally observable events

RalfJ (Jun 28 2019 at 16:28, on Zulip):

such that they cannot be reordered wrt other externally observable events -- in particular, other volatile accesses

RalfJ (Jun 28 2019 at 16:29, on Zulip):

but also syscalls

Last update: Nov 20 2019 at 11:50UTC