Stream: t-lang/wg-unsafe-code-guidelines

Topic: AtomicCell and padding


RalfJ (Jun 18 2019 at 17:19, on Zulip):

AtomicCell<T> could use ptr::write_fields to ensure that the padding bytes of the T stored inside it is always zero (or some other value)

I don't see how that would work. After moving the AtomicCell, padding would be bad again.

gnzlbg (Jun 18 2019 at 17:30, on Zulip):

@RalfJ as mentioned, the AtomicCell does not store a T, it stores a usize, or [u8; 16]. T is only used in the API with its users.

RalfJ (Jun 18 2019 at 17:31, on Zulip):

@gnzlbg but it's supposed to work for any T

RalfJ (Jun 18 2019 at 17:32, on Zulip):

so at the least you'd need enogh const generics to make it [u8; mem::size_of::<T>()]

gnzlbg (Jun 18 2019 at 17:32, on Zulip):

how can it be atomic for any T ?

RalfJ (Jun 18 2019 at 17:32, on Zulip):

once it becomes too large it switches to seqlocks

gnzlbg (Jun 18 2019 at 17:32, on Zulip):

I thought it only works for T that fit in actual hardware atomics and then it switches to locks

RalfJ (Jun 18 2019 at 17:32, on Zulip):

but size of AtomicCell<T> must be the same as T, not wasting anything for small types and scaling to any type

gnzlbg (Jun 18 2019 at 17:33, on Zulip):

so if it switches to locks, the compare_and_exchange function does something completely different

RalfJ (Jun 18 2019 at 17:33, on Zulip):

maybe have a look at the code :) https://github.com/crossbeam-rs/crossbeam/blob/master/crossbeam-utils/src/atomic/atomic_cell.rs

RalfJ (Jun 18 2019 at 17:33, on Zulip):

yes but it still does something wrong because it compares undef

gnzlbg (Jun 18 2019 at 17:34, on Zulip):

Does it do a byte-wise comparison for the case in which T is too large ?

RalfJ (Jun 18 2019 at 17:34, on Zulip):

but yes the underlying compare_exchange is only used for small types

gnzlbg (Jun 18 2019 at 17:34, on Zulip):

I thought it would just call PartialEq or whatever there

RalfJ (Jun 18 2019 at 17:34, on Zulip):

it also does the bytewise thing, I forgot why

gnzlbg (Jun 18 2019 at 17:35, on Zulip):

no idea why that would be necessary, if you take a lock, you don't need to use a hardware CAS, and that also means you can just call PartialEq within the lock

RalfJ (Jun 18 2019 at 17:36, on Zulip):

might be just an artifact of how they set up their framework

gnzlbg (Jun 18 2019 at 17:36, on Zulip):

with hardware CAS (for small types), you don't have a choice, you need to do a byte equality, and also you need to compare packing

gnzlbg (Jun 18 2019 at 17:36, on Zulip):

right now, it is impossible to avoid passing undef to the hardware CAS, because you don't have a generic way to "write" to the padding bytes of a type

RalfJ (Jun 18 2019 at 17:37, on Zulip):

well a failing compare_exchange should also not do any writes

RalfJ (Jun 18 2019 at 17:37, on Zulip):

if it could, we could just have an "atomic" variant of freeze and call it a day

gnzlbg (Jun 18 2019 at 17:38, on Zulip):

it wouldn't do any writes

RalfJ (Jun 18 2019 at 17:38, on Zulip):

that's what I said earlier today about data race issues

RalfJ (Jun 18 2019 at 17:38, on Zulip):

ah so you want to use [u8; size_of::<T>()] to protect mvoes of AtomicCell and then make all write methods it offers ensure there is no undef

gnzlbg (Jun 18 2019 at 17:38, on Zulip):

When you construct an AtomicCell<T>, by calling AtomicCell::new(T), if T is small enough, you zero the AtomicCell storage, and only overwrite with the content of the input the bytes that are not packing via a ptr::write_fields

RalfJ (Jun 18 2019 at 17:39, on Zulip):

no need for a new operation then though. just use freeze on the value after you turned it into a byte array and before you stored it.

gnzlbg (Jun 18 2019 at 17:39, on Zulip):

the problem with freeze, is that you are not guaranteed to get the same indeterminate value every time

gnzlbg (Jun 18 2019 at 17:39, on Zulip):

IIUC

gnzlbg (Jun 18 2019 at 17:39, on Zulip):

while being able to write 0 to the padding bytes gives you that

RalfJ (Jun 18 2019 at 17:39, on Zulip):

that should no matter. that's why atomic_exchange runs a loop anways.

RalfJ (Jun 18 2019 at 17:40, on Zulip):

the issue with your proposal is figuring out where the padding bytes are^^ have fun doing that in a union...

gnzlbg (Jun 18 2019 at 17:40, on Zulip):

in AtomicCell::compare_exchange(T) you copy the input to the stack, zeroing padding, and then do the hardware CAS, and you are guaranteed that the padding bytes of the input and the value stored in the AtomicCell are the same

gnzlbg (Jun 18 2019 at 17:41, on Zulip):

the issue with your proposal is figuring out where the padding bytes are^^ have fun doing that in a union

unions have no padding AFAICT

gnzlbg (Jun 18 2019 at 17:42, on Zulip):

ah no, you are right, ofc they have

gnzlbg (Jun 18 2019 at 17:42, on Zulip):

but from the AtomicCell<Union> POV, you can't know which variant of the union is active, so you have to do a bytewise equality on the whole union

gnzlbg (Jun 18 2019 at 17:42, on Zulip):

i don't know how that can ever work properly

RalfJ (Jun 18 2019 at 17:42, on Zulip):

in AtomicCell::compare_exchange(T) you copy the input to the stack, zeroing padding, and then do the hardware CAS, and you are guaranteed that the padding bytes of the input and the value stored in the AtomicCell are the same

what the implementation already does is, if the first CAS failed but the eq function said that the value was equal, to now continue with the value that was read the first time

RalfJ (Jun 18 2019 at 17:43, on Zulip):

and we freeze only once in the beginning

RalfJ (Jun 18 2019 at 17:43, on Zulip):

so even with padding, in the 2nd iteration we are definitely looking at the same bits

RalfJ (Jun 18 2019 at 17:43, on Zulip):

it needs that loop anyway to handle the case where things with different bits are eq

RalfJ (Jun 18 2019 at 17:44, on Zulip):

so this scheme I am proposing works fine, I think, if we can ensure that the data in memory has no Undef, and assuming a (non-atomic) freeze operation

gnzlbg (Jun 18 2019 at 17:44, on Zulip):

IIUC the data in memory can only have Undef at padding bytes

RalfJ (Jun 18 2019 at 17:46, on Zulip):

yes isn't that what we talked about for quite a while now? /confused

gnzlbg (Jun 18 2019 at 17:47, on Zulip):

yes, but I don't know how you can ensure that the date in memory has no undef

RalfJ (Jun 18 2019 at 17:48, on Zulip):

your own scheme relies on that...?

gnzlbg (Jun 18 2019 at 17:48, on Zulip):

at least, without either a way to either freeze that data, or write to its padding

RalfJ (Jun 18 2019 at 17:48, on Zulip):

I am so confused now

gnzlbg (Jun 18 2019 at 17:48, on Zulip):

ah ok, I thought you were suggesting that there was another way

RalfJ (Jun 18 2019 at 17:48, on Zulip):

we described two schemes, yours requires a non-atomic "zero-all-padding" operation and mine does not. both assume freeze and both have to maintain the invariant that the data in memory is not Undef

RalfJ (Jun 18 2019 at 17:49, on Zulip):

the reason I am saying this is that this invariant that we both need cannot be upheld if AtomicCell has a get_mut operation

gnzlbg (Jun 18 2019 at 17:49, on Zulip):

ah, ok, so you are describing your scheme where you freeze the data before putting it into the AtomicCell

RalfJ (Jun 18 2019 at 17:49, on Zulip):

ah, ok, so you are describing your scheme where you freeze the data before putting it into the AtomicCell

I am comparing that to yours, yes. you said there'd be problems because the padding might be different each time you freeze and I explained why that is not a problem. I am not sure if you already accepted that explanation.

gnzlbg (Jun 18 2019 at 17:50, on Zulip):

yes, the second iteration of the CAS fixes that, if the CAS is specified to propagate the undef frozen bits properly

RalfJ (Jun 18 2019 at 17:50, on Zulip):

FWIW, that loop that I described already exists, it's at https://github.com/crossbeam-rs/crossbeam/blob/43967df8182e1f1ce7baebda36361c933caba39b/crossbeam-utils/src/atomic/atomic_cell.rs#L925

RalfJ (Jun 18 2019 at 17:51, on Zulip):

CAS would not see any undef... ah yes

RalfJ (Jun 18 2019 at 17:51, on Zulip):

frozen bits are just bits

RalfJ (Jun 18 2019 at 17:51, on Zulip):

but all of this is broken by AtomicCell::get_mut :/

gnzlbg (Jun 18 2019 at 17:51, on Zulip):

what's the signature of that ?

RalfJ (Jun 18 2019 at 17:51, on Zulip):

&mut AtomicCell<T> -> &mut T

RalfJ (Jun 18 2019 at 17:52, on Zulip):

which they had, and then they removed it when I called this out, and then they added it again because the concern is too theoretic

gnzlbg (Jun 18 2019 at 17:52, on Zulip):

ah so that allows users to write undef into AtomiCell<T>

RalfJ (Jun 18 2019 at 17:52, on Zulip):

(well that's my reading of why they added it again anyway, it was a somewhat heated argument because they claimed it was actually sound... but that's besides the point)

RalfJ (Jun 18 2019 at 17:52, on Zulip):

exactly

gnzlbg (Jun 18 2019 at 17:52, on Zulip):

instead of going through the AtomicCell::new(T) constructor, which would freeze

RalfJ (Jun 18 2019 at 17:52, on Zulip):

by overwriting the content, re-undef-ing the padding

gnzlbg (Jun 18 2019 at 17:53, on Zulip):

yeah so that has to go, i don't think that can ever work

RalfJ (Jun 18 2019 at 17:53, on Zulip):

that's the current status yes. except that that seems entirely silly, this should be possible.

gnzlbg (Jun 18 2019 at 17:53, on Zulip):

unless they add a trait bound to AtomicCell<T> to prevent types with padding, which would be an even larger breaking change

RalfJ (Jun 18 2019 at 17:53, on Zulip):

oh and C++ thinks it is possible, but that doesn't mean anything of course^^

gnzlbg (Jun 18 2019 at 17:54, on Zulip):

i mean, if you have undef in the T in the atomic cell, your usize will have undef bytes

gnzlbg (Jun 18 2019 at 17:54, on Zulip):

so the CAS intrinsic needs to work on undef bits

RalfJ (Jun 18 2019 at 17:54, on Zulip):

https://en.cppreference.com/w/cpp/atomic/atomic/compare_exchange is a delightful read (the "Notes" section in particular)

RalfJ (Jun 18 2019 at 17:54, on Zulip):

yes. that or we have an atomic variant of freeze that does "not count as a data race". whatever the heck that means.

gnzlbg (Jun 18 2019 at 17:55, on Zulip):

so i'd prefer to avoid specifying what undef == undef means in the "compare" part of CAS

RalfJ (Jun 18 2019 at 17:55, on Zulip):

well it can only be undef because we can't implement anything else^^

gnzlbg (Jun 18 2019 at 17:55, on Zulip):

sure, but they are doing bit equality

RalfJ (Jun 18 2019 at 17:56, on Zulip):

(or UB of course)

gnzlbg (Jun 18 2019 at 17:56, on Zulip):

so an atomic_freeze_and_cas intrinsic would also work, but..

gnzlbg (Jun 18 2019 at 17:58, on Zulip):

so that C++ section is clear

RalfJ (Jun 18 2019 at 17:59, on Zulip):

"clear?"^^

RalfJ (Jun 18 2019 at 18:00, on Zulip):

an operational semantics is clear

RalfJ (Jun 18 2019 at 18:00, on Zulip):

that's pretty far from one

gnzlbg (Jun 18 2019 at 18:00, on Zulip):

i mean, it says that the moment you have padding bits, things might not work, which is also what memcmp says

RalfJ (Jun 18 2019 at 18:01, on Zulip):

"might not work", lol

RalfJ (Jun 18 2019 at 18:01, on Zulip):

AFAIK memcmp is just UB on padding

gnzlbg (Jun 18 2019 at 18:01, on Zulip):

i think the same applies to CAS

RalfJ (Jun 18 2019 at 18:02, on Zulip):

well, in my reading of the standard it is (and CERT agrees https://en.cppreference.com/w/cpp/atomic/atomic/compare_exchange )

gnzlbg (Jun 18 2019 at 18:02, on Zulip):

at least in C++, from reading that, i would need to read the standard to be sure, but I don't think the standard specifies the behavior of equality of padding bytes

RalfJ (Jun 18 2019 at 18:02, on Zulip):

CAS dosn't sound like it'd be UB for all padding bytes, it seems to quitw specifically allow being used on structs with padding

RalfJ (Jun 18 2019 at 18:02, on Zulip):

and for unions it doesnt say UB either, it says the value might not converge

gnzlbg (Jun 18 2019 at 18:03, on Zulip):

letss see what the standard say, that tone is too loose to be standardese

gnzlbg (Jun 18 2019 at 18:03, on Zulip):

http://eel.is/c++draft/atomics.types.generic#atomics.types.operations-17

RalfJ (Jun 18 2019 at 18:04, on Zulip):

(magic, I tried to fix the topic)

RalfJ (Jun 18 2019 at 18:05, on Zulip):

that text sounds unimplementable. the value representation does not contain padding, right?

RalfJ (Jun 18 2019 at 18:05, on Zulip):

also http://eel.is/c++draft/atomics.types.generic#atomics.types.operations-20

RalfJ (Jun 18 2019 at 18:07, on Zulip):

and OMG http://eel.is/c++draft/atomics.types.generic#atomics.types.operations-23

gnzlbg (Jun 18 2019 at 18:07, on Zulip):

http://eel.is/c++draft/atomics.types.generic#atomics.types.operations-23

RalfJ (Jun 18 2019 at 18:07, on Zulip):

how are compilers supposed to implement this? a version of CAS that ignores padding?

gnzlbg (Jun 18 2019 at 18:08, on Zulip):

yeah, i was just going to say that

gnzlbg (Jun 18 2019 at 18:08, on Zulip):

no idea, I'll ask for clarification

RalfJ (Jun 18 2019 at 18:08, on Zulip):

of course for Rust it'd be even worse due to enums

gnzlbg (Jun 18 2019 at 18:08, on Zulip):

that basically says that padding is magically not compared by CAS

RalfJ (Jun 18 2019 at 18:08, on Zulip):

I agree

gnzlbg (Jun 18 2019 at 18:08, on Zulip):

i mean, if you have such a magic CAS, then that's obviously the thing to do, but... we don't have those :D

RalfJ (Jun 18 2019 at 18:08, on Zulip):

let's go ask Intel for a CAS with a bitmask :P

RalfJ (Jun 18 2019 at 18:09, on Zulip):

well the bitmask would be good enough for C but not Rust (and only because they hand-wave their way around union)

nagisa (Jun 18 2019 at 18:10, on Zulip):

@RalfJ by making it mutex-atomic.

RalfJ (Jun 18 2019 at 18:10, on Zulip):

well that would still require a way to compare skipping padding

RalfJ (Jun 18 2019 at 18:13, on Zulip):

@nagisa looks like a hardware CAS to me: https://godbolt.org/z/Zj6-Tb

RalfJ (Jun 18 2019 at 18:13, on Zulip):

but I also cant really read assembly^^

RalfJ (Jun 18 2019 at 18:14, on Zulip):

clang seems to call a __atomic_compare_exchange thing

gnzlbg (Jun 18 2019 at 18:20, on Zulip):

https://cpp.godbolt.org/z/pJX07d

gnzlbg (Jun 18 2019 at 18:21, on Zulip):

so GCC at least just compares padding and calls it a day, but from my understanding of the standard, it is not allowed to do that

gnzlbg (Jun 18 2019 at 18:29, on Zulip):

IIUC, the expectation is that C++'s atomic<T>::compare_exchange(T&, T, ..) makes sure that its T has the padding set to some fixed known value, and then on call, it copies its arguments to the stack, sets their padding to the same value, and then does the CAS

gnzlbg (Jun 18 2019 at 18:29, on Zulip):

atomic<T> does not let you take a reference to its internal T, so it doesn't have to worry about users breaking this invariant @RalfJ

gnzlbg (Jun 18 2019 at 18:41, on Zulip):

@RalfJ so that link above is from C++2a - in C++11, the result of compare_exchange is indeterminate if the type has padding bytes, and they essentially use two iterations like in Rust

gnzlbg (Jun 18 2019 at 18:58, on Zulip):

http://eel.is/c++draft/atomics.ref.generic#3

gnzlbg (Jun 18 2019 at 18:59, on Zulip):

std::atomic_ref also works around this, by saying that as long as a atomic_ref to a value is live, you can only access the value through an std::atomic_ref. This means that the atomic_ref can, on creation, write e.g. 0 to the padding of the value, and then the compare_exchange assumes that all padding is zero

centril (Jun 18 2019 at 21:32, on Zulip):

https://github.com/crossbeam-rs/crossbeam/pull/372 being accepted feels pretty alarming for such an important crate...

Vadim Petrochenkov (Jun 19 2019 at 07:35, on Zulip):

that basically says that padding is magically not compared by CAS

Wait a second, I'm pretty sure I've seen a defect report about that, and it should already be merged into the standard (at least into C++20).

Vadim Petrochenkov (Jun 19 2019 at 07:35, on Zulip):

Let me check.

Vadim Petrochenkov (Jun 19 2019 at 07:38, on Zulip):

Ok, apparently the change is there, but it does a different thing than I remembered.

Vadim Petrochenkov (Jun 19 2019 at 07:42, on Zulip):

The paper - http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0528r3.html.
(Previous revisions contain some more discussion, not just wording.)

RalfJ (Jun 19 2019 at 10:09, on Zulip):

@Vadim Petrochenkov do you have links to those previous revisions?

RalfJ (Jun 19 2019 at 10:11, on Zulip):

but that change sounds like exactly the thing that mandates that padding values be magically ignored

RalfJ (Jun 19 2019 at 10:11, on Zulip):

and I don't see how GCC is doing that

RalfJ (Jun 19 2019 at 10:11, on Zulip):

(and for clang I have no idea what __atomic_compare_exchange does)

RalfJ (Jun 19 2019 at 10:12, on Zulip):

std::atomic_ref also works around this, by saying that as long as a atomic_ref to a value is live, you can only access the value through an std::atomic_ref. This means that the atomic_ref can, on creation, write e.g. 0 to the padding of the value, and then the compare_exchange assumes that all padding is zero

that is interesting. they basically have the "opposite" of our model here -- start "non-atomic" and get a magic "atomic ref" if you also want atomic accesses. in Rust it's the other way around.

RalfJ (Jun 19 2019 at 10:13, on Zulip):

also this excludes things that are otherwise fine in the memory model, such as having a racing atomic and non-atomic access.

Vadim Petrochenkov (Jun 19 2019 at 10:21, on Zulip):

@RalfJ
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0528r2.html
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0528r1.html
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0528r0.html

RalfJ (Jun 19 2019 at 11:20, on Zulip):

thanks!

gnzlbg (Jun 20 2019 at 14:38, on Zulip):

@RalfJ

and I don't see how GCC is doing that
(and for clang I have no idea what __atomic_compare_exchange does)

They don't implement the fixed semantics yet.

gnzlbg (Jun 20 2019 at 14:38, on Zulip):

They can implement them by zeroing padding before doing the CAS.

gnzlbg (Jun 20 2019 at 14:39, on Zulip):

That is, when you do an atomic<T>::compare_exchange, the padding of all objects involved is always zeroed. While the CAS compares the all bits, the comparison for the padding then does not affect the result.

gnzlbg (Jun 20 2019 at 14:48, on Zulip):

This is achieved via the atomic<T> and atomic_ref<T> semantics.

atomic<T> is constructed from a T, and owns it - you can only modify its T via the API, there is no get_mut. That is, on construction, the padding is zeroed. When doing atomic<T>::compare_exchange(T&, T, ..) that function moves everything into its stack, zeros the padding, does the CAS, and then writes through the T& back into the user stack frame.

atomic_ref<T> is a T*, but the moment you construct it, you can only modify the T it points to via atomic_ref<T> objects (not necessarily the one you created, you can create more,). On construction, atomic_ref<T> does an atomic load of T, checks if the padding is zeroed, and if so, we are done. Otherwise, it needs to zero the padding, and do an atomic CAS loop, until it can zero the padding of the object it refers to. Once you have your atomic_ref<T> constructed, you know that it points to a T whose padding have been zeroed, and that you can only access it via atomic_ref<T> objects until the last one dies. So on all other operations, like for atomic<T>, you assume that the padding has been zeroed.

RalfJ (Jun 20 2019 at 14:50, on Zulip):

On construction, atomic_ref<T> does an atomic load of T, checks if the padding is zeroed, and if so, we are done. Otherwise, it needs to zero the padding, and do an atomic CAS loop, until it can zero the padding of the object it refers to.

This means that constructing an atomic_ref in parallel with doing a non-atomic read of the same location is UB?

gnzlbg (Jun 20 2019 at 14:51, on Zulip):

that sounds like a data-race, so probably yes

gnzlbg (Jun 20 2019 at 15:23, on Zulip):

also, depending on the platform, the constructor does not need a CAS loop, e.g., it can just use atomic writes to each padding byte of the pointee, how those overlap with, e.g., larger atomic loads, is platform-dependent

Last update: Nov 19 2019 at 18:10UTC