Stream: t-lang/wg-unsafe-code-guidelines

Topic: volatile, atomics and mmio


Matt Taylor (Jul 25 2019 at 20:27, on Zulip):

for embedded sometimes you need a non-tearing non-coalescing store, is there a way to do this in rust currently? volatile can be tearing afaik, and atomics can be coalesced so i’m not sure how to proceed

Matt Taylor (Jul 25 2019 at 20:28, on Zulip):

specifically for memory-mapped I/O, e.g if you’re programming the LAPIC on x86 you need such a store

RalfJ (Jul 25 2019 at 20:30, on Zulip):

no there is not -- the intended way is to use volatile, but we need a way to express when those can tear and when not

RalfJ (Jul 25 2019 at 20:32, on Zulip):

the two proposals I have seen for that are to either expose some kind of intrinsic or trait that lets you test if volatile accesses for type T can tear, or alternatively @gnzlbg proposed to deprecate read/write_volatile in favor of a whole bunch of non-tearing intrinsics of various sizes (similar to how we have atomic intrinsics of various sizes)

RalfJ (Jul 25 2019 at 20:32, on Zulip):

I think this has stalled mostly because nobody pushed for it

Matt Taylor (Jul 25 2019 at 20:33, on Zulip):

i see. seems like something that a lot of embedded people would actually need in theory, although in practice i guess volatile will be fine (if it’s aligned)

RalfJ (Jul 25 2019 at 20:34, on Zulip):

in practice "small" aligned volatile accesses (unaligned ones are not even possible on stable Rust) will not tear

RalfJ (Jul 25 2019 at 20:35, on Zulip):

but I dont know what "small" is. I expect usize to be "small".

Lokathor (Jul 25 2019 at 23:58, on Zulip):

if you do anything at all mmio you need to check your hardware manual to read the mmio section, so it's not a huge deal that you also have to check the instruction set to see what the allowed read and write sizes are. then you'll know which volatile actions are non-tearing

Matt Taylor (Jul 26 2019 at 12:13, on Zulip):

the problem, afaik, is more that you can tell the compiler to emit a 64-bit write and it just decides to emit two 32-bit writes for whatever reason, even if 64-bit is the native word size

Matt Taylor (Jul 26 2019 at 12:14, on Zulip):

although in practice i can’t think of a reason why llvm would do that

gnzlbg (Jul 26 2019 at 13:05, on Zulip):

@Matt Taylor I suppose that depends on how you tell the compiler

gnzlbg (Jul 26 2019 at 13:05, on Zulip):

if you tell it via a *ptr or ptr.write or similar generic methods that will work for all sizes, the compiler might do whatever, from emitting a single instruction, to two instructions, or even calling memcpy

gnzlbg (Jul 26 2019 at 13:06, on Zulip):

if you want the compiler to use a particular instruction to perform a write, AFAICT your only option right now is to use inline assembly

Matt Taylor (Jul 26 2019 at 13:06, on Zulip):

yeah, exactly

gnzlbg (Jul 26 2019 at 13:07, on Zulip):

you can do a bit better with atomics

gnzlbg (Jul 26 2019 at 13:07, on Zulip):

but if you want to do mmio you probably need volatile as well

gnzlbg (Jul 26 2019 at 13:07, on Zulip):

otherwise the write might be optimized in subtle ways

Matt Taylor (Jul 26 2019 at 13:08, on Zulip):

i think in c++ you could use something like volatile std::atomic<T> but since the volatile ops are functions, can’t be done in rust

Matt Taylor (Jul 26 2019 at 13:09, on Zulip):

although it might be worth mentioning that the proposal for the new volatile_load / volatile_store in c++ do guarantee non-tearing if it’s available

gnzlbg (Jul 26 2019 at 13:09, on Zulip):

Are there volatile-qualified overloads of the std::atomic methods in C++ ?

gnzlbg (Jul 26 2019 at 13:09, on Zulip):

we could add volatile_load/volatile_store methods to the atomic types in Rust

gnzlbg (Jul 26 2019 at 13:09, on Zulip):

but the "non-tearing if it's available" is not enough if you require non-tearing for correctness

Matt Taylor (Jul 26 2019 at 13:10, on Zulip):

you’d also need a way to test for it i guess

gnzlbg (Jul 26 2019 at 13:10, on Zulip):

or get a compiler error

Matt Taylor (Jul 26 2019 at 13:11, on Zulip):

how would you even tell llvm to emit a non-tearing volatile access

gnzlbg (Jul 26 2019 at 13:11, on Zulip):

right now, the ptr.write_volatile methods are non-tearing, if possible

gnzlbg (Jul 26 2019 at 13:11, on Zulip):

because llvm won't tear for fun

gnzlbg (Jul 26 2019 at 13:11, on Zulip):

@Matt Taylor you use a volatile+atomic load/store

Matt Taylor (Jul 26 2019 at 13:12, on Zulip):

ok i see

gnzlbg (Jul 26 2019 at 13:12, on Zulip):

atomic load/stores are non-tearing (they are "atomic"), and you just need to also prevent llvm from optimizing them in certain ways, which you can do by making them volatile

Matt Taylor (Jul 26 2019 at 13:12, on Zulip):

i mean the closest reason i can think of for why it might tear is if you have something like a mul which can spit out a 64-bit operand in two regs

gnzlbg (Jul 26 2019 at 13:12, on Zulip):

unaligned load/stores often tear

Matt Taylor (Jul 26 2019 at 13:12, on Zulip):

but as ralf said those are currently not allowed anyway

gnzlbg (Jul 26 2019 at 13:12, on Zulip):

e.g. if the memory goes across a cache line or page boundary

gnzlbg (Jul 26 2019 at 13:13, on Zulip):

yep I don't think we have any API for those

Matt Taylor (Jul 26 2019 at 13:13, on Zulip):

i mean does llvm actually provide any documented guarantees that it will not tear?

gnzlbg (Jul 26 2019 at 13:13, on Zulip):

for atomics? yes

Matt Taylor (Jul 26 2019 at 13:14, on Zulip):

for non-atomics

Matt Taylor (Jul 26 2019 at 13:14, on Zulip):

you said ptr.write_volatile will not tear ‘if possible’

gnzlbg (Jul 26 2019 at 13:14, on Zulip):

no

gnzlbg (Jul 26 2019 at 13:14, on Zulip):

"if possible" is not a guarantee

Matt Taylor (Jul 26 2019 at 13:15, on Zulip):

fair enough

gnzlbg (Jul 26 2019 at 13:15, on Zulip):

the difference is that atomics will fail to compile if the target doesn't support a particular size

gnzlbg (Jul 26 2019 at 13:15, on Zulip):

so LLVM knows that there is an instruction for doing the thing correctly

gnzlbg (Jul 26 2019 at 13:16, on Zulip):

I think we could add volatile_load/volatile_store methods to the atomic types, and that might work for you - if the target supports that atomic type that is

Matt Taylor (Jul 26 2019 at 13:16, on Zulip):

i think that would be a pretty good way to proceed

Matt Taylor (Jul 26 2019 at 13:39, on Zulip):

although since MMIO is usually done through hardcoded addresses, it could be a bit weird to have a pointer to an atomic type at that location, no?

gnzlbg (Jul 26 2019 at 14:05, on Zulip):

FWIW I never understood why we need AtomicXY types, instead of having ptr.atomic_load(...) methods.

gnzlbg (Jul 26 2019 at 14:06, on Zulip):

Sure an atomic type wrapper can be built on top of the pointer methods, but the opposite is not true I think.

If you want to do an atomic relaxed load of some memory address in rust today, AFAICT, there is no easy way for you to do that

gnzlbg (Jul 26 2019 at 14:06, on Zulip):

You would need to somehow construct an AtomicXY at that address

gnzlbg (Jul 26 2019 at 14:07, on Zulip):

Well not construct, but go from the address as a usize to a &AtomicXY that you can then operate on - doable, but weird

nagisa (Jul 26 2019 at 14:28, on Zulip):

*mut AtomicT is the opposite you’re looking for.

nagisa (Jul 26 2019 at 14:28, on Zulip):

AtomicT is guaranteed to be layout-compatible to T so you can just cast the pointer.

Matt Taylor (Jul 26 2019 at 14:48, on Zulip):

it’s just a bit trickier than having a method on the pointer itself. also i think you can still swap / replace the AtomicT non-atomically, which is a bit odd

Matt Taylor (Jul 26 2019 at 14:50, on Zulip):

also you will have to go around casting your types to integers (or bool) in order to write them out with this design

simulacrum (Jul 26 2019 at 14:52, on Zulip):

You can write an extension trait presumably that casts under the hood to the appropriate atomic and calls the appropriate function

Matt Taylor (Jul 26 2019 at 14:54, on Zulip):

oh and there might be another issue - the methods for these would probably require taking &self, but to my knowledge references must be dereferenceable at all times, and if you’re accessing some memory region that’s writable but not readable then it would be invalid to create a reference to this address

Matt Taylor (Jul 26 2019 at 14:55, on Zulip):

i can’t think of any instance where this is actually the case in practice (writable but not readable mem), but there probably are cases

simulacrum (Jul 26 2019 at 14:59, on Zulip):

I'm not sure why/how that would be a problem? I'm envisioning the methods on the trait taking self by-value so there's no references to speak of here

simulacrum (Jul 26 2019 at 14:59, on Zulip):

We might want AtomicT and such to be defined on *const T instead of &self, though

simulacrum (Jul 26 2019 at 14:59, on Zulip):

and/or *mut T

simulacrum (Jul 26 2019 at 15:00, on Zulip):

I think there might be an open issue about that

Matt Taylor (Jul 26 2019 at 15:00, on Zulip):

@simulacrum sorry, i was talking about fn volatile_load(&self, ...) method on the AtomicT itself

Matt Taylor (Jul 26 2019 at 15:00, on Zulip):

what gnzlbg proposed

simulacrum (Jul 26 2019 at 15:01, on Zulip):

Yeah, we'd probably want that to be fn volatile_load(*const Self, ...)

simulacrum (Jul 26 2019 at 15:01, on Zulip):

then there should be no problem

simulacrum (Jul 26 2019 at 15:01, on Zulip):

(though I guess we'd need that to be unsafe fn)

Matt Taylor (Jul 26 2019 at 15:01, on Zulip):

ok cool

Matt Taylor (Jul 26 2019 at 15:02, on Zulip):

does sound a bit tricky overall but i can’t think of a better way and it is niche anyway

Matt Taylor (Jul 26 2019 at 15:03, on Zulip):

i guess this would need an RFC? i’m down to draft something up, but OTOH if people want to make other changes to ptr::volatile_load and such as ralf implied, then it may be better to do it all at once

simulacrum (Jul 26 2019 at 15:06, on Zulip):

hm, not sure. I would lean towards yes since it deals with UCG stuff

Tom Phinney (Jul 26 2019 at 15:48, on Zulip):

i can’t think of any instance where this is actually the case in practice (writable but not readable mem), but there probably are cases

Some embedded microcontrollers have memory-mapped I/O where a read-after-write returns a different value for the read than what was just written (e.g., writing bits that trigger actions, returning instead bits that reflect current state).

Matt Taylor (Jul 26 2019 at 16:02, on Zulip):

i am thinking more of cases where an unnecessary read triggers an actual action, like crashing. i’m told that for the pci ide controller a read to some address indicates that you’ve handled an IRQ (presumably allowing it to send another one), which can be bad

Matt Taylor (Jul 26 2019 at 16:04, on Zulip):

and again, i see absolutely no reason why this would even happen in practice, but if we’re being pedantic about non-tearing accesses it seems reasonable to be pedantic about this too

Tom Phinney (Jul 26 2019 at 16:24, on Zulip):

for the pci ide controller a read to some address indicates that you’ve handled an IRQ (presumably allowing it to send another one)

I've encountered that situation in memory-mapped I/O. It tends to show up more frequently in 8/16-bit data, 16/24-bit usize microcontrollers, which Rust currently doesn't support (because the current minimum usize is 32). IMO this "feature" is often a hold-over from those earlier designs for small-bit-width word sizes. However, in cases like IRQ it may also be a timing optimization for performance-critical code that runs in a protected state (such as with interrupts disabled) whose duration needs to be minimized.

RalfJ (Jul 26 2019 at 16:29, on Zulip):

@gnzlbg

atomic load/stores are non-tearing (they are "atomic"), and you just need to also prevent llvm from optimizing them in certain ways, which you can do by making them volatile

atomic has two aspects, non-tearing and the Ordering thing. So I don't think you want "atomic volatile", you just want "non-tearing volatile".

Matt Taylor (Jul 26 2019 at 16:31, on Zulip):

what is the difference between a Relaxed volatile atomic access and a non-tearing volatile access? @RalfJ

RalfJ (Jul 26 2019 at 16:31, on Zulip):

@Matt Taylor also for your questions around volatile and MMIO, some further reading material:
- https://internals.rust-lang.org/t/volatile-and-sensitive-memory/3188/49?u=ralfjung
- https://github.com/rust-lang/unsafe-code-guidelines/issues/33
- https://github.com/rust-lang/unsafe-code-guidelines/issues/152

RalfJ (Jul 26 2019 at 16:32, on Zulip):

what is the difference between a Relaxed volatile atomic access and a non-tearing volatile access? RalfJ

the difference is in whether it is UB for such an access to be in a data race with other accesses

RalfJ (Jul 26 2019 at 16:32, on Zulip):

and there's also a difference in whether a happens-before relationship can be established when this access is combined the right way with a fence

RalfJ (Jul 26 2019 at 16:33, on Zulip):

I had a long discussion with @gnzlbg about that once, but I am not sure where... we should find some place to write down the conclusions of that discussion

gnzlbg (Jul 26 2019 at 16:34, on Zulip):

atomic has two aspects, non-tearing and the Ordering thing. So I don't think you want "atomic volatile", you just want "non-tearing volatile".

Wasn't one of the orderings for atomics in LLVM Unordered? (I'm not sure how that differs from Relaxed, since Relaxed is defined as "no ordering")

RalfJ (Jul 26 2019 at 16:34, on Zulip):

@Matt Taylor I am not sure how familiar you are with the C/C++ concurrency memory model(s)?

RalfJ (Jul 26 2019 at 16:35, on Zulip):

@gnzlbg yeah LLVM has an "even weaker than Relaxed" thing. IIRC it does not even guarantee coherence (i.e. it can first see a new write and then an old write). no idea how that relates to volatile or "normal" accesses though.

RalfJ (Jul 26 2019 at 16:36, on Zulip):

@Matt Taylor ah here is the public part of that discussion: https://internals.rust-lang.org/t/add-volatile-operations-to-core-x86-64/10480

RalfJ (Jul 26 2019 at 16:36, on Zulip):

but there's also a long private chat thread between @gnzlbg and me

gnzlbg (Jul 26 2019 at 16:39, on Zulip):

I don't remember what the value was in doing something less than relaxed

RalfJ (Jul 26 2019 at 16:39, on Zulip):

relaxed->fence->fence->relaxed can imply happens-before

gnzlbg (Jul 26 2019 at 16:39, on Zulip):

Sure, using relaxed allows for some synchronization if the proper fences and other atomic operations are used

RalfJ (Jul 26 2019 at 16:39, on Zulip):

which I dont think we want for volatile

gnzlbg (Jul 26 2019 at 16:40, on Zulip):

i mean, it can, if one uses the fences

gnzlbg (Jul 26 2019 at 16:40, on Zulip):

but then one wanted synchronization of some sort I guess, otherwise why use the fences

RalfJ (Jul 26 2019 at 16:40, on Zulip):

yes. but the key thing is if you replace Relaxed by a "normal" access, it does not imply anything, even with the fences

gnzlbg (Jul 26 2019 at 16:41, on Zulip):

yes that's right, a volatile atomic relaxed is not the same as a normal volatile load / store because of the fence interaction

RalfJ (Jul 26 2019 at 16:41, on Zulip):

so without a very careful study I'd argue it should also not imply anything for volatile accesses

RalfJ (Jul 26 2019 at 16:41, on Zulip):

maybe Unordered is enough for that, not sure

gnzlbg (Jul 26 2019 at 16:42, on Zulip):

https://llvm.org/docs/Atomics.html#unordered

gnzlbg (Jul 26 2019 at 16:44, on Zulip):

"These operations are required to be atomic in the sense that if you use unordered loads and unordered stores, a load cannot see a value which was never stored. "

RalfJ (Jul 26 2019 at 16:44, on Zulip):

"This cannot be used for synchronization"

RalfJ (Jul 26 2019 at 16:44, on Zulip):

seems reasonable

gnzlbg (Jul 26 2019 at 16:44, on Zulip):

"can be expensive or unavailable for wider loads"

gnzlbg (Jul 26 2019 at 16:45, on Zulip):

It appears that it fails to compile if there isn't a native instruction that performs the operation atomically.

RalfJ (Jul 26 2019 at 16:45, on Zulip):

probably volatile already acts like unordered, except for the tearing aspect

gnzlbg (Jul 26 2019 at 16:45, on Zulip):

"an unordered load or store cannot be split into multiple instructions "

RalfJ (Jul 26 2019 at 16:45, on Zulip):

but based on the feedback we got so far, it seems unlikely LLVM will want to guarantee that to us :/

gnzlbg (Jul 26 2019 at 16:45, on Zulip):

FWIW on IRC they did recommend looking into atomic unordered instead if we wanted guaranteed no tearing

RalfJ (Jul 26 2019 at 16:46, on Zulip):

we want no tearing and volatile though

gnzlbg (Jul 26 2019 at 16:46, on Zulip):

but then these operations can be marked with volatile

RalfJ (Jul 26 2019 at 16:46, on Zulip):

unordered can be dead-read-eliminated, for example

RalfJ (Jul 26 2019 at 16:46, on Zulip):

LLVM lets you combine atomic orderings with volatile?

gnzlbg (Jul 26 2019 at 16:46, on Zulip):

yes it would be volatile+unordered

gnzlbg (Jul 26 2019 at 16:46, on Zulip):

yes it does, for all atomic orderings!

RalfJ (Jul 26 2019 at 16:46, on Zulip):

oh wow

RalfJ (Jul 26 2019 at 16:47, on Zulip):

whatever the heck that means in terms of semantics^^

gnzlbg (Jul 26 2019 at 16:47, on Zulip):

so that's what I meant that volatile+relaxed might have done the trick

gnzlbg (Jul 26 2019 at 16:47, on Zulip):

even though it might be a bit footgunny if it can imply a happens-before

RalfJ (Jul 26 2019 at 16:47, on Zulip):

I think if you s/Relaxed/Unordered/, I can live with that

gnzlbg (Jul 26 2019 at 16:48, on Zulip):

we could add unstable volatile_unordered_load/store methods to the Atomic types and see if those solve the problems that people have

gnzlbg (Jul 26 2019 at 16:48, on Zulip):

most are already doing what works for them, whatever that might be

Matt Taylor (Jul 26 2019 at 16:49, on Zulip):

wow, this is a much bigger mess than i originally thought ^_^ thanks for the links, reading now

gnzlbg (Jul 26 2019 at 16:50, on Zulip):

we could also add generic volatile_load/store methods to the atomic types that take an Ordering :laughter_tears:
I really have no idea why one would want to use volatile with the other orderings

RalfJ (Jul 26 2019 at 16:55, on Zulip):

wow, this is a much bigger mess than i originally thought ^_^ thanks for the links, reading now

why is that the reaction I get almost every time I answer a question? :rofl:

RalfJ (Jul 26 2019 at 16:56, on Zulip):

@gnzlbg or we could just make our existing intrinsics be volatile unordered

RalfJ (Jul 26 2019 at 16:56, on Zulip):

then we can entirely side-step the question of "what are the semantics of non-atomic volatile accesses" :D

gnzlbg (Jul 26 2019 at 16:56, on Zulip):

I think that wouldn't work, it would fail to compile if the load can tear

RalfJ (Jul 26 2019 at 16:56, on Zulip):

oh dang

RalfJ (Jul 26 2019 at 16:57, on Zulip):

I'd like a version of Unordered that permits tearing but still does not have data race problems...

gnzlbg (Jul 26 2019 at 16:57, on Zulip):

I think the tearing loads are not atomic

gnzlbg (Jul 26 2019 at 16:58, on Zulip):

you can probably implement them on top of the atomic unordered ones

RalfJ (Jul 26 2019 at 16:58, on Zulip):

the ordering annotations make sense even with tearing

RalfJ (Jul 26 2019 at 16:58, on Zulip):

they would say that all the individual "tears" would have that ordering

gnzlbg (Jul 26 2019 at 16:58, on Zulip):

the ordering annotations explicitly say that one cannot observe values that were never written, I think

gnzlbg (Jul 26 2019 at 16:58, on Zulip):

I don't think there is an ordering that doesn't say that

RalfJ (Jul 26 2019 at 16:59, on Zulip):

this would still apply "per-fragment" for the torn ones

RalfJ (Jul 26 2019 at 16:59, on Zulip):

in fact we could just say it applies per byte

gnzlbg (Jul 26 2019 at 16:59, on Zulip):

sure, but I don't know what LLVM would win from exposing them as intrinsics

RalfJ (Jul 26 2019 at 16:59, on Zulip):

coalescing adjacent atomic accesses is sound, AFAIK

RalfJ (Jul 26 2019 at 17:00, on Zulip):

basically I want to not ever have to think about non-atomic volatile accesses again

RalfJ (Jul 26 2019 at 17:00, on Zulip):

the question "how do volatile and atomic interact" comes up all the time

RalfJ (Jul 26 2019 at 17:00, on Zulip):

if the answer is "like unordered but bytewise", that would be a big step IMO

gnzlbg (Jul 26 2019 at 17:00, on Zulip):

I think we should just deprecate the volatile reads/writes if the volatile atomic unordered intrinsics solve the problem

gnzlbg (Jul 26 2019 at 17:00, on Zulip):

people that want tearing should just call those in a loop or something

gnzlbg (Jul 26 2019 at 17:01, on Zulip):

I don't know what problem tearing volatile load / stores solve

RalfJ (Jul 26 2019 at 17:01, on Zulip):

(a) deprecation does not absolve us from defining their semantics, (b) we probably should still provide arbitrarily-sized volatile accesses "because C does"

gnzlbg (Jul 26 2019 at 17:01, on Zulip):

we can just wrap them to loop over the memory doing 1-byte wide volatile atomic unordered operations

gnzlbg (Jul 26 2019 at 17:02, on Zulip):

maybe something cleverer than that, like using the largest possible size or something

RalfJ (Jul 26 2019 at 17:02, on Zulip):

very inefficient. but probably good enough for something deprecated.

gnzlbg (Jul 26 2019 at 17:03, on Zulip):

i mean, huge volatile load / stores generate horrible code already today

RalfJ (Jul 26 2019 at 17:03, on Zulip):

yeah

RalfJ (Jul 26 2019 at 17:03, on Zulip):

sounds like a solid plan to me

gnzlbg (Jul 26 2019 at 17:03, on Zulip):

i think there was an example of using one on an array of 4096 bytes and getting horrible assembly

Matt Taylor (Jul 26 2019 at 17:04, on Zulip):

you would probably have to avoid doing it in 1-byte chunks, not only for performance but also because many people seem to erroneously rely on them being non-tearing for small types anyway

RalfJ (Jul 26 2019 at 17:05, on Zulip):

the user-visible API for non-tearing volatile accesses is still open I think?

RalfJ (Jul 26 2019 at 17:05, on Zulip):

also @gnzlbg https://github.com/rust-lang/rfcs/pull/2728 has been proposed, so the longer we wait the more messy volatile stuff already exists...

RalfJ (Jul 26 2019 at 17:06, on Zulip):

you would probably have to avoid doing it in 1-byte chunks, not only for performance but also because many people seem to erroneously rely on them being non-tearing for small types anyway

shouldn't be too hard to do something like

match size_of::<T>() {
  1 => ...
  2 => ...
  4 => ...
  8 => ...
  n => // fallback, loop with largest power of 2 that is a factor of n
  ...
}

RalfJ (Jul 26 2019 at 17:08, on Zulip):

in fact if we "loop with largest power of 2 <= size_of::<usize>() that is a factor of n", we'd get the expected behavior for sizes 2, 4, and on 64bit also size 8

gnzlbg (Jul 26 2019 at 17:26, on Zulip):

This would obviously be RFC material. I don't know who should give this idea some thought before we do that

Matt Taylor (Jul 26 2019 at 17:42, on Zulip):

even if it’s only exposed via arch-dependent intrinsics, i believe it’s still worth considering volatile ops other than just load and store. for example on x86 i believe you need to use an OR when updating certain bits of page tables, and this strictly speaking should be done through volatile

Matt Taylor (Jul 26 2019 at 17:42, on Zulip):

i’m not even sure if there are ways of doing this in rust currently

RalfJ (Jul 26 2019 at 18:25, on Zulip):

@Matt Taylor but what is the problem with doing read, OR, store?

RalfJ (Jul 26 2019 at 18:25, on Zulip):

is it because the CPU itself might concurrently modify the page table when the program does something?

Matt Taylor (Jul 26 2019 at 18:25, on Zulip):

in this case it’s that the cpu can set a flag after you read

Matt Taylor (Jul 26 2019 at 18:25, on Zulip):

yeah

RalfJ (Jul 26 2019 at 18:25, on Zulip):

ah makes sense

RalfJ (Jul 26 2019 at 18:26, on Zulip):

what do people do in C? I dont think it has volatile RMW operations

RalfJ (Jul 26 2019 at 18:26, on Zulip):

(RMW = read-modify-write, i.e., all these atomic_and, atomic_add, atomic_or, compare_exchange, ...)

Matt Taylor (Jul 26 2019 at 18:26, on Zulip):

it’s probably broken in C too lol

Matt Taylor (Jul 26 2019 at 18:27, on Zulip):

that’s a good question though

Matt Taylor (Jul 26 2019 at 18:28, on Zulip):

i mean *some_volatile_var |= foo is probably as close as it gets, but i guess at some point you just go for inline assembly

Matt Taylor (Jul 26 2019 at 18:28, on Zulip):

i’ll try and find out what people are doing for this instance

RalfJ (Jul 26 2019 at 18:32, on Zulip):

@gnzlbg maybe instead of duplicating the entire API for volatile (with RMWs etc), we should have Ordering::Volatile?

Matt Taylor (Jul 26 2019 at 18:36, on Zulip):

the problem is that the current API goes through &self

Matt Taylor (Jul 26 2019 at 18:36, on Zulip):

and since references are dereferenceable, it’s not going to work for MMIO i think?

RalfJ (Jul 26 2019 at 18:43, on Zulip):

ah

RalfJ (Jul 26 2019 at 18:43, on Zulip):

but there are problems for that even with atomic usage

RalfJ (Jul 26 2019 at 18:43, on Zulip):

see https://github.com/rust-lang/rust/issues/55005

RalfJ (Jul 26 2019 at 18:43, on Zulip):

(you didnt think we were already done with the messy parts, did you?^^)

Matt Taylor (Jul 26 2019 at 18:45, on Zulip):

oh dear

Matt Taylor (Jul 26 2019 at 18:45, on Zulip):

i think i need to have a lie down

RalfJ (Jul 26 2019 at 20:53, on Zulip):

so @Matt Taylor one proposal here is to make UnsafeCell not dereferencable. That would help with some things. An interesting open question is what to do about cases like &(i32, Cell<i32>); seems like at best we could mark the first 4 bytes dereferencable then. Though it would be nice if e.g. Cell could opt-back-in to dereferencable.
But also see https://github.com/rust-lang/unsafe-code-guidelines/issues/88.

Tom Phinney (Jul 26 2019 at 21:12, on Zulip):

TANSTAAFL, or you didn't think you'd solve the problem that easily, did you? :rofl:

RalfJ (Jul 26 2019 at 21:28, on Zulip):

well I am just at the beginning of my research career, what would I do all day if there wouldn't be a large pool and a steady supply of open problems? ;)

Tom Phinney (Jul 26 2019 at 22:30, on Zulip):

What would you do? I'm reminded of an old joke about a mathematician. I couldn't find that precise one online, but https://www.reddit.com/r/Jokes/comments/1kid90/the_mathematicians_interview/ comes close. :grinning:

RalfJ (Jul 26 2019 at 23:00, on Zulip):

so I guess it's good for y'all that I don't have to create new problems to solve then :D

Matt Taylor (Jul 26 2019 at 23:20, on Zulip):

This would help with some things

what things does it not solve here? seems like with a non-dereferenceable UnsafeCell and Ordering::Volatile that uses LLVM Unordered underneath, we’re set (in terms of soundness)

Matt Taylor (Jul 26 2019 at 23:24, on Zulip):

what is it that i’ve neglected this time :p

RalfJ (Jul 27 2019 at 07:23, on Zulip):

@Matt Taylor I think for MMIO that would be all, I was thinking of other problems around dereferencable, like the Arc one

Matt Taylor (Jul 27 2019 at 07:24, on Zulip):

right

Matt Taylor (Jul 27 2019 at 07:25, on Zulip):

another potential problem with this route is that it blocks people from using different Orderings than the one we choose for Volatile (Unordered)

RalfJ (Jul 27 2019 at 07:37, on Zulip):

see @gnzlbg's question in the thread: what is the use-case for an access that is both synchronizing and volatile?

gnzlbg (Jul 27 2019 at 11:39, on Zulip):

volatile is orthogonal to atomics: it says that the reads and writes to memory must happen exactly as written in the code, e.g., if you read the same memory twice, those two reads must happen, if you read from memory once, only one read can happen, etc.

gnzlbg (Jul 27 2019 at 11:39, on Zulip):

the compiler must assume that the reads and writes have side-effects like unknown function calls, etc.

gnzlbg (Jul 27 2019 at 11:40, on Zulip):

That pretty much inhibits all optimizations, so why would one want, on top of that, to combine the volatile reads/writes with atomics?

gnzlbg (Jul 27 2019 at 11:40, on Zulip):

No idea. If you have volatile and atomic operations, the compiler might be able to re-order the atomic ones across the volatile ones in certain ways. By making the atomic ones volatile, you inhibit that.

gnzlbg (Jul 27 2019 at 11:44, on Zulip):

Somebody on LLVM IRC mentioned that "volatile synchronizes with volatile" because you can't reorder volatile operations across each other

RalfJ (Jul 28 2019 at 08:44, on Zulip):

volatile is orthogonal to atomics: it says that the reads and writes to memory must happen exactly as written in the code, e.g., if you read the same memory twice, those two reads must happen, if you read from memory once, only one read can happen, etc.

agreed. slightly more formally speaking: it makes reads and write observable events, and thus part of the program behavior that must be preserved.

RalfJ (Jul 28 2019 at 08:44, on Zulip):

Somebody on LLVM IRC mentioned that "volatile synchronizes with volatile" because you can't reorder volatile operations across each other

I think that's very wrong. "synchronizes with" also means that other instructions cannot be reordered around this. that is explicitly not true for volatile.

RalfJ (Jul 28 2019 at 08:46, on Zulip):

IOW, the following code is wrong:

static int DATA = 0;
volatile int FLAG = 0;

thread A:

DATA = 42;
FLAG = 1;

thread B:

while (FLAG == 0) {}
printf("%d", DATA);

RalfJ (Jul 28 2019 at 08:46, on Zulip):

if volatile synced-with volatile, then that code would be fine, but in fact the compiler is allowed to reorder the two writes in thread A

Matt Taylor (Jul 28 2019 at 08:54, on Zulip):

isn’t their point that you can’t reorder a volatile around other volatiles? in your case, there is only one volatile, but if you make data volatile then it would be forced to not reorder the writes

RalfJ (Jul 28 2019 at 09:01, on Zulip):

yes, they are saying "X and X implies Y, hence Y" where X = "volatiles cannot be reodered with each other" and Y = "volatiles sync with each other"

RalfJ (Jul 28 2019 at 09:01, on Zulip):

I agree with "X" but I do not agree with "X implies Y", and hence I do not agree with "Y"

RalfJ (Jul 28 2019 at 09:02, on Zulip):

"sync with each other" has a very precise meaning in concurrent memory models. it's what release/acquire accesses (amongst other things) do

RalfJ (Jul 28 2019 at 09:02, on Zulip):

if you replace the volatile load/store in my example by a release/acquire load/store, it becomes correct

comex (Jul 28 2019 at 09:03, on Zulip):

yep, it's definitely wrong that volatile synchronizes-with volatile in that sense

RalfJ (Jul 28 2019 at 09:03, on Zulip):

it becomes correct because you cannot move the store to DATA down below the release-store to FLAG

RalfJ (Jul 28 2019 at 09:03, on Zulip):

that is what it means for accesses to synchronize

Matt Taylor (Jul 28 2019 at 09:03, on Zulip):

i think they were just using it too loosely. either way, i think we’re in agreement here

RalfJ (Jul 28 2019 at 09:04, on Zulip):

@Matt Taylor I agree that we two are in agreement. they might "just" have said it too loosely, but the business of specifying a language is not one where you can permit yourself to be loose about your terminology.

Matt Taylor (Jul 28 2019 at 09:04, on Zulip):

yeah, absolutely

RalfJ (Jul 28 2019 at 09:05, on Zulip):

so, I am not being nitpicky to annoy people, and I dont think they are being a bad person by being loose about terminology, I just try to make sure we all use the same terms to mean the same things, in particular terms that actually do have a precise meaning :)

Matt Taylor (Jul 28 2019 at 09:07, on Zulip):

I should probably read up a bit more on C’s concurrent memory model

RalfJ (Jul 28 2019 at 09:22, on Zulip):

are you sure you want that? hint: if you thought volatile was messy...

RalfJ (Jul 28 2019 at 09:23, on Zulip):

the good news is that the concurrency mess is different :P

RalfJ (Jul 28 2019 at 09:23, on Zulip):

the problem is not that we have no formal model, the problem is we have a dozen of them that all have different flaws

Last update: Nov 20 2019 at 11:30UTC