Stream: t-lang/wg-unsafe-code-guidelines

Topic: ffi safety


gnzlbg (Nov 05 2018 at 13:47, on Zulip):

C FFI only makes sense if there is a platform to interface with, and multiple discussions about layout are "tangentially" imposing requirements on this platform. I think it would make sense to open an issue about what's the platform (how do we define it, do we talk about valid and invalid platforms), its role in Rust (a platform is optional, but if there is one, it must be valid, its role with C FFI, extern "C", repr(C), ...), and the specification of valid platforms.

gnzlbg (Nov 11 2018 at 13:31, on Zulip):

I've sent a PR documenting the bare minimum requirements on the C implementation of the target platform: https://github.com/rust-rfcs/unsafe-code-guidelines/pull/46

gnzlbg (Nov 11 2018 at 13:33, on Zulip):

There are still some critical unresolved questions open, but I think it is ok to gather consensus on the least controversial issues first, and if required, open new issues to discuss each of the unresolved questions on its own.

Gankro (Nov 11 2018 at 14:23, on Zulip):

@gnzlbg to what end are you avoiding defining the values of bool?

gnzlbg (Nov 11 2018 at 14:29, on Zulip):

oops, to no end, that was an oversight

Gankro (Nov 11 2018 at 14:34, on Zulip):

I also haven't actually seen any resistance to the notion that, if floats are present, they surely must be IEEE-754 binary

Gankro (Nov 11 2018 at 14:35, on Zulip):

but i'm fuzzy on exactly what we want our story here to be

gnzlbg (Nov 11 2018 at 15:17, on Zulip):

I think @rkruppe had some thoughts about what exactly can we guarantee there

rkruppe (Nov 11 2018 at 15:19, on Zulip):

I do? Well I did curse myself with floating point and C standard knowledge but I don't recall commenting on this specifically

Gankro (Nov 11 2018 at 15:21, on Zulip):

haha

Gankro (Nov 11 2018 at 15:22, on Zulip):

I think the most relevant parties for optional floats are kernel-mode devs who want to ensure floating point registers aren't touched

rkruppe (Nov 11 2018 at 15:23, on Zulip):

Disabling the float types is neither necessary (can just use soft floats) nor sufficient (if you don't use floats, llvm can and often will still use those registers) for that

rkruppe (Nov 11 2018 at 15:25, on Zulip):

and even if we add a "no floats" mode for whatever reason, I think it can just be an implementation-defined option which makes it questionable whether we want to include it in "what Rust requires from the C platform"

Gankro (Nov 11 2018 at 15:47, on Zulip):

If we were to have a mode that makes f32/f64 software-defined, that does have interesting ABI implications (are they now passed as ints? what does this mean for C FFI)

Gankro (Nov 11 2018 at 15:47, on Zulip):

But yes I was assuming some kind of hard no-floats mode that tells llvm to stay away

Nicole Mazzuca (Nov 11 2018 at 18:35, on Zulip):

@gnzlbg _Bool, not Bool_.

gnzlbg (Nov 11 2018 at 18:36, on Zulip):

thanks, fixed

Nicole Mazzuca (Nov 11 2018 at 18:36, on Zulip):

also, "have, at least, one pointer value that is never dereferenceable" is too strong.

Nicole Mazzuca (Nov 11 2018 at 18:36, on Zulip):

in theory, you could dereference a value which is equal to null, it's just that rust references will not represent it.

gnzlbg (Nov 11 2018 at 18:37, on Zulip):

@Nicole Mazzuca Option<&T> is a pointer in FFI

gnzlbg (Nov 11 2018 at 18:37, on Zulip):

in particular, a C FFI ABI function can return an Option<&T>, which has to be None if the pointer is not valid

gnzlbg (Nov 11 2018 at 18:38, on Zulip):

C FFI is unsound, so I guess it doesn't really matter whether the Option is None when the C pointer is valid..

Nicole Mazzuca (Nov 11 2018 at 18:38, on Zulip):

correct

Nicole Mazzuca (Nov 11 2018 at 18:38, on Zulip):

C requires a "null pointer value"

gnzlbg (Nov 11 2018 at 18:39, on Zulip):

a null pointer constant

Nicole Mazzuca (Nov 11 2018 at 18:39, on Zulip):

no, null pointer value.

gnzlbg (Nov 11 2018 at 18:39, on Zulip):

there might not be a run-time way to detect a null pointer

gnzlbg (Nov 11 2018 at 18:39, on Zulip):

ah well, that doesn't make sense

Nicole Mazzuca (Nov 11 2018 at 18:39, on Zulip):

void* x = 0; <- x has a null pointer value.

Nicole Mazzuca (Nov 11 2018 at 18:40, on Zulip):

if you initialize an object of pointer type with either 1) a null pointer value, or 2) a null pointer constant, it has a null pointer value

gnzlbg (Nov 11 2018 at 18:40, on Zulip):

void* x = /* null pointer constant: */ (void*)0;
assert(x == NULL); // can fail

Nicole Mazzuca (Nov 11 2018 at 18:41, on Zulip):

nope

gnzlbg (Nov 11 2018 at 18:41, on Zulip):

wait, i messed that up

gnzlbg (Nov 11 2018 at 18:42, on Zulip):

void* x = /* null pointer constant: */ (void*)0;
int y = 0;
assert(x == (void*)y); // can fail

Nicole Mazzuca (Nov 11 2018 at 18:42, on Zulip):

yes

Nicole Mazzuca (Nov 11 2018 at 18:42, on Zulip):

but y is not a null pointer constant

gnzlbg (Nov 11 2018 at 18:42, on Zulip):

no, i see what you mean now

Nicole Mazzuca (Nov 11 2018 at 18:42, on Zulip):

and therefore, (void*)y may have a non-null-pointer-value

gnzlbg (Nov 11 2018 at 18:42, on Zulip):

so C already requires a null pointer value

Nicole Mazzuca (Nov 11 2018 at 18:42, on Zulip):

yeah

gnzlbg (Nov 11 2018 at 18:43, on Zulip):

so we don't have to say anything then

Nicole Mazzuca (Nov 11 2018 at 18:43, on Zulip):

it may be implementation defined to be dereferenceable

Nicole Mazzuca (Nov 11 2018 at 18:43, on Zulip):

but that's just turning UB into defined behavior

Nicole Mazzuca (Nov 11 2018 at 18:43, on Zulip):

which is totally allowable

Gankro (Nov 11 2018 at 18:43, on Zulip):

I think the real question is whether we require NULL to be 0, and if not, whether we ensure that Option<&T>::None == c_null

Nicole Mazzuca (Nov 11 2018 at 18:43, on Zulip):

yeah

gnzlbg (Nov 11 2018 at 18:43, on Zulip):

@Gankro the question is whether the case above, cannot fail

Nicole Mazzuca (Nov 11 2018 at 18:44, on Zulip):

I think it's reasonable to say "rustc only supports platforms with (intptr_t)(void*)NULL = 0"

Nicole Mazzuca (Nov 11 2018 at 18:44, on Zulip):

but leave open other implementations that may want (intptr_t)(void*)NULL ≠ 0

Nicole Mazzuca (Nov 11 2018 at 18:44, on Zulip):

I just don't think it's a usecase that anybody cares about

Gankro (Nov 11 2018 at 18:44, on Zulip):

That's a very uh, subtle, spec

gnzlbg (Nov 11 2018 at 18:45, on Zulip):

The issue was, what if a C platform has (intptr_t)(void*)NULL = 0xfffff ?

gnzlbg (Nov 11 2018 at 18:45, on Zulip):

Should we try or try not to support those?

Gankro (Nov 11 2018 at 18:45, on Zulip):

ubsan is proposing not supporting them in rustc but allowing other implementations to support it

Nicole Mazzuca (Nov 11 2018 at 18:45, on Zulip):

because LLVM's backend doesn't support them, specifically

Nicole Mazzuca (Nov 11 2018 at 18:45, on Zulip):

and also because rustc isn't on any platforms where it's necessary

Gankro (Nov 11 2018 at 18:46, on Zulip):

oh that's pretty compelling, if llvm hasn't needed to support it

gnzlbg (Nov 11 2018 at 18:46, on Zulip):

we just said that Option<&T> on FFI is unsound, no matter what we do,

Gankro (Nov 11 2018 at 18:46, on Zulip):

??

gnzlbg (Nov 11 2018 at 18:46, on Zulip):

FFI is intrinsically unsound

Nicole Mazzuca (Nov 11 2018 at 18:46, on Zulip):

yeah, I don't think clang nor gcc support platforms where the null pointer value is not all zeroes

Nicole Mazzuca (Nov 11 2018 at 18:47, on Zulip):

and I know MSVC doesn't

Nicole Mazzuca (Nov 11 2018 at 18:47, on Zulip):

@gnzlbg that's like saying "transmute is inherently unsound"

Nicole Mazzuca (Nov 11 2018 at 18:47, on Zulip):

I mean, yes, it is

Nicole Mazzuca (Nov 11 2018 at 18:47, on Zulip):

but only given no context

gnzlbg (Nov 11 2018 at 18:47, on Zulip):

not helpful i know

gnzlbg (Nov 11 2018 at 18:47, on Zulip):

the question is, how do we word this ?

Gankro (Nov 11 2018 at 18:48, on Zulip):

I feel like we already have unsafe for this notion..?

gnzlbg (Nov 11 2018 at 18:48, on Zulip):

I can just delete the requirement

Nicole Mazzuca (Nov 11 2018 at 18:48, on Zulip):

yeah, that's how I'd do it

Nicole Mazzuca (Nov 11 2018 at 18:48, on Zulip):

and just define Option<&T>::None to be bit-pattern compatible with std::ptr::null()
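A minimal sketch of what this proposal looks like on today's rustc targets, where the null pointer is all zero bits. The helper names `option_ref_to_raw` and `raw_null_as_option` are made up for illustration; the only claim is that the bit patterns line up.

```rust
use std::mem::{size_of, transmute};

/// Reinterprets an `Option<&u32>` as a raw pointer without changing its bits.
fn option_ref_to_raw(r: Option<&u32>) -> *const u32 {
    // Both types have the same size, so this transmute compiles.
    unsafe { transmute::<Option<&u32>, *const u32>(r) }
}

/// Reinterprets the null raw pointer as an `Option<&u32>`.
/// Only the null pointer is converted here; an arbitrary raw pointer
/// would not necessarily be a valid reference.
fn raw_null_as_option() -> Option<&'static u32> {
    unsafe { transmute::<*const u32, Option<&'static u32>>(std::ptr::null()) }
}

fn main() {
    assert_eq!(size_of::<Option<&u32>>(), size_of::<*const u32>());

    let x = 7u32;
    // `None` has the same bits as `ptr::null()` ...
    assert!(option_ref_to_raw(None).is_null());
    // ... and `Some(&x)` has the same bits as the address of `x`.
    assert_eq!(option_ref_to_raw(Some(&x)), &x as *const u32);
    // The other direction: the null pointer reads back as `None`.
    assert!(raw_null_as_option().is_none());
}
```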

gnzlbg (Nov 11 2018 at 18:49, on Zulip):

Option<&T> in rust uses 0x0 to denote a null pointer, whether that's guaranteed to always work or not has nothing to do with the platform so it doesn't belong there

Gankro (Nov 11 2018 at 18:49, on Zulip):

huh?

Gankro (Nov 11 2018 at 18:49, on Zulip):

that seems incredibly related to platform

gnzlbg (Nov 11 2018 at 18:49, on Zulip):

Exactly, last time i told @Gankro that we should make ptr::is_null() the only way to test for null-pointerness

Gankro (Nov 11 2018 at 18:50, on Zulip):

Being able to write portable and correct FFI code is a key part of defining the platform

gnzlbg (Nov 11 2018 at 18:50, on Zulip):

@Gankro if C returns 0x0 to denote a valid address, the Option<&T> would be None in the rust side, (or maybe not, if we set none to something else)

gnzlbg (Nov 11 2018 at 18:50, on Zulip):

but that's it, on the rust side you can transmute Option<&T> to a *const T and do whatever you want

Nicole Mazzuca (Nov 11 2018 at 18:50, on Zulip):

if C returns a null pointer value to denote a valid address, then the return type of that function in Rust should be *const|mut T

gnzlbg (Nov 11 2018 at 18:51, on Zulip):

which poses another good question, when a C pointer compares true with NULL, should the Rust raw pointer with the same representation also return is_null() true ?

Gankro (Nov 11 2018 at 18:51, on Zulip):

Right in some sense it's the burden of the FFI bindings to capture the API
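One way to picture "the bindings capture the API" is sketched here, under stated assumptions: `first_negative` stands in for a foreign function (it is written in Rust only so the example is self-contained), and its contract is assumed to be that a null result means "nothing found". The binding keeps the raw pointer type, and only the wrapper, which knows that contract, turns it into an `Option<&i32>`.

```rust
/// Stand-in for a foreign function with the C ABI.
/// Returns a pointer to the first negative element, or a null pointer.
extern "C" fn first_negative(data: *const i32, len: usize) -> *const i32 {
    for i in 0..len {
        // SAFETY: the caller promises `data` points to `len` readable `i32`s.
        let p = unsafe { data.add(i) };
        if unsafe { *p } < 0 {
            return p;
        }
    }
    std::ptr::null()
}

/// Safe wrapper: the raw pointer only becomes a reference after the null
/// check that `as_ref` performs.
fn first_negative_in(slice: &[i32]) -> Option<&i32> {
    let p = first_negative(slice.as_ptr(), slice.len());
    // SAFETY: a non-null result points into `slice`, which outlives the borrow.
    unsafe { p.as_ref() }
}

fn main() {
    assert_eq!(first_negative_in(&[3, 1, -4, 1, -5]), Some(&-4));
    assert_eq!(first_negative_in(&[1, 2, 3]), None);
}
```

If the foreign API could return the null pointer value as a valid address, the wrapper would have to stay with `*const i32` instead.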

Nicole Mazzuca (Nov 11 2018 at 18:52, on Zulip):

@gnzlbg what other semantics make sense? also, s/compares true/compares equal/

Gankro (Nov 11 2018 at 18:53, on Zulip):

I would be very happy with defining that, I don't super care about someone making a rustc for some wild platform that none of llvm/gcc/msvc even support

gnzlbg (Nov 11 2018 at 18:53, on Zulip):

Well, that means that the bitpattern that Option<&T> uses to denote null has to match the one that the platform uses

gnzlbg (Nov 11 2018 at 18:53, on Zulip):

because it would be sane to transmute a null Rust pointer to Option<&T> and expect it to be None

Nicole Mazzuca (Nov 11 2018 at 18:53, on Zulip):

that's what I said

gnzlbg (Nov 11 2018 at 18:53, on Zulip):

ok, so i'll remove the sentence

gnzlbg (Nov 11 2018 at 18:54, on Zulip):

i think that once we go to the validity discussion, we have to define the bitpattern for which Option<&T> is none - that's not part of the layout discussion AFAICT

Gankro (Nov 11 2018 at 18:54, on Zulip):

The existence of ptr::as_ref more or less guarantees that null is None

gnzlbg (Nov 11 2018 at 18:55, on Zulip):

It might make sense to have a lang "constant" specifying this bit pattern, to specify that ptr.is_null() is the only way to test for null-pointerness, or comparing against this bit pattern, etc.

gnzlbg (Nov 11 2018 at 18:55, on Zulip):

we can probably lint if people use something else than ptr.is_null() to do "null pointer checks"

Gankro (Nov 11 2018 at 18:55, on Zulip):

wait I feel like we're backsliding why do we want to support non-zero-bits-null again

Nicole Mazzuca (Nov 11 2018 at 18:58, on Zulip):

@Gankro to not prevent it from existing, if someone wants to go to the trouble

gnzlbg (Nov 11 2018 at 18:59, on Zulip):

So i've pushed the commit to the PR. It removes the requirement, and adds it as an unresolved question to clarify when we start discussing validity.

gnzlbg (Nov 11 2018 at 19:00, on Zulip):

I agree with @Nicole Mazzuca that we should support this, because it takes little work to do so. If the conclusion of validity is that we only support platforms where runtime 0x0 is a null pointer, we'd have to add this requirement to the list.

Gankro (Nov 11 2018 at 19:03, on Zulip):

hrm, is this the first instance of implementation-defined behaviour we are explicitly introducing, with the intent that someone can write another implementation that explicitly defines this behaviour?

Gankro (Nov 11 2018 at 19:04, on Zulip):

also I forget did either of you have a platform in mind where someone would do this?

gnzlbg (Nov 11 2018 at 19:27, on Zulip):

@Gankro the value of the bit pattern would be unspecified, and required to match the null pointer representation of C only if C FFI is available. I'd suppose that we would "somehow" expose the bitpattern in core, e.g. via a const, and that the correct way of checking the null bitpattern would just be to use the constant. ptr::is_null() would do that for you.

gnzlbg (Nov 11 2018 at 19:28, on Zulip):

All code that checks for the null bit-pattern in some other way wouldn't be portable, but it wouldn't be incorrect either as long as these match.
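A small sketch of the two kinds of check being contrasted. Both function names are made up for illustration. On the targets rustc supports today the two agree; under the model discussed here only the first would be portable.

```rust
/// Portable: asks the language whether the pointer is null.
fn is_null_portable<T>(p: *const T) -> bool {
    p.is_null()
}

/// Non-portable under the model discussed above: hard-codes the assumption
/// that the null pointer has address 0.
fn is_null_assuming_zero<T>(p: *const T) -> bool {
    p as usize == 0
}

fn main() {
    let x = 1u8;
    let null = std::ptr::null::<u8>();
    let valid = &x as *const u8;

    // The two checks give the same answer when null is all zero bits.
    assert_eq!(is_null_portable(null), is_null_assuming_zero(null));
    assert_eq!(is_null_portable(valid), is_null_assuming_zero(valid));
    assert!(is_null_portable(null));
    assert!(!is_null_portable(valid));
}
```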

gnzlbg (Nov 11 2018 at 19:29, on Zulip):

We could and should add lints for non-portable code, and maybe miri could help here somehow (e.g. by being able to set this bitpattern to some random value), but since miri can call into C FFI, i don't think it would be feasible to somehow convert bitpatterns across C FFI in miri, so i don't know how much software would be testable this way.

Gankro (Nov 11 2018 at 19:37, on Zulip):

how is it possible for someone to make an aligned non-null pointer in a portable way under this model

Gankro (Nov 11 2018 at 19:37, on Zulip):

I guess "use NonNull::dangling"

Nicole Mazzuca (Nov 11 2018 at 19:38, on Zulip):

@Gankro yeah, NonNull::dangling is correct, imo. Also, I don't believe it's the first example of impl-def behavior?
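For reference, a minimal sketch of the `NonNull::dangling` approach. The `placeholder` helper is a made-up name; the resulting pointer must never be dereferenced.

```rust
use std::mem::align_of;
use std::ptr::NonNull;

/// Returns a well-aligned, non-null placeholder pointer for `T` without the
/// caller having to pick a numeric address.
fn placeholder<T>() -> NonNull<T> {
    NonNull::dangling()
}

fn main() {
    let p = placeholder::<u64>();
    // Non-null by construction ...
    assert!(!p.as_ptr().is_null());
    // ... and aligned for `u64`.
    assert_eq!(p.as_ptr() as usize % align_of::<u64>(), 0);
}
```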

Gankro (Nov 11 2018 at 19:39, on Zulip):

yeah but all other impl-defined behaviour i'm aware of is just like "shrug emoji we won't do something awful"

Gankro (Nov 11 2018 at 19:39, on Zulip):

and not "this is so someone can write a compiler that does define it"

Nicole Mazzuca (Nov 11 2018 at 19:39, on Zulip):

hrmm.

Gankro (Nov 11 2018 at 19:41, on Zulip):

to be clear: I think the former is a reasonable stance to take on nasty problems we can't solve in a satisfactory way, while the latter is a weird portability/fragmentation thing

Gankro (Nov 11 2018 at 19:42, on Zulip):

or to put it another way, I like unspecified/implementation-defined as a way to have soft UB

Nicole Mazzuca (Nov 11 2018 at 19:43, on Zulip):

I don't like impl-def for soft UB. unspecified, sure

Nicole Mazzuca (Nov 11 2018 at 19:43, on Zulip):

impl-def should be for actual implementation differences

Nicole Mazzuca (Nov 11 2018 at 19:43, on Zulip):

like size_of::<usize>()

Gankro (Nov 11 2018 at 19:45, on Zulip):

usize is a bit weird because I can't imagine writing a rust compiler that isn't consistent with rustc on a given platform?

Gankro (Nov 11 2018 at 19:45, on Zulip):

whereas impl-defined seems to suggest that's "the point"

Gankro (Nov 11 2018 at 19:49, on Zulip):

as in, i read impl-defined as "there is an interesting tradeoff here that reasonable implementations could differ on" and not "there exist different platforms where this must be different, but all reasonable implementations would agree on any given platform"

Gankro (Nov 11 2018 at 19:53, on Zulip):

hrm I am confusing myself

rkruppe (Nov 11 2018 at 19:53, on Zulip):

my 2 cents on NULL:
1. address 0 being dereferenceable without UB is something some people want (e.g. the linux kernel) and that LLVM will likely gain support for at some point, but as others already argued above in such a context rust just can't correctly use Option<&T> for a nullable-but-otherwise-valid pointer, which is that rust code's problem.
2. but i haven't seen any use case for ptr::null being different from address zero, and it causes additional headaches beyond address 0 being dereferenceable, so i'd be inclined to not support it

gnzlbg (Nov 12 2018 at 09:40, on Zulip):

To allow this, we don't have to require anything on the C platform, which is what the current PR does. We just have to require that Option<&T>::None's bit-pattern is all zeros. This does not mean that you can't C FFI with a platform where null pointers have some other bit representation, only that when interfacing with such a platform you might not want to use Option<&T>::None or *[const,mut]::is_null() to check for "non-dereferenceable". One could implement a Ptr<T> type that provides an is_null() that does something else though (or wrap the platform with something that "converts" null pointer representations). If we require the null address of the C platform to be all zeros, and that this must mean that the pointer is not dereferenceable, that would just mean that one cannot do C FFI with such a platform at all.

So while I think that requiring Option<&T>::None to be all zeros buys us some simplicity on the Rust side, requiring this to hold in C doesn't really buy us anything. It just prevents us from interfacing with platforms where this does not hold.
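A rough sketch of what such a `Ptr<T>`-style wrapper might look like. Everything here is hypothetical: the type name `ForeignPtr`, the constant `FOREIGN_NULL_ADDR`, and the value 0xfffff (borrowed from the example earlier in this thread) are assumptions for illustration, not an existing API or a real platform.

```rust
use std::marker::PhantomData;

/// Hypothetical: the address the foreign platform uses as its null pointer.
const FOREIGN_NULL_ADDR: usize = 0xfffff;

/// A pointer received from the foreign platform, stored as a plain address
/// so that Rust's own notion of null is not involved in the check.
#[repr(transparent)]
struct ForeignPtr<T> {
    addr: usize,
    _marker: PhantomData<*const T>,
}

impl<T> ForeignPtr<T> {
    fn from_addr(addr: usize) -> Self {
        ForeignPtr { addr, _marker: PhantomData }
    }

    /// Null check against the platform's null, not against address 0.
    fn is_null(&self) -> bool {
        self.addr == FOREIGN_NULL_ADDR
    }

    /// "Converts" the representation: the foreign null becomes Rust's null.
    /// Note that a foreign pointer with address 0 would also look null to
    /// Rust after this conversion.
    fn to_rust(&self) -> *const T {
        if self.is_null() {
            std::ptr::null()
        } else {
            self.addr as *const T
        }
    }
}

fn main() {
    let null = ForeignPtr::<u8>::from_addr(0xfffff);
    let other = ForeignPtr::<u8>::from_addr(0x1000);
    assert!(null.is_null());
    assert!(null.to_rust().is_null());
    assert!(!other.is_null());
    assert!(!other.to_rust().is_null());
}
```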

RalfJ (Nov 12 2018 at 09:49, on Zulip):

@gnzlbg we assume in our interaction with LLVM that null pointers are 0x0

RalfJ (Nov 12 2018 at 09:50, on Zulip):

so it is true that that's not really a C platform requirement... it's a requirement to be able to compile and run Rust code at all, whether it does C FFI or not

RalfJ (Nov 12 2018 at 09:51, on Zulip):

Hence what you said about "when interfacing with such a platform" is hypothetical.

RalfJ (Nov 12 2018 at 09:51, on Zulip):

Also I think this is worth stating in the C platform requirements because that's a place where people might look for such things.

gnzlbg (Nov 12 2018 at 10:10, on Zulip):

@RalfJ @rkruppe said:

1. address 0 being dereferenceable without UB is something some people want (e.g. the linux kernel) and that LLVM will likely gain support for at some point,

Rust is already used to build Linux kernel modules, so this is a use case that we definitely want to support.

RalfJ (Nov 12 2018 at 10:17, on Zulip):

true, but that has naught to do with what I said :)

RalfJ (Nov 12 2018 at 10:18, on Zulip):

I am talking about the case where the platform has a NULL ptr but that NULL ptr is not 0x0

RalfJ (Nov 12 2018 at 10:18, on Zulip):

that is a case LLVM does not support and is not likely to support, so the same goes for Rust

RalfJ (Nov 12 2018 at 10:19, on Zulip):

OTOH, you are talking about the case where the platform has no NULL ptr -- no ptr that is NEVER going to be inbounds -- at all. That is a different situation.

RalfJ (Nov 12 2018 at 10:21, on Zulip):

I agree there are valid usecases for that. but it can still simplify things to assume that if a platform has a NULL ptr, then it is 0x0.

gnzlbg (Nov 12 2018 at 10:27, on Zulip):

Ah gotcha. I'm going to add that as an unresolved question to the PR @RalfJ , since I think it is worth opening an issue about this, and discussing it in more depth.

gnzlbg (Nov 12 2018 at 10:32, on Zulip):

I've sent a new commit with this differentiation.

gnzlbg (Nov 12 2018 at 11:38, on Zulip):

I've cleaned up the unresolved questions even further

Nicole Mazzuca (Nov 12 2018 at 15:34, on Zulip):

@RalfJ this is, I think, a place where "what rustc supports" and "what rust supports" can be distinct

Nicole Mazzuca (Nov 12 2018 at 15:35, on Zulip):

I just don't think defining null to be all zero bit pattern actually gains us much

Nicole Mazzuca (Nov 12 2018 at 15:35, on Zulip):

in terms of spec simple-ness

RalfJ (Nov 12 2018 at 15:35, on Zulip):

@Nicole Mazzuca fair enough. IIRC @rkruppe had some concerns about other bit patterns.

rkruppe (Nov 12 2018 at 15:39, on Zulip):

i can talk the most about why it would be bad to support in rustc. it's true that this doesn't have to mean anything for the rust language spec, but conversely, if there are no plausible implementations or targets where that would be necessary (and i don't know of any) then leaving it implementation-defined just declares a lot of programs non-portable without any benefit

rkruppe (Nov 12 2018 at 15:43, on Zulip):

also even if ptr::null doesn't give you The Null Pointer Value you can write Rust code that interops with C that uses a different address as Null Pointer Value: address 0 is dereferenceable and address <whatever NULL is> happens to always be an invalid pointer without being The Null Pointer Value.

rkruppe (Nov 12 2018 at 15:45, on Zulip):

(the last part means you can't justify some optimizations from Rust semantics, though I think you could still do most of them if you knew you were targeting a platform where that address is never allocated, and honestly optimization power on such a weird platform is not high on my list of priorities)

Nicole Mazzuca (Nov 12 2018 at 15:45, on Zulip):

that seems unnecessarily weird?

Nicole Mazzuca (Nov 12 2018 at 15:46, on Zulip):

oh, no, I see where you're going with that

gnzlbg (Nov 12 2018 at 17:33, on Zulip):

I think we should open an issue about this once we move to validity of references, etc.

gnzlbg (Nov 12 2018 at 21:30, on Zulip):

@rkruppe isn't CUDA C an extension of C ?
My point was only that NVPTX is a target that we support today, and that AFAICT cannot do FFI with C.

gnzlbg (Nov 12 2018 at 21:30, on Zulip):

It can do FFI with CUDA C, or CUDA C++, but not with "just C".

rkruppe (Nov 12 2018 at 21:31, on Zulip):

It might be that there's a variant that applies the same extensions to C, idk for sure

gnzlbg (Nov 12 2018 at 21:31, on Zulip):

I see your point that, because CUDA C is an extension of C, then it includes C. But I don't think that you can do everything that you are allowed to do in C inside a CUDA __device__ kernel.

rkruppe (Nov 12 2018 at 21:32, on Zulip):

hm it's true that device code isn't a superset of normal C, certainly not on old compute capabilities. but does that have any bearing on ABI matters?

rkruppe (Nov 12 2018 at 21:32, on Zulip):

maybe if function pointers didn't exist in CUDA 1.0?

Nicole Mazzuca (Nov 12 2018 at 21:32, on Zulip):

CUDA C is really odd

Nicole Mazzuca (Nov 12 2018 at 21:32, on Zulip):

it's definitely not a superset, unlike objective C

gnzlbg (Nov 12 2018 at 21:32, on Zulip):

Nono, this only impacts that being able to do extern "C" is optional

gnzlbg (Nov 12 2018 at 21:33, on Zulip):

as in, we do not require a platform to allow doing C FFI

Nicole Mazzuca (Nov 12 2018 at 21:33, on Zulip):

would it be optional as in, your extern "C" doesn't compile?

gnzlbg (Nov 12 2018 at 21:33, on Zulip):

yes, the program is illegal

gnzlbg (Nov 12 2018 at 21:34, on Zulip):

if there is no C to FFI with, what should extern "C" do ?

Nicole Mazzuca (Nov 12 2018 at 21:34, on Zulip):

then it seems like it'd be reasonable to say "CUDA Rust", distinct from normal Rust

gnzlbg (Nov 12 2018 at 21:34, on Zulip):

it is

Nicole Mazzuca (Nov 12 2018 at 21:34, on Zulip):

i.e., don't worry about it

gnzlbg (Nov 12 2018 at 21:34, on Zulip):

hm , damn, you got me

Nicole Mazzuca (Nov 12 2018 at 21:34, on Zulip):

just like the standards committee doesn't worry about CUDA C++

Nicole Mazzuca (Nov 12 2018 at 21:35, on Zulip):

except insofar as they're LLVM devs who have CUDA in tree

rkruppe (Nov 12 2018 at 21:35, on Zulip):

extern "C" being straight up disallowed feels wild to me, since despite the name it's mostly about the target's default ABI, which just usually/always happens to be C (+ extensions + aspects that are non-conforming)

gnzlbg (Nov 12 2018 at 21:36, on Zulip):

one of the main reasons we are stretching things to leave e.g. the bit representation of NULL undefined, is to not make creating these "dialects" impossible

rkruppe (Nov 12 2018 at 21:36, on Zulip):

especially since you can also write it as just extern fn ...

gnzlbg (Nov 12 2018 at 21:37, on Zulip):

Should I move the C standard conformance part to an unresolved question ?

gnzlbg (Nov 12 2018 at 21:37, on Zulip):

@Brian Smith's point that we only need the C standard pieces required for FFI makes sense to me.

Nicole Mazzuca (Nov 12 2018 at 21:38, on Zulip):

hmm

Nicole Mazzuca (Nov 12 2018 at 21:38, on Zulip):

maybe extern "C" is just conditionally supported

rkruppe (Nov 12 2018 at 21:38, on Zulip):

I think it's fine to base everything we say on C and instead leave specifically the handling of targets without standard C implementation as the unresolved question

Nicole Mazzuca (Nov 12 2018 at 21:38, on Zulip):

like, there's a Rust which is supported by normal Rust implementations, and then we subset it for "valid Rust implementations which don't implement the full spec"

Nicole Mazzuca (Nov 12 2018 at 21:39, on Zulip):

i.e., extern "C" probably doesn't make a lot of sense for interpreters

gnzlbg (Nov 12 2018 at 21:39, on Zulip):

so that's what I tried to write, but failed - that's what I meant with extern "C" being "optional"

Nicole Mazzuca (Nov 12 2018 at 21:39, on Zulip):

mmh

rkruppe (Nov 12 2018 at 21:40, on Zulip):

importing symbols written in another language may not be universally supported, defining a rust function with a different ABI can work just fine (as a nop)

Nicole Mazzuca (Nov 12 2018 at 21:40, on Zulip):

we should explicitly discuss subsetting Rust

Nicole Mazzuca (Nov 12 2018 at 21:40, on Zulip):

I would add it to unresolved questions

gnzlbg (Nov 12 2018 at 21:40, on Zulip):

So I'm going to add an unresolved question about what to do about targets with a C implementation that's not standard conforming.

gnzlbg (Nov 12 2018 at 21:40, on Zulip):

I'm going to add another unresolved question about targets that do not have a C-like implementation at all (if they exist).

Nicole Mazzuca (Nov 12 2018 at 21:40, on Zulip):

I kinda want to see how the C++ standards committee is going to deal with it.

rkruppe (Nov 12 2018 at 21:41, on Zulip):

and importing external symbols can just fall entirely under implementation-defined (i don't want to talk about GNU ld in a language spec) and on an interpreter implementation that impl-defined behavior is just "we never find that symbol"

briansmith (Nov 12 2018 at 21:42, on Zulip):

The problem I see is that this is going in a direction that's too C-oriented. Most of my extern "C" code is actually not written in C and so C standard conformance and related things are far away from being relevant.

Nicole Mazzuca (Nov 12 2018 at 21:43, on Zulip):

@Brian Smith are you writing your extern "C" code in Rust, or C++?

rkruppe (Nov 12 2018 at 21:43, on Zulip):

@Brian Smith's point that we only need the C standard pieces required for FFI makes sense to me.

more generally almost everything related to ABIs is outside the C language spec, even basic things like how integers are represented concretely in memory is more up to the architecture and implementation choices than (say) future C revisions specifying that ints appear to work as two's complement when cast to unsigned. so in some sense it's kind of weird to bring in C-the-language at all

briansmith (Nov 12 2018 at 21:43, on Zulip):

Rust, assembly language, and another language that I can't talk about.

rkruppe (Nov 12 2018 at 21:44, on Zulip):

but unfortunately all those platform documents are written in terms of the C type system

Nicole Mazzuca (Nov 12 2018 at 21:44, on Zulip):

dangit, don't say things like that, that's super fascinating :P

briansmith (Nov 12 2018 at 21:44, on Zulip):

Right, and I think it makes sense to document the requirements on the C type system, but not bring in the whole C17 standard.

briansmith (Nov 12 2018 at 21:45, on Zulip):

There is probably one or a few small sections that list requirements on data types that are relevant and not subsumed by the ABI documentation, and I think it would be OK to reference them.

rkruppe (Nov 12 2018 at 21:45, on Zulip):

so you'd want to say e.g. Rust f32 corresponds to C float where it exists and conforms to IEEE 754-2008?

briansmith (Nov 12 2018 at 21:46, on Zulip):

First, I don't know if Rust f32 is defined to be IEEE 754-2008, or if it is defined to be "whatever the ABI says 32-bit floats are" or something else.

briansmith (Nov 12 2018 at 21:46, on Zulip):

If f32 is defined to be IEEE 754-2008 then I would say that using f32 in a FFI declaration should be rejected on a platform that doesn't have IEEE 754-2008 floats.

briansmith (Nov 12 2018 at 21:46, on Zulip):

Maybe that means all such platforms would be rejected, or just those specific FFI declarations would be rejected.
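For concreteness, a small sketch of what "f32 is IEEE 754 binary32" means at the bit level. The helper name `one_as_bits` is made up; the bit patterns are the standard binary32 and binary64 encodings.

```rust
use std::mem::size_of;

/// Bit pattern of `1.0f32`: sign 0, biased exponent 127, mantissa 0.
fn one_as_bits() -> u32 {
    1.0f32.to_bits()
}

fn main() {
    assert_eq!(size_of::<f32>(), 4);
    assert_eq!(size_of::<f64>(), 8);
    assert_eq!(one_as_bits(), 0x3f80_0000);
    // -2.0 as binary64: sign 1, biased exponent 1024, mantissa 0.
    assert_eq!((-2.0f64).to_bits(), 0xc000_0000_0000_0000);
}
```

A C `float` with a different encoding could not be passed through an `f32` FFI parameter without these patterns being misread, which is the mismatch being discussed.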

briansmith (Nov 12 2018 at 21:47, on Zulip):

Similarly, we don't need to necessarily reject all platforms/ABIs where _Bool isn't uint8_t 0/1. Just, we need to reject programs that make that assumption, e.g. by using bool in FFI contexts.
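A short sketch of the assumption in question, as it holds for Rust's own `bool`. The helper name `bool_byte` is made up for illustration.

```rust
use std::mem::{size_of, transmute};

/// Returns the single byte that represents a Rust `bool`.
fn bool_byte(b: bool) -> u8 {
    // `bool` and `u8` have the same size, so this transmute compiles.
    unsafe { transmute::<bool, u8>(b) }
}

fn main() {
    assert_eq!(size_of::<bool>(), 1);
    assert_eq!(bool_byte(false), 0);
    assert_eq!(bool_byte(true), 1);
}
```

Using `bool` in an `extern "C"` signature silently assumes the C side's `_Bool` uses the same size and the same two values.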

gnzlbg (Nov 12 2018 at 21:48, on Zulip):

We would need to encode that in our target description files, but that should be doable.

briansmith (Nov 12 2018 at 21:48, on Zulip):

We could say that such platforms are not supported, yet, and doing these things is currently UB, and file issues to get the compiler to reject mismatches in some way, which maybe isn't even a high priority.

gnzlbg (Nov 12 2018 at 21:49, on Zulip):

We could expose a c_bool, c_float, etc. in those platforms, if we wanted to somehow interface with them at that level.

briansmith (Nov 12 2018 at 21:49, on Zulip):

Well, I think people don't want to standardize c_bool, c_float, etc.

rkruppe (Nov 12 2018 at 21:49, on Zulip):

the relevant team decisions have been explicitly motivated by "we don't want people to have/desire to define and use c_bool etc."

gnzlbg (Nov 12 2018 at 21:49, on Zulip):

An alternative is that we wouldn't call FFI in those platforms extern "C"

gnzlbg (Nov 12 2018 at 21:50, on Zulip):

but extern "something-else"

briansmith (Nov 12 2018 at 21:51, on Zulip):

I do think "C" is a bad name which is why I don't include "C" in my extern declarations.

gnzlbg (Nov 12 2018 at 21:51, on Zulip):

@Brian Smith I think one problem we have is that we haven't fully decided what FFI safety means.

rkruppe (Nov 12 2018 at 21:51, on Zulip):

i think either of those options (declare core types not FFI-safe on some weird platforms, or make a core ABI string unavailable on some weird platforms) would be big enough for RFC territory

gnzlbg (Nov 12 2018 at 21:51, on Zulip):

All FFI is unsound, therefore, unsafe, is not false, but is not a very useful way to work things out.

briansmith (Nov 12 2018 at 21:52, on Zulip):

Oh, I'm aware. Since I'm programming in languages that are safer than Rust, it's actually ridiculous that I have to use unsafe to use that code!

briansmith (Nov 12 2018 at 21:53, on Zulip):

But, I think the fact is that extern { } == extern "C" {} is something that will be hard to change, and so it makes sense to write this document assuming that that is the case.

gnzlbg (Nov 12 2018 at 21:53, on Zulip):

@rkruppe this PR is not an RFC - it is a summary about some things that had some consensus. It is obvious that there isn't full consensus about these things. Also, there are many unresolved questions as everybody can see.

briansmith (Nov 12 2018 at 21:54, on Zulip):

And, similarly, I think that extern "C" == platform ABI is currently well-established.
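A sketch of the equivalence being relied on: a function declared with a bare `extern` has the same function pointer type as one declared with `extern "C"`. The function names are made up, and the `allow` attribute is only there because newer compilers warn when the ABI string is omitted.

```rust
// ABI string omitted: defaults to "C".
#[allow(missing_abi)]
extern fn double_implicit(x: i32) -> i32 {
    x * 2
}

// ABI string spelled out.
extern "C" fn double_explicit(x: i32) -> i32 {
    x * 2
}

/// Both functions coerce to the same `extern "C"` function pointer type.
fn as_c_fn_pointers() -> [extern "C" fn(i32) -> i32; 2] {
    [double_implicit, double_explicit]
}

fn main() {
    for f in as_c_fn_pointers() {
        assert_eq!(f(21), 42);
    }
}
```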

rkruppe (Nov 12 2018 at 21:55, on Zulip):

yeah what i'm saying is, maybe don't get too carried away mapping out either option, let's focus on clarifying the things that can be clarified with relative ease

briansmith (Nov 12 2018 at 21:55, on Zulip):

I think what's described at https://gankro.github.io/blah/rust-layouts-and-abis/ is basically what's true today.

rkruppe (Nov 12 2018 at 21:56, on Zulip):

as I recall some of that is only de-facto-true, not backed by any documentation or decisions

gnzlbg (Nov 12 2018 at 21:56, on Zulip):

Do we want people to be able to write unsafe code that relies forever on anything about that document being always true?

briansmith (Nov 12 2018 at 21:57, on Zulip):

My understanding is that if the target conforms to what that document says, things work today, perhaps accidentally.

briansmith (Nov 12 2018 at 21:58, on Zulip):

Now, what's promised vs. what accidentally works is a fine distinction.

gnzlbg (Nov 12 2018 at 21:58, on Zulip):

The point of the unsafe-code-guidelines is to come up with guidelines that unsafe code can rely on.

gnzlbg (Nov 12 2018 at 21:58, on Zulip):

So "it works today" is not good enough, it has to be "it will always work, forever".

gnzlbg (Nov 12 2018 at 21:59, on Zulip):

This reduces the amount of the things that we can guarantee, because we don't know whether many things will always work forever.

gnzlbg (Nov 12 2018 at 21:59, on Zulip):

So CHAR_BIT == 8 for example, seems uncontroversial.

Gankro (Nov 12 2018 at 21:59, on Zulip):

I am fine with abandoning claims made in my document, but only with sufficient motivation -- e.g. concrete platforms that we are interested in supporting, that those guarantees would prevent or otherwise hinder

gnzlbg (Nov 12 2018 at 22:00, on Zulip):

But @Brian Smith suggested that false = 0 and true = 1 should not prevent interfacing with C FFI, but instead, should prevent using bool in FFI.

gnzlbg (Nov 12 2018 at 22:00, on Zulip):

And that is not a crazy idea. There might be different levels of things that we can require here.

Gankro (Nov 12 2018 at 22:00, on Zulip):

I think that's a reasonable tack to take

briansmith (Nov 12 2018 at 22:01, on Zulip):

One way forward is to say "if all these are true then the platform is fully supported; if these are not true then it's still up in the air."

gnzlbg (Nov 12 2018 at 22:01, on Zulip):

Maybe choosing "C FFI: all or nothing" is the wrong thing to do.

Nicole Mazzuca (Nov 12 2018 at 22:01, on Zulip):

doesn't C++17 require false = 0, true = 1?

Nicole Mazzuca (Nov 12 2018 at 22:01, on Zulip):

or maybe C++20 will?

gnzlbg (Nov 12 2018 at 22:01, on Zulip):

maybe

briansmith (Nov 12 2018 at 22:01, on Zulip):

C++ requires false = 0, true = 1 for a long time.

briansmith (Nov 12 2018 at 22:02, on Zulip):

But what matters is what the ABI requires.

rkruppe (Nov 12 2018 at 22:02, on Zulip):

my question about any such language standard guarantee is: what does that mean exactly and how does it rule out weird ABIs that store the value differently in RAM but fiddle with all operations on it to make it appear as such?

Gankro (Nov 12 2018 at 22:02, on Zulip):

JF Bastien's work is intended to define 0 and 1 for C++20, and C-next is also interested in adopting it

gnzlbg (Nov 12 2018 at 22:02, on Zulip):

But maybe we can say, #[repr(C)] structs are usable in C FFI _iff_ the platform implementation does things like this and like that, and if not, you can't use structs, but you can still use bool.

gnzlbg (Nov 12 2018 at 22:03, on Zulip):

or in other words, instead of a platform document, we add a bool document, a u8 document, etc. and specify when they are usable in extern functions

gnzlbg (Nov 12 2018 at 22:04, on Zulip):

@Gankro I think @Brian Smith's point remains. If you have a platform that defines true = 42, and you want to interface with C without using bool, or you just want to interface with assembly using extern "C", should you be able to do that at all?

briansmith (Nov 12 2018 at 22:04, on Zulip):

IDK if it is worth the effort to even decide what to do for those platforms.

Gankro (Nov 12 2018 at 22:05, on Zulip):

I would rather start with an all-or-nothing approach with room to adopt a more fine-grain "oh you used bool sorry we don't support that here" later

rkruppe (Nov 12 2018 at 22:05, on Zulip):

or in other words, instead of a platform document, we add a bool document, a u8 document, etc. and specify when they are usable in extern functions

again i don't think any such effort would be in scope for the current guidelines, if we want to avoid implying anything about C platforms that violate some of our more controversial assumptions we should just include a paragraph to that effect

Gankro (Nov 12 2018 at 22:05, on Zulip):

Basically I don't like spec boogeymen that create weird cargo-cult portability legends

briansmith (Nov 12 2018 at 22:06, on Zulip):

If the platform ABI conforms to @Gankro's document's requirements (possibly with some edits), then the platform can be supported and FFI will work. Otherwise, UB; in the future, we'll try to improve from UB to some documented state.

Gankro (Nov 12 2018 at 22:06, on Zulip):

yes, that sounds great

rkruppe (Nov 12 2018 at 22:06, on Zulip):

yeah something like that

briansmith (Nov 12 2018 at 22:06, on Zulip):

The fact is that there are no ports to platforms that don't conform to these requirements, so nobody will be affected by the UB.

gnzlbg (Nov 12 2018 at 22:06, on Zulip):

The "otherwise UB" sounds good, and is different than "the program is illegal" approach of the PR.

briansmith (Nov 12 2018 at 22:07, on Zulip):

So, the people who want to add support for such platforms can drive the future work to define what happens for such platforms, perhaps on a type-by-type basis.

gnzlbg (Nov 12 2018 at 22:07, on Zulip):

If some implementation ever supports these, it can define what the UB does if it wants to.

briansmith (Nov 12 2018 at 22:07, on Zulip):

But, now, does that actually solve the problem that this group is trying to solve?

briansmith (Nov 12 2018 at 22:08, on Zulip):

In particular, is this group trying to solve the problem of reading a bool from a #[repr(C)] struct through a non-bool reference or pointer like *u8?

rkruppe (Nov 12 2018 at 22:08, on Zulip):

If some implementation ever supports these, it can define what the UB does if it wants to.

if that ever happens we should try to accommodate those implementations in the specs, but eh

rkruppe (Nov 12 2018 at 22:09, on Zulip):

In particular, is this group trying to solve the problem of reading a bool from a #[repr(C)] struct through a non-bool reference or pointer like *u8?

yes definitely. and the answer appears to be: true = 1u8, false = 0u8

gnzlbg (Nov 12 2018 at 22:09, on Zulip):

@Brian Smith the question is, does this solve the problem for you?

briansmith (Nov 12 2018 at 22:09, on Zulip):

If the group is trying to define what happens when one reads a bool through a *u8 then my suggestion doesn't work. But I would suggest we just shouldn't define that.

rkruppe (Nov 12 2018 at 22:09, on Zulip):

and this is just documenting existing practice and guarantees (in the case of bool)

briansmith (Nov 12 2018 at 22:10, on Zulip):

If it is important to define what it means to read a bool through a *u8 or whatever, then the only choices for supporting future platforms that don't conform to these rules are "don't ever support them" or "support the subset of the language that meets the rules."

gnzlbg (Nov 12 2018 at 22:11, on Zulip):

https://github.com/rust-rfcs/unsafe-code-guidelines/pull/46/commits/25097ac72941d92db598764de8afefd706886297

gnzlbg (Nov 12 2018 at 22:11, on Zulip):

@Brian Smith per the spec, the size of a bool and u8 are equal, and bool can only have two values, false and true, which are represented as 0_u8 and 1_u8 respectively.

gnzlbg (Nov 12 2018 at 22:12, on Zulip):

You can read a bool through a *u8 pointer, but if you construct a bool from a u8 that is not 0 or 1, the behavior is undefined
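A minimal sketch of the rule just described, assuming the representation discussed here (bool is one byte, false = 0, true = 1). The helper names `byte_of_bool`, `bool_from_byte` and `bool_size` are hypothetical and exist only for illustration.

```rust
use std::mem::size_of;

/// Reads the byte stored in a `bool` through a `*const u8` (hypothetical helper).
pub fn byte_of_bool(b: &bool) -> u8 {
    let p = b as *const bool as *const u8;
    // Sound direction: `bool` occupies one byte, and every initialized
    // byte is a valid `u8`.
    unsafe { *p }
}

/// Checked conversion in the other direction (hypothetical helper).
/// Only 0 and 1 are accepted; transmuting any other byte into a `bool`
/// would be undefined behavior, so those bytes are rejected instead.
pub fn bool_from_byte(byte: u8) -> Option<bool> {
    match byte {
        0 => Some(false),
        1 => Some(true),
        _ => None,
    }
}

/// Size of `bool` in bytes.
pub fn bool_size() -> usize {
    size_of::<bool>()
}
```

The asymmetry is the point: going from bool to u8 is always fine, while going from u8 to bool needs a validity check.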

briansmith (Nov 12 2018 at 22:12, on Zulip):

Which spec?

gnzlbg (Nov 12 2018 at 22:12, on Zulip):

sorry, the unsafe code guidelines I mean

briansmith (Nov 12 2018 at 22:12, on Zulip):

Anyway, the way the PR is written now, everything FFI is UB because no C implementations conform to C17.

briansmith (Nov 12 2018 at 22:13, on Zulip):

So, again, I'd like to replace that requirement with one more like Gankro's doc.

gnzlbg (Nov 12 2018 at 22:15, on Zulip):

The way it was written before it was all UB as well.

briansmith (Nov 12 2018 at 22:15, on Zulip):

Right, I agree. But I think a doc like Gankro's gets us to a point where things are actually defined.

gnzlbg (Nov 12 2018 at 22:15, on Zulip):

We can't define anything about structs / enums / unions / .. here. Those go in their own chapters.

gnzlbg (Nov 12 2018 at 22:16, on Zulip):

bool, ints and floats should go in the integer and floating point chapter

gnzlbg (Nov 12 2018 at 22:16, on Zulip):

so the only thing we could leave here is CHAR_BIT == 8

briansmith (Nov 12 2018 at 22:17, on Zulip):

How about this: The C implementation must conform to the ABI requirements.

briansmith (Nov 12 2018 at 22:17, on Zulip):

Most ABI docs I've read, w.r.t. C, are just restrictions on the way C may be implemented.

briansmith (Nov 12 2018 at 22:18, on Zulip):

Even more generally, anything that you want to use with extern "C" and #[repr(C)] must conform to the ABI, regardless of language.

gnzlbg (Nov 12 2018 at 22:20, on Zulip):

An alternative is to make this document a summary of the repr(C) issues that are resolved somewhere else.

briansmith (Nov 12 2018 at 22:23, on Zulip):

I am (very slowly) drafting an RFC for something I'm tentatively calling extern "platform-abi" {} which is mostly like extern "C" except you don't have to use unsafe{} to call such functions, so IMO defining the details of Rust's requirements on the ABI outside of the unsafe code guidelines would be ideal to me.

gnzlbg (Nov 12 2018 at 22:26, on Zulip):

@Brian Smith i've changed the PR completely, please take a look

Gankro (Nov 12 2018 at 22:27, on Zulip):

also as a meta comment I would suggest embracing redundancy and clarity over absolute modularity and precision. It is extremely painful to watch people try to divine semantic implications of C/C++ references by piecing together several disparate sections

Gankro (Nov 12 2018 at 22:28, on Zulip):

If there is a thing you are trying to make work, having a section that says "hey this works" is ideal

Gankro (Nov 12 2018 at 22:29, on Zulip):

Plus, redundancy helps catch places where things were supposed to work but lead to contradictions

Gankro (Nov 12 2018 at 22:30, on Zulip):

(similarly, calling out things that explicitly don't work, and why, is very nice)

briansmith (Nov 12 2018 at 22:30, on Zulip):

Is it generally true that the appearance of any built-in data type has the same semantics regardless of extern "C" and regardless of #[repr(C)]?

briansmith (Nov 12 2018 at 22:30, on Zulip):

I mean, isn't bool the same thing no matter where it is, from the Rust perspective?

Gankro (Nov 12 2018 at 22:31, on Zulip):

yes

Gankro (Nov 12 2018 at 22:31, on Zulip):

(I've always mentally modeled the primitives as being marked repr(C))

briansmith (Nov 12 2018 at 22:36, on Zulip):

Going back to earlier suggestions here, I think it makes sense to talk about common interop issues w/ C, but perhaps that whole discussion should be non-normative and, as suggested above, should reference other documents (e.g. how bool is defined in Rust) for the normative requirements.

Gankro (Nov 12 2018 at 22:54, on Zulip):

Oh also my document has a minor flaw, in that I say arrays have no kind (since they can't actually be passed), but you need to define the kind of arrays for things like the x64 ABI which does SROA-ish things to structs, which can contain arrays by-value

Gankro (Nov 12 2018 at 22:57, on Zulip):

sadly I can't seem to find anything in the sysv spec that clearly explains how arrays are handled :s

Gankro (Nov 12 2018 at 22:58, on Zulip):

cc @eddyb ^

Nicole Mazzuca (Nov 12 2018 at 22:58, on Zulip):

aren't they treated as if

struct foo {
  T a[N];
};
// is equivalent to
struct foo {
  T a₁;
  T a₂;
  ...
  T aₙ;
};

Gankro (Nov 12 2018 at 23:00, on Zulip):

that would make perfect sense, and is what I'd expect, but I want to be sure :)

eddyb (Nov 12 2018 at 23:00, on Zulip):

I think that's correct, yeah

Gankro (Nov 12 2018 at 23:00, on Zulip):

Err wait, it's splatted? Not a nested struct?

eddyb (Nov 12 2018 at 23:00, on Zulip):

SysV doesn't care about nesting

Nicole Mazzuca (Nov 12 2018 at 23:00, on Zulip):

I don't think so...

Nicole Mazzuca (Nov 12 2018 at 23:01, on Zulip):

yeah

Gankro (Nov 12 2018 at 23:01, on Zulip):

yeah I'm just not 100% certain nothing does

Gankro (Nov 12 2018 at 23:01, on Zulip):

I have no counter-examples where it would matter

eddyb (Nov 12 2018 at 23:01, on Zulip):

sure, it's just not SysV

eddyb (Nov 12 2018 at 23:01, on Zulip):

heh https://github.com/rust-lang/rust/blob/master/src/librustc_target/abi/call/x86_64.rs#L68

eddyb (Nov 12 2018 at 23:01, on Zulip):

(this does the same thing for both structs and arrays)
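A small sketch of the array-versus-fields comparison, written in Rust with #[repr(C)]. It only checks in-memory layout (size and alignment); it says nothing about how a given calling convention classifies the struct for argument passing, which is the actual question in this thread. The type and function names are hypothetical.

```rust
use std::mem::{align_of, size_of};

/// A struct holding an array by value (hypothetical example type).
#[repr(C)]
pub struct WithArray {
    pub a: [u32; 3],
}

/// The same data written out as individual fields (hypothetical example type).
#[repr(C)]
pub struct Splatted {
    pub a1: u32,
    pub a2: u32,
    pub a3: u32,
}

/// Returns (size, alignment) of `T` in bytes.
pub fn layout<T>() -> (usize, usize) {
    (size_of::<T>(), align_of::<T>())
}
```

Comparing `layout::<WithArray>()` with `layout::<Splatted>()` shows the two agree on size and alignment, which is consistent with the "treated as if" reading above but does not prove anything about register assignment.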

eddyb (Nov 12 2018 at 23:01, on Zulip):

where what would matter?

eddyb (Nov 12 2018 at 23:02, on Zulip):

arrays in particular? or struct nesting?

Gankro (Nov 12 2018 at 23:02, on Zulip):

mostly just being paranoid

Gankro (Nov 12 2018 at 23:03, on Zulip):

a vague tingling in the back of my brain from the fact that we concluded we were "supposed" to emit an anonymous struct in the repr of tagged unions

eddyb (Nov 12 2018 at 23:05, on Zulip):

good ABIs don't care about C-isms like struct nesting, but sadly a lot of ABIs are defined in terms of C types, which can lead to some ridiculous behavior

eddyb (Nov 12 2018 at 23:05, on Zulip):

like the two that pass ZSTs by indirection

eddyb (Nov 12 2018 at 23:06, on Zulip):

or the sadness that is #[repr(transparent)] != #[repr(C)]

eddyb (Nov 12 2018 at 23:06, on Zulip):

(AFAIK SysV doesn't make the distinction, which is, again, good)

RalfJ (Nov 13 2018 at 08:01, on Zulip):

@briansmith

I am (very slowly) drafting an RFC for something I'm tentatively calling extern "platform-abi" {} which is mostly like extern "C" except you don't have to use unsafe{} to call such functions, so IMO defining the details of Rust's requirements on the ABI outside of the unsafe code guidelines would be ideal to me.

Interesting. So does this have a typed linker, or how else does it make sure that the extern fn matches the signature? Not to mention all the other guarantees Rust has/might have for its types.
TBH I am somewhat skeptical about weakening unsafe that much.

eddyb (Nov 13 2018 at 09:47, on Zulip):

@briansmith extern "C" is not what makes calling those functions unsafe

eddyb (Nov 13 2018 at 09:47, on Zulip):

it's orthogonal

eddyb (Nov 13 2018 at 09:47, on Zulip):

everything inside extern "..." { ... } is unsafe to access because Rust has no control over them

eddyb (Nov 13 2018 at 09:48, on Zulip):

if you want you can make a proc macro attribute that replaces an extern {...} "block"/"module" with individual (safe) functions that call through to the FFI ones

eddyb (Nov 13 2018 at 09:49, on Zulip):

e.g. #[safe_wrappers] extern "C" { fn foo(); ... } -> fn foo() { extern "C" { fn foo(); } unsafe { foo() } } ...
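A hand-written version of such a wrapper, as a sketch. It assumes the C standard library's `abs` is linked (as it normally is for programs using std) and uses the extern block syntax from the time of this discussion. The wrapper name `checked_c_abs` is hypothetical.

```rust
use std::os::raw::c_int;

extern "C" {
    // Declaration of C's `abs`. rustc cannot check that the linked symbol
    // really has this signature; the declaration is taken on trust.
    fn abs(x: c_int) -> c_int;
}

/// Safe wrapper around the foreign function (hypothetical helper).
/// In C, `abs(INT_MIN)` is undefined behavior, so that input is rejected
/// before the call is made.
pub fn checked_c_abs(x: c_int) -> Option<c_int> {
    if x == c_int::MIN {
        return None;
    }
    // The unsafe block is the programmer's assertion that the declaration
    // above is accurate and that the call's preconditions hold.
    Some(unsafe { abs(x) })
}
```

The wrapper is safe to call, but the unsafe block inside it is where the unchecked claim about the foreign signature lives.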

eddyb (Nov 13 2018 at 09:49, on Zulip):

a typed linker would be interesting but also significantly more work

Gankro (Nov 13 2018 at 15:24, on Zulip):

ironically we could make unsafe extern fn the syntax for making a safe extern fn

briansmith (Nov 13 2018 at 17:24, on Zulip):

@eddyb "safe wrappers" are already part of my explanation for why adding unsafe extern fn is better than what we have now. In my case, I'm using function pointers to call the functions and so #[inline] doesn't optimize away the overhead.

rkruppe (Nov 13 2018 at 17:34, on Zulip):

extern "C" fn() is already a thing and you can call it safely. to get such a function pointer from external functions declared in an extern "abi" {} block you need to unsafely type pun the unsafe away, but that's what serves as the unchecked assertion that the function signatures really are correct as declared, there's no way around such an unchecked assertion (assuming you don't have typed linkers)
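A sketch of the function pointer types involved. To keep it self-contained, the "foreign" function is defined in Rust as an unsafe extern "C" fn instead of being imported from an extern block; all names here are hypothetical.

```rust
use std::mem::transmute;

pub type SafeFn = extern "C" fn(i32) -> i32;
pub type UnsafeFn = unsafe extern "C" fn(i32) -> i32;

/// Stand-in for an imported function: C ABI, unsafe to call.
pub unsafe extern "C" fn add_one_unchecked(x: i32) -> i32 {
    x.wrapping_add(1)
}

/// C ABI function that is safe to call.
pub extern "C" fn add_one(x: i32) -> i32 {
    x.wrapping_add(1)
}

/// A safe `extern "C"` function pointer can be called without `unsafe`.
pub fn call(f: SafeFn, x: i32) -> i32 {
    f(x)
}

/// Safe to unsafe is an ordinary coercion and needs no `unsafe`.
pub fn forget_safety(f: SafeFn) -> UnsafeFn {
    f
}

/// Unsafe to safe is the "type pun": it has to go through `transmute`.
///
/// # Safety
/// The caller asserts that `f` really has this signature and can be
/// called with any `i32` without further preconditions.
pub unsafe fn assume_safe(f: UnsafeFn) -> SafeFn {
    unsafe { transmute::<UnsafeFn, SafeFn>(f) }
}
```

The only place an unchecked assertion is made is the call to `assume_safe`, which mirrors the role of the unsafe block in a safe wrapper.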

briansmith (Nov 13 2018 at 19:02, on Zulip):

@rkruppe The "safe wrapper" idea shows it doesn't matter.

briansmith (Nov 13 2018 at 19:04, on Zulip):

I think @Gankro's idea of putting unsafe in the extern fn declaration to indicate that something isn't being checked is actually a huge improvement over safe wrappers, and the safe wrappers already work today.

rkruppe (Nov 13 2018 at 19:07, on Zulip):

I don't follow at all. When you write extern "whatever" { fn foo(i32) -> i32; }, there's no way for rustc to ensure that the symbol foo actually refers to a function of that signature, and if it doesn't and it's called, memory safety goes down the drain. that's why calling it has to be unsafe -- for soundness of safe rust.

briansmith (Nov 13 2018 at 19:15, on Zulip):

Rust already lets you call foo(x) without using unsafe; you just have to create a "safe wrapper" that obfuscates the fact that an extern function is being called.

rkruppe (Nov 13 2018 at 19:15, on Zulip):

that safe wrapper contains an unsafe block. that unsafe block serves as the programmer's assertion that the imported function is safe to call.

briansmith (Nov 13 2018 at 19:16, on Zulip):

Right, and unsafe extern would serve the same purpose.

briansmith (Nov 13 2018 at 19:16, on Zulip):

Exactly the same amount of unsafe but without the overhead.

rkruppe (Nov 13 2018 at 19:16, on Zulip):

Ok I see where this is going now

rkruppe (Nov 13 2018 at 19:18, on Zulip):

This choice of syntax would be quite unfortunate because unsafe on function declarations means something different from unsafe blocks (has proof obligations vs claiming those are met). this is a pre-existing problem but this would exacerbate the problem because it looks a lot like unsafe fn but that's exactly the wrong way around.

briansmith (Nov 13 2018 at 19:21, on Zulip):

Yes, that's a good point too. I think some people will insist that unsafe has to appear in there, following your argument above, but maybe it doesn't have to appear as unsafe extern or extern unsafe fn but somewhere else.

briansmith (Nov 13 2018 at 19:21, on Zulip):

The weirdness is already there with unsafe impl, isn't it?

rkruppe (Nov 13 2018 at 19:22, on Zulip):

unsafe impl at least naturally mirrors unsafe trait, and if i had to choose then i would pair up trait {decl, impl} with function {decl, call} as "trait impl ~ fn call and trait decl ~ function decl"

rkruppe (Nov 13 2018 at 19:23, on Zulip):

but also yeah as I said this is not a new problem, I just want to avoid making it even worse

briansmith (Nov 13 2018 at 19:43, on Zulip):

One option would be to put the unsafe in #[link(unsafe)] and then allow calling any extern function without unsafe {} if it has such an attribute, unless it is also marked unsafe extern. This would make the syntax consistent and would also call out that the unsafety lies in the linking.

Nicole Mazzuca (Nov 13 2018 at 21:04, on Zulip):

why not something like

unsafe extern "C" { ... }

briansmith (Nov 13 2018 at 21:31, on Zulip):

@Nicole Mazzuca I'm not sure if you can see the conversation beforehand, but people expressed some concern that unsafe extern fn would mean the opposite of unsafe fn, which is weird.

Nicole Mazzuca (Nov 13 2018 at 21:39, on Zulip):

@briansmith

extern "C" fn foo1() {}
extern "C" {
  fn bar1();
}

unsafe extern "C" fn foo2() {}
unsafe extern "C" {
  fn bar2();
}

fn main() {
  let _ : extern "C" fn() = foo1;
  let _ : unsafe extern "C" fn() = foo2;

  let _ : unsafe extern "C" fn() = bar1;
  let _ : extern "C" fn() = bar2;
}

Nicole Mazzuca (Nov 13 2018 at 21:40, on Zulip):

this is, without bar2, how the typing rules work today

Nicole Mazzuca (Nov 13 2018 at 21:40, on Zulip):

bar2 is my suggestion

RalfJ (Nov 14 2018 at 09:01, on Zulip):

This choice of syntax would be quite unfortunate because unsafe on function declarations means something different from unsafe blocks (has proof obligations vs claiming those are met). this is a pre-existing problem but this would exacerbate the problem because it looks a lot like unsafe fn but that's exactly the wrong way around.

Shameless plug: Have you seen https://github.com/rust-lang/rfcs/pull/2585 ?

rkruppe (Nov 14 2018 at 09:24, on Zulip):

I have, and I would have loved that change pre-1.0, but idk how to do it now without excessive churn so I didn't say anything instead of repeating what others have said on the issue

Nicole Mazzuca (Nov 14 2018 at 18:25, on Zulip):

@RalfJ This doesn't make sense to me. unsafe extern fn is _absolutely_ not what I'm suggesting, because unsafe extern fn already has a meaning - define a function with extern ABI, which is unsafe

Nicole Mazzuca (Nov 14 2018 at 18:25, on Zulip):

what I'm suggesting is unsafe extern "C" { ... }

Nicole Mazzuca (Nov 14 2018 at 18:26, on Zulip):

note: _not on the function declaration_

gnzlbg (Nov 15 2018 at 09:26, on Zulip):

@briansmith how bad is the problem of having to write a thin safe wrapper around an unsafe extern "C" function? Is it a problem worth solving at all? Is it worth solving right now? Even if it were, I don't think the unsafe-code-guidelines are the place to solve it (an RFC repo issue would probably be the place to start, supposing I understood the issue correctly: allowing a way to declare extern "C" functions that are safe to call)

briansmith (Nov 16 2018 at 22:23, on Zulip):

@gnzlbg It isn't the worst or most urgent problem I'm facing, TBH. I will revisit it after I deal with other things.

briansmith (Dec 02 2018 at 20:59, on Zulip):

I started writing a no_mangle function and then I realized that, based on the above discussion, the way no_mangle works doesn't make sense. It turns out this is a known issue: https://github.com/rust-lang/rust/issues/28179. So there's a strong inconsistency in terms of whether calling a function with unsafe linkage requires unsafe. This also suggests a workaround: For every "safe extern" function, define an equivalent #[no_mangle] #[inline(never)] Rust function with body { unimplemented!() }; then, during linking, coerce the linker into choosing the non-Rust implementation.

briansmith (Dec 02 2018 at 21:02, on Zulip):

This hack (proposed mostly tongue-in-cheek) wouldn't "solve" all my FFI optimization issues, though; ideally I'd be able to implement a (safe) trait method with code written in a language which is safer than Rust.

RalfJ (Dec 03 2018 at 19:44, on Zulip):

"safety" is certainly not a total order

RalfJ (Dec 03 2018 at 19:45, on Zulip):

and given that it is a contextual property, it might not be an order at all...

RalfJ (Dec 03 2018 at 19:45, on Zulip):

so "safer than Rust" I don't think helps here

RalfJ (Dec 03 2018 at 19:45, on Zulip):

e.g. if you implement a higher-order function in Java, that one is safe there but it also can assume that all the functions it gets are Java functions, so even if Java was "safer than Rust" you still couldn't safely call this function from Rust

RalfJ (Dec 03 2018 at 19:46, on Zulip):

it might work for first-order functions though? but only if the guarantees actually align. seems like a very special case to me.

RalfJ (Dec 03 2018 at 19:48, on Zulip):

oh I love this one :D

#[no_mangle]
#[allow(non_snake_case)]
pub fn _ZN2io5stdio6_print20h94cd0587c9a534faX3gE() {
    unreachable!()
}

RalfJ (Dec 03 2018 at 19:50, on Zulip):

oh this is even better

#![no_main]

#[link_section=".text"]
#[no_mangle]
pub static main: [u32; 9] = [
    3237986353,
    3355442993,
    120950088,
    822083584,
    252621522,
    1699267333,
    745499756,
    1919899424,
    169960556,
];

briansmith (Dec 04 2018 at 23:45, on Zulip):

@RalfJ By "safer than Rust" I mean a language that makes the same memory safety guarantees as Rust and also provides guarantees not provided by Rust. For example, I would consider Coq to be "safer than Rust" since you can prove your code correct.

briansmith (Dec 04 2018 at 23:46, on Zulip):

My point is that either we need to fix #[no_mangle] to make it safe (at least) or we shouldn't reject "safe extern" just because the linkage is unsafe.

gnzlbg (Dec 05 2018 at 13:05, on Zulip):

Why is fixing #[no_mangle] hard ?

briansmith (Dec 05 2018 at 20:57, on Zulip):

@gnzlbg It's in the issue: https://github.com/rust-lang/rust/issues/28179. Basically fixing it is a breaking change.

Ariel Ben-Yehuda (Dec 05 2018 at 22:28, on Zulip):

I don't think that a safe #[no_mangle] is a desirable end-state

Ariel Ben-Yehuda (Dec 05 2018 at 22:28, on Zulip):

the compiler would have to know the effects of all "special" functions

Ariel Ben-Yehuda (Dec 05 2018 at 22:28, on Zulip):

and of course, other code might look them up

Ariel Ben-Yehuda (Dec 05 2018 at 22:29, on Zulip):

the point of #[no_mangle] is to allow you to interact with the external world. If you wrongly interact with the external world, that can very easily be UB.

Ariel Ben-Yehuda (Dec 05 2018 at 22:29, on Zulip):

(e.g., consider a C library you are using that has some random symbol with weak linkage)

Ariel Ben-Yehuda (Dec 05 2018 at 22:29, on Zulip):

(and expects it to have a particular type/invariants/whatever)

Ariel Ben-Yehuda (Dec 05 2018 at 22:30, on Zulip):

I think the best option is just forbidding all the "FFI attributes" in #[forbid(unsafe_code)]

gnzlbg (Dec 06 2018 at 08:31, on Zulip):

@briansmith we can start by adding a warning like we did for references to packed struct fields

gnzlbg (Dec 06 2018 at 08:32, on Zulip):

and your suggestions in the issue look fine to me too

Last update: Nov 19 2019 at 18:10UTC