Stream: t-lang/wg-unsafe-code-guidelines

Topic: safe low-level string mutations


Tony Arcieri (Nov 14 2018 at 02:20, on Zulip):

Here is a fun case study: https://doc.rust-lang.org/std/primitive.str.html#method.make_ascii_uppercase

Tony Arcieri (Nov 14 2018 at 02:21, on Zulip):

clicking [src] reveals:

Tony Arcieri (Nov 14 2018 at 02:21, on Zulip):

    pub fn make_ascii_uppercase(&mut self) {
        let me = unsafe { self.as_bytes_mut() };
        me.make_ascii_uppercase()
    }

Tony Arcieri (Nov 14 2018 at 02:22, on Zulip):

challenge: provide a safe API that accomplishes what unsafe { self.as_bytes_mut() } does here, but requires no other changes!

Tony Arcieri (Nov 14 2018 at 02:23, on Zulip):

potential inspiration: https://www.reddit.com/r/rust/comments/9wcujb/mutguard_run_code_every_time_data_is_mutably/

Tony Arcieri (Nov 14 2018 at 02:25, on Zulip):

I am wondering about abstractions that temporarily allow data to enter an unsafe state (the String UTF-8 invariant seems like something of an unusual case for unsafe)

Tony Arcieri (Nov 14 2018 at 02:25, on Zulip):

not sure that's actually a good idea

Tony Arcieri (Nov 14 2018 at 02:26, on Zulip):

it would require an unwind-safe invariant check and cleanup strategy that's guaranteed to run 100% of the time barring other unsafe interactions, I think

Tony Arcieri (Nov 14 2018 at 02:27, on Zulip):

you could imagine a "safe" MutGuard permitting as_bytes_mut-style access to the interior of a string that, upon detecting the mutated contents of a string are not valid UTF-8, would e.g. clear the string or otherwise return it to a valid UTF-8 state, and then panic
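
A minimal sketch of such a guard, under assumptions that are not part of the discussion: the name Utf8Guard is hypothetical, and the constructor is marked unsafe because a leaked guard (for example via std::mem::forget) never runs its Drop check, which would leave the String holding invalid UTF-8. On drop it re-validates the bytes, clears the string if they are invalid, and panics unless the thread is already unwinding.

```rust
use std::ops::{Deref, DerefMut};
use std::{str, thread};

/// Hypothetical guard giving byte-level mutable access to a `String`.
pub struct Utf8Guard<'a> {
    s: &'a mut String,
}

impl<'a> Utf8Guard<'a> {
    /// # Safety
    /// The guard must be dropped normally. If it is leaked, the UTF-8
    /// check in `Drop` never runs and `s` may be left invalid.
    pub unsafe fn new(s: &'a mut String) -> Self {
        Utf8Guard { s }
    }
}

impl Deref for Utf8Guard<'_> {
    type Target = [u8];
    fn deref(&self) -> &[u8] {
        self.s.as_bytes()
    }
}

impl DerefMut for Utf8Guard<'_> {
    fn deref_mut(&mut self) -> &mut [u8] {
        // SAFETY: relies on the contract of `new`: `Drop` re-validates
        // the bytes before the `String` is usable as a string again.
        unsafe { self.s.as_mut_str().as_bytes_mut() }
    }
}

impl Drop for Utf8Guard<'_> {
    fn drop(&mut self) {
        if str::from_utf8(self.s.as_bytes()).is_err() {
            // Restore the invariant first, then report the misuse.
            self.s.clear();
            // Panicking while already unwinding would abort the process.
            if !thread::panicking() {
                panic!("Utf8Guard: bytes were not valid UTF-8; string cleared");
            }
        }
    }
}
```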

Tony Arcieri (Nov 14 2018 at 02:29, on Zulip):

the risk to me is a String crossing something like an unwind boundary, getting mutated, and somehow the MutGuard not running, so that its contents are no longer valid UTF-8

Tony Arcieri (Nov 14 2018 at 02:30, on Zulip):

at the very least the MutGuard needs to catch_unwind when exposing the mutable byte slice interior

Tony Arcieri (Nov 14 2018 at 02:31, on Zulip):

I'll say right away a leaky "safe" abstraction is probably worse than just using unsafe

Tony Arcieri (Nov 14 2018 at 02:31, on Zulip):

so the question is can there be a non-leaky MutGuard for enforcing something like the UTF-8 invariant on String interiors

Gankro (Nov 14 2018 at 03:19, on Zulip):

Yeah this is just the thread-scoped problem. Make the user execute inside a closure so you can construct MutGuard without ever giving it to the user, so you can be sure it doesn't get leaked.
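
One way the closure approach could look for the String case, as a sketch rather than a proposed std API (with_bytes_mut is a hypothetical name). It moves the buffer out of the String for the duration of the closure, so the String is empty, and therefore valid, if the closure panics or the bytes fail validation. This needs no unsafe, at the cost of a full UTF-8 validation pass after every call.

```rust
use std::mem;

/// Hypothetical helper: run `f` on the bytes of `s`, then re-validate.
pub fn with_bytes_mut<R>(s: &mut String, f: impl FnOnce(&mut [u8]) -> R) -> R {
    // Move the buffer out; `s` is now an empty, valid String. If `f`
    // panics, `s` simply stays empty and no invariant is broken.
    let mut bytes = mem::take(s).into_bytes();
    let result = f(&mut bytes);
    // `from_utf8` validates in place and reuses the same allocation.
    match String::from_utf8(bytes) {
        Ok(valid) => {
            *s = valid;
            result
        }
        // `s` is still empty here, i.e. the string has been "cleared".
        Err(_) => panic!("closure left invalid UTF-8; string was cleared"),
    }
}
```

With this, with_bytes_mut(&mut s, |b| b.make_ascii_uppercase()) uppercases s in place, while a closure that writes 0xFF into the slice panics and leaves s empty.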

Gankro (Nov 14 2018 at 03:33, on Zulip):

I describe the exact problem here: https://doc.rust-lang.org/nightly/nomicon/leaking.html#threadscopedjoinguard

Gankro (Nov 14 2018 at 03:34, on Zulip):

and you can see the fix here: https://docs.rs/crossbeam/0.3.0/crossbeam/struct.Scope.html#examples

Gankro (Nov 14 2018 at 03:36, on Zulip):

remark that crossbeam::scope requires you to run inside a closure. this is because it basically creates a Vec<JoinGuard> that you can't touch. the scope argument your closure gets is basically &mut Vec<JoinGuard> wrapped up so you can only push
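
The same shape later landed in the standard library as std::thread::scope (Rust 1.63, after this conversation); the sketch below uses it as a stand-in for crossbeam's scope. sum_halves is a hypothetical example function. The handles never escape the closure, and every spawned thread is joined before scope returns, which is what makes borrowing from the caller's stack sound.

```rust
use std::thread;

/// Hypothetical example: sum a slice using two scoped threads that
/// borrow directly from the caller's data.
pub fn sum_halves(data: &[u32]) -> u32 {
    let (left, right) = data.split_at(data.len() / 2);
    thread::scope(|scope| {
        // `scope` only lets us spawn; all threads are joined before
        // `thread::scope` returns, even if we forget to join them.
        let a = scope.spawn(|| left.iter().sum::<u32>());
        let b = scope.spawn(|| right.iter().sum::<u32>());
        a.join().unwrap() + b.join().unwrap()
    })
}
```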

Tony Arcieri (Nov 14 2018 at 04:37, on Zulip):

fun reading there @Gankro

nikomatsakis (Nov 14 2018 at 21:10, on Zulip):

I've lately been thinking about this problem

nikomatsakis (Nov 14 2018 at 21:10, on Zulip):

e.g., there is an unsafe composition danger even in the closure solution

nikomatsakis (Nov 14 2018 at 21:11, on Zulip):

described in this blog post of mine on Observational Equivalence

nikomatsakis (Nov 14 2018 at 21:11, on Zulip):

the tl;dr is that a coroutine library could 'leak' your stack frames

nikomatsakis (Nov 14 2018 at 21:11, on Zulip):

so basically rayon + crossbeam are relying on stack frames being destructed

nikomatsakis (Nov 14 2018 at 21:12, on Zulip):

this is kind of the same problem as the &mut borrow problem that I talk about in my most recent blog post on the sentinel pattern

nikomatsakis (Nov 14 2018 at 21:12, on Zulip):

in particular, if you were to allow a move out from an &mut,

nikomatsakis (Nov 14 2018 at 21:13, on Zulip):

the danger is that -- in the event of panic -- that &mut may be referencing some value V where the owner will either:

nikomatsakis (Nov 14 2018 at 21:13, on Zulip):

the goal with the take_mut crate is to make those things impossible
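
A sketch of the general technique for moving out of an &mut: abort the process if the closure panics, so that nobody can ever observe the moved-from slot. This illustrates the idea under that assumption and is not the take_mut crate's source; take_or_abort and AbortOnPanic are hypothetical names.

```rust
use std::{mem, process, ptr};

/// Aborts the process if dropped; disarmed with `mem::forget` on success.
struct AbortOnPanic;

impl Drop for AbortOnPanic {
    fn drop(&mut self) {
        process::abort();
    }
}

/// Hypothetical helper: replace `*slot` with `f(old value)`.
pub fn take_or_abort<T>(slot: &mut T, f: impl FnOnce(T) -> T) {
    let bomb = AbortOnPanic;
    // SAFETY: `slot` is logically uninitialized between `read` and
    // `write`. If `f` panics in that window, `bomb` is dropped during
    // unwinding and aborts, so the owner never sees the stale value.
    unsafe {
        let old = ptr::read(slot);
        let new = f(old);
        ptr::write(slot, new);
    }
    // Success: disarm the abort guard.
    mem::forget(bomb);
}
```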

nikomatsakis (Nov 14 2018 at 21:13, on Zulip):

but it's again assuming some kind of "absolute owner" -- the process

nikomatsakis (Nov 14 2018 at 21:14, on Zulip):

I think the debate about whether one could reasonably have an &mut that references things outside the process basically comes down to the same question in a way: how "high up" in the ownership hierarchy can we reasonably assert control from inside our rust program?

nikomatsakis (Nov 14 2018 at 21:15, on Zulip):

in short, if you want to ensure that some event X occurs, you need to have some kind of "ownership" over the "lifecycle" of all the things that lead up to X

nikomatsakis (Nov 14 2018 at 21:15, on Zulip):

anyway, I'm not 100% sure where these thoughts are going, but there is something tantalizing there

Tony Arcieri (Nov 14 2018 at 23:24, on Zulip):

yeah, wasn't actually recommending the closure solution, just wondering if it could be done in a non-leaky way. and it sounds like that isn't the case

RalfJ (Nov 15 2018 at 07:40, on Zulip):

but it's again assuming some kind of "absolute owner" -- the process

Very loosely related: We currently actually have a safe function (on unix) that makes it pretty much impossible to use even ownership (not to speak of borrows) for things outside the current process: Thanks to before_exec, we have fork in safe code. See #39575.

RalfJ (Nov 15 2018 at 07:41, on Zulip):

I am trying to convince alex to make it unsafe, so far unsuccessfully^^

nikomatsakis (Nov 15 2018 at 17:17, on Zulip):

@RalfJ interesting. Ugh.

nikomatsakis (Nov 15 2018 at 17:23, on Zulip):

@RalfJ seems like we should open some kind of issue about this in the UCG repository — I feel like this topic of "cross-process ownership and references" is worth talking out

nikomatsakis (Nov 15 2018 at 17:23, on Zulip):

thoughts?

RalfJ (Nov 15 2018 at 17:28, on Zulip):

Sure. I explained my position in that issue already (maybe a bit too vehemently^^)

Jake Goulding (Nov 16 2018 at 03:02, on Zulip):

cross-process ownership and references

Sounds relevant to the mmap family of things, as well.

Gankro (Nov 16 2018 at 13:23, on Zulip):

@Jake Goulding mmap very bluntly says "this is unsafe, it can't be made safe" and leaves it at that

Gankro (Nov 16 2018 at 13:23, on Zulip):

because anyone can truncate the file being pointed into, and there's literally no non-racy way to detect it

Jake Goulding (Nov 16 2018 at 14:28, on Zulip):

right, which is why "cross-process ownership" seemed relevant, but perhaps it's still too broad

Last update: Nov 19 2019 at 17:35 UTC