Stream: t-lang

Topic: missing safe abstractions


nikomatsakis (Apr 17 2020 at 19:32, on Zulip):

@Shnatsel I'd be very interested in doing a kind of "review" of the patterns you're seeing that people have to "roll their own"

Shnatsel (Apr 17 2020 at 19:35, on Zulip):

Cool, I guess I need to sit down and make a list then! Or better, enable other people to make it themselves.
One was coming up so frequently I made an RFC to fix it: https://github.com/rust-lang/rfcs/pull/2714

nikomatsakis (Apr 17 2020 at 19:39, on Zulip):

oh dear god I remember that now

nikomatsakis (Apr 17 2020 at 19:40, on Zulip):

I'd like to see that move forward.

nikomatsakis (Apr 17 2020 at 19:40, on Zulip):

interesting.

nikomatsakis (Apr 17 2020 at 19:40, on Zulip):

I was just saying to @Josh Triplett randomly that I think there is a lot of value in enabling people to write "things that used to need unsafe" but without unsafe

nikomatsakis (Apr 17 2020 at 19:41, on Zulip):

this is a point that @Ryan Levick impressed upon me when we were discussing #project-safe-transmute

nikomatsakis (Apr 17 2020 at 19:41, on Zulip):

but I think that this is going to ultimately mean 80% abstractions that can't cover all the cases

nikomatsakis (Apr 17 2020 at 19:42, on Zulip):

I guess that this method fits in that category

Shnatsel (Apr 17 2020 at 19:43, on Zulip):

Yeah, we can't cover all the cases but we can gradually cover more and more of them. from_le_bytes and friends on integers made a huge difference, as did TryFrom stabilization because it enabled feeding subslices to from_*_bytes

Josh Triplett (Apr 17 2020 at 19:43, on Zulip):

Yeah. Though that reminds me...

Josh Triplett (Apr 17 2020 at 19:44, on Zulip):

I would really love to have a trait for to_le_bytes.

Shnatsel (Apr 17 2020 at 19:44, on Zulip):

Another thing that's actually landing in 1.43 and came out of safety-dance is creating a CString from a Vec<NonZeroU8> without scanning the data twice. This enables constructing a CString from a reader without resorting to unsafe to avoid the rescan.

Josh Triplett (Apr 17 2020 at 19:44, on Zulip):

Nice!

Josh Triplett (Apr 17 2020 at 19:44, on Zulip):

The first scan can transmute to note that it's non-zero?

Shnatsel (Apr 17 2020 at 19:46, on Zulip):

Basically you read bytes from a reader until you reach 0, and since you're checking the bytes one by one anyway you might as well construct a NonZeroU8 out of them right away as you put them in a Vec. Then you can get a CString from that Vec without a second scan.
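
A minimal sketch of the pattern described above, assuming a Rust 1.43+ toolchain where `CString: From<Vec<NonZeroU8>>` is available. The helper name `read_cstring` is hypothetical and only for illustration; it stops at the first zero byte or at end of input.

```rust
use std::ffi::CString;
use std::io::{self, Read};
use std::num::NonZeroU8;

/// Hypothetical helper: reads bytes up to (not including) the first 0 byte,
/// or until end of input, and builds a CString without a second scan.
fn read_cstring<R: Read>(reader: R) -> io::Result<CString> {
    let mut bytes: Vec<NonZeroU8> = Vec::new();
    for byte in reader.bytes() {
        // `NonZeroU8::new` returns `None` exactly when the byte is 0,
        // so the terminator check and the type-level proof are one step.
        match NonZeroU8::new(byte?) {
            Some(b) => bytes.push(b),
            None => break,
        }
    }
    // Safe conversion: the element type already rules out interior NULs.
    Ok(CString::from(bytes))
}
```

Calling `read_cstring(&b"hello\0world"[..])` should yield a `CString` holding `hello`. For a real reader you would probably wrap it in a `BufReader` first, since `bytes()` issues one read per byte.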

Josh Triplett (Apr 17 2020 at 19:46, on Zulip):

Given const generics, how hard would it be to specify a trait ToEndianBytes that provides to_le_bytes for a type, returning a slice of u8 where the size of the slice depends on the type but it's known to be a slice of u8?

Josh Triplett (Apr 17 2020 at 19:46, on Zulip):

Because then I could rewrite my current macro write_le! as a function rather than a macro.

Josh Triplett (Apr 17 2020 at 19:47, on Zulip):

I don't want to write out w.write_all(&(expr).to_le_bytes())?; hundreds of times, so I have a macro wrapping that: write_le!(w, expr)?;.
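
One possible shape for such a trait, as a rough sketch only. The names `ToLeBytes`, `to_le_bytes_generic` and `write_le` are hypothetical and not part of the standard library, and this version uses an associated type for the output array rather than a const generic parameter on the trait.

```rust
use std::io::{self, Write};

/// Hypothetical trait; nothing like this exists in std.
trait ToLeBytes {
    /// A fixed-size byte array whose length depends on the implementing type.
    type Bytes: AsRef<[u8]>;
    fn to_le_bytes_generic(self) -> Self::Bytes;
}

// Forward to the inherent `to_le_bytes` of each integer type.
macro_rules! impl_to_le_bytes {
    ($($t:ty),*) => {$(
        impl ToLeBytes for $t {
            type Bytes = [u8; std::mem::size_of::<$t>()];
            fn to_le_bytes_generic(self) -> Self::Bytes {
                self.to_le_bytes()
            }
        }
    )*};
}
impl_to_le_bytes!(u8, u16, u32, u64, i8, i16, i32, i64);

/// A function standing in for the `write_le!` macro.
fn write_le<W: Write, T: ToLeBytes>(w: &mut W, value: T) -> io::Result<()> {
    w.write_all(value.to_le_bytes_generic().as_ref())
}
```

With this in place, `write_le(&mut w, expr)?;` would replace `write_le!(w, expr)?;`. The macro is still needed to write the impls once, but call sites become ordinary function calls.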

Shnatsel (Apr 17 2020 at 19:50, on Zulip):

The latest thing I looked at was rand. I've put an assert on lengths before a hot loop so that the compiler would optimize away the bounds checks and get_unchecked() would not be needed. This seems to be a very valuable but little-known trick. I think we should document it somewhere
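
A minimal sketch of the trick, not the actual rand code: the function name `add_assign` is made up for illustration. Whether the per-iteration bounds checks are really removed depends on the optimizer and is not guaranteed by the language, so it is worth checking the generated assembly or a benchmark.

```rust
/// Hypothetical example: element-wise wrapping add of `src` into `dst`.
fn add_assign(dst: &mut [u32], src: &[u32]) {
    // One up-front check. After this the compiler knows both lengths are
    // equal, so `src[i]` is provably in bounds whenever `dst[i]` is.
    assert_eq!(dst.len(), src.len());
    for i in 0..dst.len() {
        // Plain safe indexing; no get_unchecked() needed.
        dst[i] = dst[i].wrapping_add(src[i]);
    }
}
```

For a simple loop like this one, iterating with `zip` would avoid indexing altogether; the assert is most useful when the index arithmetic is too irregular for iterators.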

Josh Triplett (Apr 17 2020 at 19:51, on Zulip):

Nice!

Josh Triplett (Apr 17 2020 at 19:51, on Zulip):

Where did the value come from right before the loop, that it had a bound but the compiler didn't know that?

Shnatsel (Apr 17 2020 at 19:52, on Zulip):

A surprisingly unsafe-heavy crate is log. It tries to have an atomic integer that actually represents an enum, and conversions to and from that enum are necessarily unsafe and unchecked because they're on the hot path for logging, and they don't want to have unwinding code in there. I think it also hand-rolls a no_std variant of lazy_static inside the crate.
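
For illustration, here is one safe way to store an enum in an atomic integer. This is a sketch under assumptions, not the code in log: the `Level` enum, `MAX_LEVEL`, and the function names are hypothetical, and it assumes that mapping any out-of-range value to a fallback variant is acceptable. Because the `match` is total, it contains no panic and no unwinding path.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical level enum, stored as its discriminant.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(usize)]
enum Level {
    Off = 0,
    Error = 1,
    Warn = 2,
    Info = 3,
}

impl Level {
    /// Total conversion: unknown values fall back to `Off` instead of panicking.
    fn from_usize(raw: usize) -> Level {
        match raw {
            1 => Level::Error,
            2 => Level::Warn,
            3 => Level::Info,
            _ => Level::Off,
        }
    }
}

static MAX_LEVEL: AtomicUsize = AtomicUsize::new(Level::Off as usize);

fn set_max_level(level: Level) {
    MAX_LEVEL.store(level as usize, Ordering::Relaxed);
}

fn max_level() -> Level {
    Level::from_usize(MAX_LEVEL.load(Ordering::Relaxed))
}
```

Whether this compiles down to something as cheap as an unchecked transmute would need to be measured on the hot path in question.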

Shnatsel (Apr 17 2020 at 19:52, on Zulip):

@Josh Triplett here's the PR: https://github.com/rust-random/rand/pull/960

Josh Triplett (Apr 17 2020 at 19:54, on Zulip):

Interesting. So, I get why dd is less than 512 (the compiler should really know that the result of % 512 will be less than 512), but what makes cc + 15 < 512? How does the code know that?

Josh Triplett (Apr 17 2020 at 19:54, on Zulip):

Oh, I see the assertion right above that, never mind.

Josh Triplett (Apr 17 2020 at 19:55, on Zulip):

I wonder how hard it would be to get the compiler to do some numerical reasoning like that?

Josh Triplett (Apr 17 2020 at 19:55, on Zulip):

GCC is capable of some amazing reasoning about bit patterns in a value; you can hand-write things like byte swaps and rotates, and they'll turn into the corresponding instruction.
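
To make "hand-written byte swaps and rotates" concrete, here is a small Rust sketch with made-up function names. Optimizing compilers commonly recognize patterns like these and emit a single instruction, but that is an optimization and not a guarantee; in Rust the built-in `rotate_left` and `swap_bytes` methods express the same operations directly.

```rust
/// Hand-written 32-bit rotate left; equivalent to `x.rotate_left(n)`.
fn rotl_by_hand(x: u32, n: u32) -> u32 {
    let n = n % 32;
    // The second `% 32` keeps the shift amount in range when n == 0.
    (x << n) | (x >> ((32 - n) % 32))
}

/// Hand-written 32-bit byte swap; equivalent to `x.swap_bytes()`.
fn bswap_by_hand(x: u32) -> u32 {
    (x << 24) | ((x & 0xff00) << 8) | ((x >> 8) & 0xff00) | (x >> 24)
}
```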

Shnatsel (Apr 17 2020 at 20:04, on Zulip):

I probably should dust off https://github.com/rust-lang/rfcs/pull/2714, I have an implementation of the current proposed design handy but I'm not sure it's the best design anymore. It's the easiest one to explain, but it doesn't really solve the problem in the fastest possible way, so people might still want to hand-roll a custom implementation. And the fastest possible version is harder to explain, even though it's more general

Shnatsel (Apr 17 2020 at 20:13, on Zulip):

@nikomatsakis I'll look through safety-dance crates and make an issue on a dedicated repo for every currently unavoidable unsafe I find, then send you a link. That way we can have a clear set of problems and it would make sense to start brainstorming solutions. How's that for a plan?

Shnatsel (Apr 17 2020 at 20:16, on Zulip):

Also I've been replacing a ton of ad-hoc unsafe with variants of u16::from_le_bytes([..2].try_into().unwrap()) which is fine, but counter-intuitive. I'm told that const generics might help with the .try_into().unwrap() part.
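
A minimal sketch of that pattern wrapped in a helper; `read_u16_le` is a hypothetical name, and this variant returns `None` on short input instead of calling `unwrap()`.

```rust
// Needed on editions before 2021; later editions have TryInto in the prelude.
use std::convert::TryInto;

/// Hypothetical helper: decodes a little-endian u16 from the first two bytes.
fn read_u16_le(data: &[u8]) -> Option<u16> {
    // `get(..2)` yields a 2-byte subslice (or None if the input is too short);
    // `try_into` turns that `&[u8]` into a `[u8; 2]`.
    let bytes: [u8; 2] = data.get(..2)?.try_into().ok()?;
    Some(u16::from_le_bytes(bytes))
}
```

The `try_into()` cannot actually fail here, because the subslice always has length 2, but the type system does not know that, which is the counter-intuitive part.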

nikomatsakis (Apr 18 2020 at 10:56, on Zulip):

@Shnatsel that sounds like a good plan

Shnatsel (Apr 18 2020 at 16:22, on Zulip):

Regarding bounds checks: a quick search through crates.io shows that there are currently 10670 invocations of get_unchecked (incl. _mut variant) across 864 crates, and the median number of uses per crate is 3.

nikomatsakis (Apr 20 2020 at 20:02, on Zulip):

Wow, that's kind of astounding.

nikomatsakis (Apr 20 2020 at 20:02, on Zulip):

How many crates use it?

Lokathor (Apr 20 2020 at 23:39, on Zulip):

864 crates

nikomatsakis (Apr 21 2020 at 16:49, on Zulip):

oh heh it's... right there in the message

nikomatsakis (Apr 21 2020 at 16:50, on Zulip):

OK, so that is "3 uses per crate that uses get_unchecked", not 3 uses per crate on crates.io or something...

Charles Lew (Apr 22 2020 at 12:05, on Zulip):

About safe abstractions: someone made a nice list (though a little outdated, since it's from '18) here:
https://users.rust-lang.org/t/list-of-crates-that-improves-or-experiments-with-rust-but-may-be-hard-to-find/17806

Last update: Jun 05 2020 at 23:20UTC