Stream: project-deref-patterns

Topic: Deref Patterns, stdlib type list


Connor Horman (Mar 10 2021 at 14:22, on Zulip):

I would like to start the discussion about supported stdlib types for deref patterns.
This needs two lists, a list of requirements for stdlib types to satisfy, and a list of standard library types that do satisfy these requirements.

For the first list, in order to be sound, the deref implementation (and deref_mut) needs to be idempotent. Does it also need to be completely pure (IE., no calls to user-provided code whatsoever), or can we accept a single initial non-pure operation. If the former is the case, then this will disqualify Lazy and SyncLazy from deref patterns, as they are designed to call a user-provided function they first time they are dereferenced. Also, what exactly does idempotent mean, and what operations would reset it?

Connor Horman (Mar 10 2021 at 14:30, on Zulip):

For the second list, it may be reasonable to accept most of the types in the Standard Library that Derefs. The definate exception would be Pin, which can and should participate only if it's type parameter is a participating type. Other possible exceptions are Lazy and SyncLazy due to the fact that they may be initially impure, Cow and AssertUnwindSafe, since these can also be destructured (which is another interesting question), and VaList, which I would argue is useless for DerefPatterns, since you can't really structurally-match VaListImpl.
For the record, the current exhaustive (I believe, please let me know if I'm missing any) list of potential candidates (from https://doc.rust-lang.org/nightly/std/ops/trait.Deref.html#implementors), can be categorized as follows:

Connor Horman (Mar 10 2021 at 14:34, on Zulip):

In my opinion, the minimum I would want would be the Smart Pointer Types, String, and Pin<P> of such a type and of &T and &mut T.
The language reference types can probably be excluded until support for user-defined types is being considered (since presumably they would fall under reference patterns), though if we use a different syntax for deref patterns, it could be reasonable to allow them (it would also keep implementation options open, and not require the implementation to special-special case Pin<&T> or Pin<&mut T> for these).

oliver (Mar 10 2021 at 14:51, on Zulip):

I thought that Lazy and SyncLazy aren't eligible since they can panic, and not Cow since it can clone

Joshua Nelson (Mar 10 2021 at 14:53, on Zulip):

Match arms can already panic in the if-guard

Joshua Nelson (Mar 10 2021 at 14:53, on Zulip):

I don't see what Cow being clonable has to do with it, it doesn't clone in the Deref impl

oliver (Mar 10 2021 at 14:53, on Zulip):

if it's a user-defined clone function

oliver (Mar 10 2021 at 14:55, on Zulip):

oh I see that Cow can qualify for deref patterns, provided a DerefMut impl is not added

oliver (Mar 10 2021 at 14:55, on Zulip):

https://hackmd.io/GBTt4ptjTh219SBhDCPO4A#Possible-solutions

Connor Horman (Mar 10 2021 at 15:59, on Zulip):

oliver said:

I thought that Lazy and SyncLazy aren't eligible since they can panic, and not Cow since it can clone

The potential issue with Lazy and SyncLazy is the fact they are impure the first time they are deferenced.

The issue with Cow is the fact it can also be destructured, and if we supported it, we'd have to decide how to deal with mixing both deref and structural matches.

Connor Horman (Mar 10 2021 at 15:59, on Zulip):

(The issue with Cow is also shared with AssertUnwindSafe)

bstrie (Mar 10 2021 at 17:32, on Zulip):

The potential issue with Lazy and SyncLazy is the fact they are impure the first time they are deferenced.

Is there a concrete example of how this could go wrong? The idempotency here seems pretty solid. Furthermore, assuming Lazy makes it into std, there would be no need to generalize this sort of "you can be impure once" exception to user-defined smart pointers, at least not at first. (I understand the desire to not let std "cheat" by doing things that user code can't do, but there's plenty of precedence (TrustedFoo traits), and furthermore Lazy seems like the only idempotent pointer of this kind that anybody would ever want to make?)

Connor Horman (Mar 10 2021 at 18:01, on Zulip):

bstrie said:

The potential issue with Lazy and SyncLazy is the fact they are impure the first time they are deferenced.

Is there a concrete example of how this could go wrong? The idempotency here seems pretty solid. Furthermore, assuming Lazy makes it into std, there would be no need to generalize this sort of "you can be impure once" exception to user-defined smart pointers, at least not at first. (I understand the desire to not let std "cheat" by doing things that user code can't do, but there's plenty of precedence (TrustedFoo traits), and furthermore Lazy seems like the only idempotent pointer of this kind that anybody would ever want to make?)

One of the reasons that only stdlib types are being persued is because lang-team doesn't necessarily want to introduce impure operations in pattern matching, yet <https://github.com/rust-lang/lang-team/pull/78#issuecomment-780631891>.
Perhaps it is reasonable to special case Lazy and SyncLazy, but it also may not be; if we want an answer to that, we'd need to ask T-lang, probably at a design meeting in the future.

Josh Triplett (Mar 10 2021 at 19:44, on Zulip):

@Connor Horman I was in the meeting that summary came from. I think for both that meeting and that summary, we weren't thinking about types like Lazy, so we were equating "not pure" with "potentially not idempotent". A type like Lazy that runs user code once but then never again, and provides idempotence atop that, seems potentially reasonable.

Josh Triplett (Mar 10 2021 at 19:45, on Zulip):

It's possible that some members of the team may object to code running at pattern-match time for other reasons, but I'm not aware of those reasons.

scottmcm (Mar 10 2021 at 20:44, on Zulip):

Yeah, it's possible that "stable" would be a better word than "pure", like how it's used in http://kimundi.github.io/owning-ref-rs/owning_ref/trait.CloneStableAddress.html

scottmcm (Mar 10 2021 at 20:45, on Zulip):

The one thing that comes to mind for purity is if idempotency is still observable

scottmcm (Mar 10 2021 at 20:46, on Zulip):

Like whether { A(x) if foo(x) => ..., A(deref x) => ... } is allowed to potentially call deref even if the foo predicate is true.

scottmcm (Mar 10 2021 at 20:47, on Zulip):

(But I haven't thought through whether that'd be concerning, nor whether it'd be an implementation concern.)

Joshua Nelson (Mar 10 2021 at 21:08, on Zulip):

I thought match arms were guaranteed to be tried in order? Since the difference is observable due to if-guards?

scottmcm (Mar 10 2021 at 22:38, on Zulip):

Only the if bits need to be in order, though -- it could load all the discriminants and such eagerly today, then do the guard predicates.

(I don't know if the implementation actually takes advantage of that flexibility, though)

Connor Horman (Mar 10 2021 at 22:39, on Zulip):

Josh Triplett said:

Connor Horman I was in the meeting that summary came from. I think for both that meeting and that summary, we weren't thinking about types like Lazy, so we were equating "not pure" with "potentially not idempotent". A type like Lazy that runs user code once but then never again, and provides idempotence atop that, seems potentially reasonable.

Makes sense. So then it wouldn't have an issue from a T-lang perspective?
I still wouldn't necessarily put it on the "No issues at all" list, because even being initially impure could cause issues, and limit certain optimizations.

Connor Horman (Mar 10 2021 at 22:49, on Zulip):

scottmcm said:

Only the if bits need to be in order, though -- it could load all the discriminants and such eagerly today, then do the guard predicates.

(I don't know if the implementation actually takes advantage of that flexibility, though)

It could. Another optimization that's been comtemplated was reducing the number of deref calls (in theory, you could reduce it to merely 0 or 1 deref calls, and some number of deref_mut calls).

Josh Triplett (Mar 11 2021 at 05:07, on Zulip):

@Connor Horman We'd appreciate an assessment of potential practical issues; I'm just suggesting that we're not against it on principle.

simulacrum (Mar 11 2021 at 12:27, on Zulip):

I would also add that I think it would be completely reasonable to pick 1-2 "easy" types, document the restrictions required for the initial design (and ideally how they're compatible with extensions, but this mostly is just to make sure we're not boxing ourselves in unintentionally) - I don't think the initial rfc needs to specify an exhaustive list, though having a set of criteria that can be applied to an arbitrary type may be useful.

For example, I might exclude those types that have user defined code completely for now, and just restrict ourselves to Arc/Rc/Box/String, for example, which all feel much simpler to design a system for over, for example, Lazy.

Connor Horman (Mar 12 2021 at 14:23, on Zulip):

Josh Triplett said:

Connor Horman We'd appreciate an assessment of potential practical issues; I'm just suggesting that we're not against it on principle.

Indeed. And if the project does end up considering it's viability, presumably we'd have an assesment to present, regardless of whether or not it is proposed to be included or not.

simulacrum said:

I would also add that I think it would be completely reasonable to pick 1-2 "easy" types, document the restrictions required for the initial design (and ideally how they're compatible with extensions, but this mostly is just to make sure we're not boxing ourselves in unintentionally) - I don't think the initial rfc needs to specify an exhaustive list, though having a set of criteria that can be applied to an arbitrary type may be useful.

For example, I might exclude those types that have user defined code completely for now, and just restrict ourselves to Arc/Rc/Box/String, for example, which all feel much simpler to design a system for over, for example, Lazy.

That is an option, and a potentially reasonable one. And even if we don't necessarily limit it to that extent, the project could potentially even avoid consideration of Lazy at this time on the basis of it being unstable (dependant on any known timelines for it's stablization).

Connor Horman (Mar 12 2021 at 14:26, on Zulip):

In any case, it may be a good idea to decide whether we want to set the types first, then work out the requirements; set the requirements, then work out viable types; or some combination (such as deciding on a set of "must support" types, working out requirements that don't exclude those, then deciding the rest of the types).

Mario Carneiro (Mar 12 2021 at 15:15, on Zulip):

For the unstable version, why not just support the widest conveniently-supportable set of types? For example everything with a deref impl. This whitelist approach makes sense for the stable version but for unstable it seems like having the feature implemented so people can play with it is more important than restricting it where we aren't sure about what we want the semantics to be

Mario Carneiro (Mar 12 2021 at 15:17, on Zulip):

(If on the other hand for some reason it's easier to support a small list of types rather than all types satisfying some trait or set of constraints, then a whitelist would make sense since it's just an MVP anyway)

Connor Horman (Mar 12 2021 at 18:07, on Zulip):

If you are referring to all types with a deref impl in the standard library, that might be a good way forward. Evaluate the requirements in the wild rather than through theoretical analysis.
If you are referring to all types period, then T-lang has expressed that they do not want to persue that at this time with this project (see relevant discussion on the charter pr, <https://github.com/rust-lang/lang-team/pull/78>).
One of the primary issues is that, if deref patterns are treated as exhaustive, then an impure deref impl can be unsound, as you can write a safe implementation of Deref that returns a different value each time, and that misses the match arms.

Mario Carneiro (Mar 12 2021 at 18:12, on Zulip):

Well I doubt such a thing would be accepted by the compiler anyway, because it has to be desugared somehow

Mario Carneiro (Mar 12 2021 at 18:24, on Zulip):

At the implementation level, it is now a question of whether the desugaring attempts to deduplicate deref calls or not.

  1. If we do deduplicate deref calls, then it has a chance to support exhaustive matching without any regard for idempotence of deref, although that might make it easier to explain to folks for the stable version.
  2. If not, then there is the question of what to do with the leftover matches; if you use an unreachable_unchecked there then yes that is a soundness concern, for user types but also for things like Arc, where we are now relying on a very particular property about the deref call and its interaction with other code that is not represented in any unsafe block.
  3. If you use a panic!("match failed"), then it should be memory safe, and this is probably the easiest thing to implement, although people will definitely want control over what happens in this branch eventually.
Mario Carneiro (Mar 12 2021 at 18:33, on Zulip):

From https://github.com/rust-lang/lang-team/pull/78#issuecomment-780631891 it looks like the idea is to do (2), with the soundness issue being resolved by the whitelist, or possibly DerefPure. Although it looks like DerefPure is being deferred, I think that it makes the design of DerefPure somewhat easy: DerefPure is an unsafe marker trait whose safety condition is "everything needed for the implementation of (2) to be sound", which boils down to some kind of idempotence.

Connor Horman (Mar 14 2021 at 13:11, on Zulip):

Wrt. (1), that is something I'd like to enable, but not necessarily required for exhaustiveness as it's impossible to totally deduplicate it, see this code

let mut x = ManuallyDrop::new(Some(5));
match x{
    &mut Some(3) => {}
    v if v.is_none() => {}
    &Some(5) => {}
    _ => {}
}

That requires 2 deref calls, (three if you count the if guard).

I'd also argue that deref patterns aren't really syntax sugar over an existing construct; they could be closely approximated with, but not actually desugared to, if let guards.
Otherwise, you are right, DerefPure would need to restrict implementors to what makes (2) sound, and possibly more to allow the freedom to deduplicate deref calls. However, as this project is not pursuing DerefPure, this is somewhat moot (though partially relevant to constructing a set of rules for stdlib types.

Dirkjan Ochtman (Mar 15 2021 at 21:04, on Zulip):

In my opinion, the minimum I would want would be the Smart Pointer Types, String, and Pin<P> of such a type and of &T and &mut T.

IMO Vec<T> to [T] is also one of the very important ones.

Connor Horman (Mar 15 2021 at 21:13, on Zulip):

Ah, right, slice patterns work, don't they?

Plecra (Mar 16 2021 at 12:31, on Zulip):

Connor Horman said:

Wrt. (1), that is something I'd like to enable, but not necessarily required for exhaustiveness as it's impossible to totally deduplicate it, see this code

let mut x = ManuallyDrop::new(Some(5));
match x{
    &mut Some(3) => {}
    v if v.is_none() => {}
    &Some(5) => {}
    _ => {}
}

That requires 2 deref calls, (three if you count the if guard).

I'd also argue that deref patterns aren't really syntax sugar over an existing construct; they could be closely approximated with, but not actually desugared to, if let guards.
Otherwise, you are right, DerefPure would need to restrict implementors to what makes (2) sound, and possibly more to allow the freedom to deduplicate deref calls. However, as this project is not pursuing DerefPure, this is somewhat moot (though partially relevant to constructing a set of rules for stdlib types.

Why would this need more than one deref_mut call?

let mut x = ManuallyDrop::new(Some(5));
let mut v = x.deref_mut();
if let &mut Some(3) = v {
} else if v.is_none() {
} else if let &Some(5) = &*v {
} else {}
Plecra (Mar 16 2021 at 12:38, on Zulip):

This confused me in the tracking document too

this case could be implemented as a single deref_mut, then reborrowing. However, this would not work in a case like this:

match &mut x {
    Some(deref v@0..3) => ..., // (1)
    Some(deref mut v@4..7) => ..., // (2)
    v@&None => ..., // (3)
    Some(deref v@8..11) => ..., // (4)
}

Why wouldn't this just be

match v.deref_mut() {
    Some(v @ 0..=3) => {},
    Some(v @ 4..=7) => {},
    v @ None => {},
    Some(v @ 8..=11) => {},
    Some(_) => {} // the original code wasn't exhaustive
}
Plecra (Mar 16 2021 at 12:42, on Zulip):

It seems to me like deref deduplication is easily the best option as long as its viable, so I'm probably missing something important...

Plecra (Mar 16 2021 at 13:13, on Zulip):

This playground demonstrates how I'd expect the examples I've seen so far to work: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=2f5218def15c680a87d5513121befaf7

They're all simple desugarings to existing Rust code that's already exhaustive

Connor Horman (Mar 16 2021 at 13:34, on Zulip):

Plecra said:

This confused me in the tracking document too

this case could be implemented as a single deref_mut, then reborrowing. However, this would not work in a case like this:

match &mut x {
    Some(deref v@0..3) => ..., // (1)
    Some(deref mut v@4..7) => ..., // (2)
    v@&None => ..., // (3)
    Some(deref v@8..11) => ..., // (4)
}

Why wouldn't this just be

match v.deref_mut() {
    Some(v @ 0..=3) => {},
    Some(v @ 4..=7) => {},
    v @ None => {},
    Some(v @ 8..=11) => {},
    Some(_) => {} // the original code wasn't exhaustive
}

The deref type is inside the Option (this can be shown easily by the fact that the deref is indicated inside of Some).
The reason why the deref_mut can't be preemptively used and kept arround for the remaining match arms, is because v@None requires matching the option, thus borrowing it.
This has a number of issues, but primarily, if deref_mut returns an interior pointer (such as the implementation for ManuallyDrop or AssertUnwindSafe), according to Stacked Borrows, the Shared borrow of the Option would invalidate the mutable borrow of that interior pointer (and any pointers derived therefrom). While Stacked Borrows is not normative at this time, my preference would be that Deref Patterns not undermine it. Your desguaring would be correct if the Option was inside the Deref Type (IE. Box<Option> rather than Option<Box>).

Plecra (Mar 16 2021 at 13:38, on Zulip):

Haha, yea I did stumble a bit with grokking that example - the example I gave in the playground with the types the right way around looked like this:

let mut x = Some(core::mem::ManuallyDrop::new(5));
match &mut x {
    Some(ptr) => match ptr.deref_mut() {
        v @ &mut (0..=3) => {
            let v = &*v;
        }
        v @ &mut (4..=7) => {}
        v @ &mut (8..=11) => {
            let v = &*v;
        }
        _ => {}
    },
    v @ None => {
        let v = &*v;
    }
}
Plecra (Mar 16 2021 at 13:38, on Zulip):

(which again, compiles now)

Plecra (Mar 16 2021 at 13:39, on Zulip):

And this does logically reorder the match, which I've been thinking through the consequences of

Connor Horman (Mar 16 2021 at 13:39, on Zulip):

The issue is that you've now changed the order of the match arms (which matters if the arm has an if guard).

Plecra (Mar 16 2021 at 13:40, on Zulip):

Would you mind giving an example of when it would change the behaviour of a guard?

Plecra (Mar 16 2021 at 13:40, on Zulip):

I've been trying to come up with one and my minds blank :P

Plecra (Mar 16 2021 at 13:41, on Zulip):

afaict, the branches that this would "reorder" would be unreachable anyway

Connor Horman (Mar 16 2021 at 13:44, on Zulip):

if guards can have side effects. This particular example wouldn't work, but it's trivial to come up with a case for something like Cow, which can both the dereferenced and destructured.

pub fn check_and_print(x: i32,y: i32) -> bool{
    println!("{}",x);
    x < y
}

let mut c = Cow::Owned(5);
match &mut c{
    x @ deref mut 0..5 => println!("{} in [0,5)",x),
    Cow::Owned(x) if check_and_print(x,5) => println!("Owned {}<5",x),
   x @ deref mut 5..10 if check_and_print(x,7) => println!("{} in [5,10)",x),
    _ => println!("Something else")
}
Connor Horman (Mar 16 2021 at 13:46, on Zulip):

You can't reorder any of those patterns without potentially changing either the side effects in the if guards, or the outcome of the match.

Connor Horman (Mar 16 2021 at 13:49, on Zulip):

(Though actually, you can't deref_mut Cow, but the same argument applies in inverse, since the Cow::Owned pattern mutably borrows x and thus reborrows the entire Cow as mutable, potentially invalidating the deref calls).

Plecra (Mar 16 2021 at 13:54, on Zulip):

So does that mean this only applies to enums that we want to destructure and deref in the same match?

Plecra (Mar 16 2021 at 13:57, on Zulip):

hm.. no I don't think so

Plecra (Mar 16 2021 at 13:57, on Zulip):

this affects all simpler situations where you want to call a method on the pointer type too

Plecra (Mar 16 2021 at 13:58, on Zulip):

Thanks :) That really helps

Connor Horman (Mar 16 2021 at 13:58, on Zulip):

Indeed, as well as cases where you could dereference two embedded things. Result<impl Deref,impl Deref> comes to mind.

Plecra (Mar 16 2021 at 14:01, on Zulip):

I wonder if it'd be worth permitting cases where rustc can implement it with a single deref_mut so as to make the common AST matching cases simple

Plecra (Mar 16 2021 at 14:02, on Zulip):

It'd also be great to have a "realistic" example of when this limitation with multiple dereferences would actually apply for the motivation section of the RFC

Connor Horman (Mar 16 2021 at 14:08, on Zulip):

That seems incredibly limiting, though I can't come up with any specific examples that would be used in the real world.

In any case, the "single deref_mut" restriction shouldn't even be necessary, as (with the exception of Lazy and SyncLazy, which are idempotent), all of the types under consideration for this project are known to have only completely pure (no side effects) deref and deref_mut implementations, so the compiler can insert and remove as many as it wants without changing the behaviour (and while maintaining the exhaustiveness rule).

Plecra (Mar 16 2021 at 14:11, on Zulip):

for sure! and I do think it falls under future considerations, but it'd be nice to leave the door open to a safe API for user-defined derefs

Plecra (Mar 16 2021 at 14:11, on Zulip):

and this seems like a pretty widely-useful subset that could be used safely

Connor Horman (Mar 16 2021 at 14:12, on Zulip):

Indeed, and that may be an additional route to persue for user-defined derefs (though by "single deref_mut call" I assume you are also including cases that can be implemented using a single deref call), though this project is not doing so, as mentioned.

Plecra (Mar 16 2021 at 14:13, on Zulip):

^^ thanks for your time

Connor Horman (Mar 20 2021 at 00:43, on Zulip):

Alright, so we should consider a plan.
I think that we should start with a reduced set, probably, Box, Rc, Arc, String, Vec, and Pin`s of them, and then implement that behind a feature gate (once we have syntax resolved, or at least a decent idea of how we'd do the syntax).
Does this seem like a reasonable plan for now?

Last update: Jul 24 2021 at 20:30UTC