Stream: t-compiler

Topic: constants in patterns


oli (Apr 03 2020 at 10:14, on Zulip):

TLDR: I'm wondering if we should keep the "lazy constant pattern" approach around, which ensures that we don't expand a const FOO: &[u8] = include_bytes!(super_large_file); if used in a pattern. It has a significant cost to keep around, but saves some compile time for large constants.

While working on https://github.com/rust-lang/rust/pull/70699 I've tried to analyze our entire setup for how constants are used in patterns. We have a very whacky scheme that I've just bolted on more parts over the years. I think we should implement a really thought through scheme once and for all instead of playing whack-a-mole with ICEs and weird "unreachable pattern" bugs. Right now we have both https://github.com/rust-lang/rust/blob/master/src/librustc_mir_build/hair/pattern/const_to_pat.rs#L20 and https://github.com/rust-lang/rust/blob/master/src/librustc_mir_build/hair/pattern/_match.rs#L260 which expand different parts of constants to patterns. Amusingly nowadays both of these are just invoked from https://github.com/rust-lang/rust/blob/master/src/librustc_mir_build/hair/pattern/check_match.rs#L136 right after each other, so merging them will strictly be an improvement.

But even beyond that, we don't actually expand all constants fully into their components. There is/was a worry that if we expanded a const FOO: &[u8] = include_bytes!(some_huge_file); that compilation would get very slow (as we'd create a slice pattern with one sub-pattern per byte in that file. As const eval gets more powerful, other, similarly large constants may exist. Out of that reason we have a lazy way to expand constants only when needed, but that is a rather buggy scheme (whether a const pattern comes before or after a slice pattern makes compilation behave differently) that I don't really grok (as in I understand all parts of it but I have never managed to hold the entire thing in my head at once).

So... one way to go forward would be to just expand all constants to their component patterns. This would be the easiest way, as we have all the support for user-written patterns already existing. It would be a nicely modular way to keep handling of constants out of the pattern match code.
This also gets very problematic in the presence of slices of zsts, as the user can specify a constant with a repeat expression, which creates a very small constant, but expanding it to a full slice pattern gives us a sub-pattern per slice element.

If we want to keep a lazy scheme, everything becomes a bit more complex. The problem is our exhaustiveness checks. The exhaustiveness checks need to look at all fields. Right now they do this in the hybrid system where some things are expanded and some are not. This is also where the ordering problem comes from. If a constant is used before a slice pattern, the constant did not get expanded and was treated as a single unknown value. So even if a following slice pattern had the exact same value as the constant, we treated it as separate and did not emit an unreachable pattern lint. If the constant comes after, we already know that we are looking at slice patterns of certain lengths and check the constant against that. If we want to fix these things correctly, we'll need to fetch the length of const slice patterns without looking at their elements. The current implementation treats constants as non-aggregate (https://github.com/rust-lang/rust/blob/548afdbe1a600ba868ac00acc4e94cca0242b001/src/librustc_mir_build/hair/pattern/_match.rs#L956).

So my question is whether I should put in work into fixing the lazy const scheme we have or make it dead code by expanding all constants into their smallest components and then removing it.

eddyb (Apr 03 2020 at 10:15, on Zulip):

wouldn't it better, for byte slices specifically, to represent their elements as Vec<Pat> | &'tcx [u8] or w/e it is?

eddyb (Apr 03 2020 at 10:15, on Zulip):

wrap that in a type and provide element querying methods

oli (Apr 03 2020 at 10:16, on Zulip):

you can easily generate other huge slice patterns

eddyb (Apr 03 2020 at 10:16, on Zulip):

can you?

oli (Apr 03 2020 at 10:16, on Zulip):

ZST arrays in constants created via repeat expressions are one thing

eddyb (Apr 03 2020 at 10:16, on Zulip):

okay so repeats and byte slices

oli (Apr 03 2020 at 10:16, on Zulip):

but with const eval you can basically generate any huge constant you want at some point

eddyb (Apr 03 2020 at 10:16, on Zulip):

oh I see that's the concern

oli (Apr 03 2020 at 10:17, on Zulip):

and you can convert include_bytes! (which give you an array) to any super huge type of your own

oli (Apr 03 2020 at 10:17, on Zulip):

the last thing works with unions right now

eddyb (Apr 03 2020 at 10:17, on Zulip):

okay but I guess that gets into how we rely on "structural match"

eddyb (Apr 03 2020 at 10:18, on Zulip):

do we use == and assume we don't know the pattern? do we only try to convert the const into a pattern if we can polymorphically evaluate it?

oli (Apr 03 2020 at 10:18, on Zulip):

yes and no

oli (Apr 03 2020 at 10:18, on Zulip):

it's fairly arbitrary

oli (Apr 03 2020 at 10:18, on Zulip):

that's what I'm trying to say, the current status quo is a mess

oli (Apr 03 2020 at 10:19, on Zulip):

sometimes you can break things just by introducing another level of indirection in the type that you match on

oli (Apr 03 2020 at 10:20, on Zulip):

also we error if we can't evaluate a constant

oli (Apr 03 2020 at 10:20, on Zulip):

so generic parameters of any kind are not a concern

oli (Apr 03 2020 at 10:21, on Zulip):

unless we want to support this at some point

oli (Apr 03 2020 at 10:21, on Zulip):

then we do need the lazy approach anyway

varkor (Apr 03 2020 at 11:24, on Zulip):

are people doing things like const FOO: &[u8] = include_bytes!(super_large_file);?

varkor (Apr 03 2020 at 11:24, on Zulip):

this seems like something that ought not to be a problem in practice

oli (Apr 03 2020 at 11:25, on Zulip):

it likely isn't

varkor (Apr 03 2020 at 11:25, on Zulip):

that is, it would be a nice surprise to users if it turned out rustc optimised for it, but there's no good reason it should

varkor (Apr 03 2020 at 11:25, on Zulip):

considering how long pattern matching on consts has been buggy, I think it'd be better to drop the lazy handling and just fix up the existing issues

oli (Apr 03 2020 at 11:25, on Zulip):

we could also warn on constants in patterns that get expanded to more than SOME_LIMIT elements

varkor (Apr 03 2020 at 11:25, on Zulip):

if loading in large consts is the only concern

oli (Apr 03 2020 at 11:26, on Zulip):

it is

varkor (Apr 03 2020 at 11:26, on Zulip):

ah, that's a good idea too

varkor (Apr 03 2020 at 11:26, on Zulip):

it's going to be obvious the compiler is taking a long time to do something if a user accidentally includes something gargantuan too

varkor (Apr 03 2020 at 11:26, on Zulip):

I can only really see this happening by accident

Jonas Schievink (Apr 03 2020 at 11:30, on Zulip):

I've seen a few people do it intentionally

Jonas Schievink (Apr 03 2020 at 11:30, on Zulip):

it seems like a pretty obvious way to bundle large blobs with an application

bjorn3 (Apr 03 2020 at 11:30, on Zulip):

In priroda I did it intentionally to include all resources in a static bundle: https://github.com/oli-obk/priroda/blob/753b539752b7267ff76bea7366035cc1b97ea789/src/main.rs#L217-L241

eddyb (Apr 03 2020 at 11:30, on Zulip):

yeah but not used for patterns

Jonas Schievink (Apr 03 2020 at 11:32, on Zulip):

It's not just slow in patterns

Jonas Schievink (Apr 03 2020 at 11:32, on Zulip):

Or, well, was

centril (Apr 03 2020 at 20:17, on Zulip):

oli said:

so generic parameters of any kind are not a concern

Do you mean generic constants here, e.g., const Foo<T>: u8 = expr;?

centril (Apr 03 2020 at 20:18, on Zulip):

Presumably those would be handled as associated constants are, namely, they result in an error

oli (Apr 04 2020 at 10:30, on Zulip):

No I mean both associated constants and const generic parameters of the current function

oli (Apr 04 2020 at 10:30, on Zulip):

But in the RFC for https://github.com/rust-lang/rust/issues/31434 there's talk about allowing such constants

centril (Apr 04 2020 at 14:10, on Zulip):

Ah; If associated or generic constants would be mentionable in patterns, then they'd need to be some sort of opaque pattern; but that's I think something that definitely would need an RFC

centril (Apr 04 2020 at 14:10, on Zulip):

@oli so I wouldn't worry about it this year

varkor (Apr 06 2020 at 12:59, on Zulip):

so far, all const generics are disallowed in patterns, so it's something we can not think about entirely for now

Last update: Jun 04 2020 at 18:30UTC