Stream: t-compiler/wg-rls-2.0

Topic: text-size


Christopher Durham (Feb 27 2020 at 03:04, on Zulip):

matklad said:

Ok, let's start small :)

I want to commit to having 1.0 version of text-unit/str-index/text-size crate in three months. To that end:

I'm going to drop impl notes here as I try to find a nice balance between text_unit and str-index.

Christopher Durham (Feb 27 2020 at 03:04, on Zulip):

TextSize is nice, but falls short on one important note: constructing TextRange

Christopher Durham (Feb 27 2020 at 03:05, on Zulip):

Because TextSize is the "size" of a string it feels kind of weird to use as also an index

Christopher Durham (Feb 27 2020 at 03:06, on Zulip):

Though I suppose that's a flown ship (wait wut) given Rust uses usize for both "size" (of an array) and "index"

matklad (Feb 27 2020 at 08:23, on Zulip):

Yeah, exactly: xs[..92] is a range with size as a bound

matklad (Feb 27 2020 at 08:24, on Zulip):

(index is also weird in a sense that it's impossible to index string, only to slice it)

Christopher Durham (Mar 07 2020 at 22:28, on Zulip):

Initial PR for text-size is up!

As I mentioned in the PR, I went pretty aggressive (and probably overboard) on allowing implicit "on-boarding" conversions, via taking Into<TextSize> for most functions. I expect to be vetoed on at least some of them.

The idea of the implicit conversions is to allow using TextSize/TextRange to be as painless as possible. TextRange is actually useful in being a Copy range and providing manipulation methods, but TextSize is "just" a newtype representing a "kind" of u32. The idea is that coersion points makes it so more API surface can adopt TextSize for correctness without regressing ergonomics of using integer literals and such.

Christopher Durham (Mar 08 2020 at 03:44, on Zulip):

per https://github.com/rust-analyzer/text_unit/pull/2#discussion_r389319650, I've removed all of the Into<TextSize> "on-boarding coercions". Reasoning: putting effort into allowing use of literals is "coding for the tests" a bit too much. The point of the crate is to separate text units from raw numbers, so having the split is a good thing.

std::Veetaha (Mar 08 2020 at 08:17, on Zulip):

I guess the thing Rust lacks here is C++ish literals overloads...

Christopher Durham (Mar 08 2020 at 18:34, on Zulip):

@matklad 's position that convinced me the most is that you shouldn't really be using literals of TextSize/TextRange outside of tests. And the API should work to serve the real code first, tests second.

Christopher Durham (Mar 08 2020 at 18:36, on Zulip):

The most common literal to use would be 0, and we can just provide const TextSize::ZERO and/or const fn TextSize::zero() for that case.

matklad (Mar 09 2020 at 07:42, on Zulip):

@Christopher Durham what do you think about https://github.com/rust-analyzer/text_unit/pull/4?

Christopher Durham (Mar 09 2020 at 14:57, on Zulip):

I've forgotten the initial reason that I didn't pull over that requirement. I prefer ::new(start..end) to ::new(start, end) but that's just a minor syntax quibble.

Christopher Durham (Mar 09 2020 at 14:59, on Zulip):

Unlike ops::Range, though, I've already noted a reverse range as immediately a logic error. So the well-formedness of the range is still assumable, just not for the fulfilment of unsafe requirements.

Christopher Durham (Mar 09 2020 at 15:01, on Zulip):

Is there a use case where having a strict guarantee that the range is increasing is required? Or is it just a panic-sooner kind of thing for quicker catching of logic errors?

matklad (Mar 09 2020 at 15:02, on Zulip):

It's for catching logic errors, yeah. And also for letting you not to think about this case as well .

Christopher Durham (Mar 09 2020 at 15:03, on Zulip):

(also I have no idea what tbbs stands for)

matklad (Mar 09 2020 at 15:03, on Zulip):

to be bikeshed

matklad (Mar 09 2020 at 15:07, on Zulip):

So, if we are already at it, I also find TextRange::new(lo..hi) kind of nicer than ,, and probably would even prefer to read/write just TextRange(lo..hi). But isn't that too cute for the sake of code golf? ::new(lo, hi) is boring, which is a major benefit

Christopher Durham (Mar 09 2020 at 15:10, on Zulip):

I think ::new(lo..hi) is probably the best here

Christopher Durham (Mar 09 2020 at 15:10, on Zulip):

The reason I prefer it over (lo, hi) is that it makes the half-open semantics immediately clear

Christopher Durham (Mar 09 2020 at 15:10, on Zulip):

"this works like ops::Range"

matklad (Mar 09 2020 at 15:11, on Zulip):

image.png and, tbh, TextRange::from_to()

image.png

matklad (Mar 09 2020 at 15:11, on Zulip):

image.png

matklad (Mar 09 2020 at 15:11, on Zulip):

, formats better over several lines...

Christopher Durham (Mar 09 2020 at 15:13, on Zulip):

Yeah, that's the one reason to maybe prefer ,.

Christopher Durham (Mar 09 2020 at 15:14, on Zulip):

Designing for a world with argument labels also is different from trying to work well without

matklad (Mar 09 2020 at 15:14, on Zulip):

Note that if it's the frist time you see new(lo..hi), you'll probably go to check docs/source anyway, because "whyyy are we using .. as an argument here? Is it an iterator"?

matklad (Mar 09 2020 at 15:16, on Zulip):

I wonder if

image.png

is best? You get half-opennes and range semantics simply because there's a Rangein the name

matklad (Mar 09 2020 at 15:16, on Zulip):

Anyway, should I merge that PR with invariant?

Christopher Durham (Mar 09 2020 at 15:17, on Zulip):

Maybe, but Range::new(lo..hi) I think is pretty clear. Or we can just ignore the "infallible" expectation of From and do Range::from(lo..hi) if that's clearer (with an inherent method)

Christopher Durham (Mar 09 2020 at 15:17, on Zulip):

But I'm warming up to the "fake tuple struct," actually

Christopher Durham (Mar 09 2020 at 15:18, on Zulip):

I'd say rebase it to add the restriction to the "tuple struct constructor" then merge it

Christopher Durham (Mar 09 2020 at 15:18, on Zulip):

(internal const construction can just fall back to { start, end })

Christopher Durham (Mar 09 2020 at 15:19, on Zulip):

It's sad to see From go but I think it's for the better

matklad (Mar 09 2020 at 15:20, on Zulip):

tbh, I don't think I've ever wanted From for ranges

matklad (Mar 09 2020 at 15:21, on Zulip):

I wanted to do text[text_size..], but that's another story

Christopher Durham (Mar 09 2020 at 15:21, on Zulip):

Yeah, it's more a theoretical sadness than a practical one

Christopher Durham (Mar 09 2020 at 15:21, on Zulip):

And since we're "fake tuple struct"ing TextRange we should probably do it for TextSize as well.

Christopher Durham (Mar 09 2020 at 15:22, on Zulip):

Though to be clear: not marking the constructor as const is a difference from real tuple structs (along with the lack of pattern matching through them).

matklad (Mar 09 2020 at 15:22, on Zulip):

We can make them const tho

matklad (Mar 09 2020 at 15:22, on Zulip):

[()][(start > end) as usize]

Christopher Durham (Mar 09 2020 at 15:22, on Zulip):

Oh no const assert hacks‽ Yes please

matklad (Mar 09 2020 at 15:23, on Zulip):

Though I'd prefer to just put an assert there, and then wait until it is const :D

Christopher Durham (Mar 09 2020 at 15:24, on Zulip):

Please give me const assert! lang team

Christopher Durham (Mar 09 2020 at 15:25, on Zulip):

If adding static_assertions as a dep is ok, they have const_assert! to wrap up the trickery

Christopher Durham (Mar 09 2020 at 15:25, on Zulip):

(and it is a leaf dependency)

Christopher Durham (Mar 09 2020 at 15:26, on Zulip):

Wait nvm I'm pretty sure const_assert! only works on constants, not just in a const context

matklad (Mar 09 2020 at 15:31, on Zulip):

Pushed, but I am having second thoughs about fake tuple for TextSize

matklad (Mar 09 2020 at 15:32, on Zulip):

TextRanges you do create from various offset mansually

matklad (Mar 09 2020 at 15:32, on Zulip):

TextSize, however, you should really crate with of

matklad (Mar 09 2020 at 15:33, on Zulip):

Maybe I am being to pedantic here, but I'd really love to see TextSize::of(":"), rather than TextSize(1), and not having tuple ctor helps with that

matklad (Mar 09 2020 at 15:33, on Zulip):

Ie, you can crate a range from two sizes, but the only "officail" way to create a size would be to measure something

Christopher Durham (Mar 09 2020 at 15:45, on Zulip):

That makes sense

matklad (Mar 09 2020 at 15:47, on Zulip):

Ok, merged that PR

Wojciech Polak (Mar 09 2020 at 15:51, on Zulip):

This const assert hack made my day. Thanks :)

matklad (Mar 09 2020 at 15:52, on Zulip):

@Wojciech Polak https://github.com/rust-analyzer/smol_str/commit/c7ce079a24cdd67eaf972f070874c6be7c166082 :D

std::Veetaha (Mar 09 2020 at 16:13, on Zulip):

We should make a crate out of this

std::Veetaha (Mar 09 2020 at 16:14, on Zulip):

Oh, wait it already exists

Christopher Durham (Mar 09 2020 at 17:50, on Zulip):

Specifically, while I'm pretty sure static_assertions::const_assert only works for statically-known consts, const_fn_assert::cfn_assert works for "const-time" asserts with a not completely horrible const-time error (but understandably bad runtime error).

Wojciech Polak (Mar 11 2020 at 13:04, on Zulip):

I really like these changes, maybe Ill use this text-size 0.99 crate in my next "Parser in Rust" article :)

Christopher Durham (Mar 13 2020 at 01:40, on Zulip):

Just keep in mind that it's not quite stable yet, as we argue out the remaining details :)

Wojciech Polak (Mar 13 2020 at 08:14, on Zulip):

I noticed ;)

Christopher Durham (Mar 19 2020 at 19:36, on Zulip):

I think we're converging on what we want to publish as v0.99, so I published v0.99.0-dev.2 (and also invited matklad to own), so that I can make experimental patches for rowan and rust-analyzer.

Christopher Durham (Mar 20 2020 at 22:59, on Zulip):

Here's a couple papercuts I found with text-size in porting rowan to use it:

Christopher Durham (Mar 20 2020 at 22:59, on Zulip):

<details><summary>Here's the rowan patch</summary>

Testing the fold just in case

</details>

Christopher Durham (Mar 20 2020 at 23:33, on Zulip):

And here's a playground showing off the deref-coersion style(?) TextSized impls

matklad (Mar 20 2020 at 23:48, on Zulip):

I think we should add only the first blanket impls

matklad (Mar 20 2020 at 23:48, on Zulip):

range.conains_range(range) is funny, but is strictly better than the reversed is_subrange

matklad (Mar 20 2020 at 23:51, on Zulip):

Not sure about the last one

matklad (Mar 20 2020 at 23:51, on Zulip):

Seems like empty with argument is the right thing, as there are many empty ranges

Christopher Durham (Mar 20 2020 at 23:51, on Zulip):

empty_at probably is clearer for that case, though

Christopher Durham (Mar 20 2020 at 23:52, on Zulip):

It's a bit of "argument hint" bias to have TextRange::empty(offset: size), but TextRange::empty_at(offset: size) serves those both with and without imho

matklad (Mar 20 2020 at 23:53, on Zulip):

Not sold: like you see there is an argument for empty, and it could only mean one single ting

matklad (Mar 20 2020 at 23:54, on Zulip):

you don't need to know its name to guess what does it mean

Christopher Durham (Mar 20 2020 at 23:55, on Zulip):

The one counterargument is that TextRange::empty(TextSize::zero()) is _also_ common
and I'd guess at it being a decent chunk of ::empty calls

Christopher Durham (Mar 20 2020 at 23:56, on Zulip):

So that would be better served by having ::empty(), rather than relying on ::default() being ::empty()

Christopher Durham (Mar 20 2020 at 23:56, on Zulip):

... which we don't even provide Default currently

matklad (Mar 21 2020 at 07:45, on Zulip):

But ::empty() does not make sense: there are u32::max empty text ranges. Providing default seems like a good idea to me though!

Christopher Durham (Mar 22 2020 at 19:02, on Zulip):

there are u32::MAX empty text ranges

This is what convinced me.

std::Veetaha (Mar 22 2020 at 19:49, on Zulip):

This text range thing is just a vector of two coordinates that has 3 degenerate cases: null_vector(0, 0); unbiased(0, p); empty(p, 0)

std::Veetaha (Mar 22 2020 at 19:59, on Zulip):

I also wonder wouldn't it make sense to std::ops::Range*<TextUnit> instead of TextRange just for the additional syntax sugar?

Christopher Durham (Mar 22 2020 at 20:35, on Zulip):

Three main problems with that:

std::Veetaha (Mar 22 2020 at 21:15, on Zulip):

Hmm, I wish the following worked:

impl Copy for std::ops::Range<TextUnit> { }

but it fails due to coherence rules...
Anyway, yeah that's reasonable

Last update: May 29 2020 at 18:00UTC