Stream: t-compiler/wg-rls-2.0

Topic: str_index


matklad (Oct 08 2019 at 07:24, on Zulip):

@Brendan Zabarauskas , @Christopher Durham let's move the codespan, text_unit conversation here :-)

matklad (Oct 08 2019 at 07:29, on Zulip):

I have been wanting to make a breaking change to text_unit for some time (to get rid of the inconsistent by-ref/by-value methods on TextRange, and the of_str OCamlism). I'd be totally up for replacing it with a new crate, if that crate has more than one designer (and I am totally willing to help out).

As I've written in discord:

matklad (Oct 08 2019 at 12:09, on Zulip):

Also, @Brendan Zabarauskas what are your thoughts in general about splitting codespan into an indexing bit and a printing & filemap bit?

The indexing part seems to be a micro crate, which is bad. However, it's also a "vocabulary type" that sits at the interface, and this seems like a good reason to keep it separate.

Christopher Durham (Oct 08 2019 at 14:19, on Zulip):

The purpose of the Index/Offset split IIUC is that Offset is a signed type (i64) suitable for arbitrary offsets
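
A minimal sketch of the Index/Offset split described above. The `Index` and `Offset` names mirror the message; the `u32` width for `Index`, the `Sub` impl, and the `checked_offset` helper are assumptions made for illustration, not any crate's actual API.

```rust
use std::convert::TryFrom;

/// Hypothetical absolute byte position in a string (u32 width is assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Index(pub u32);

/// Hypothetical signed distance between two positions.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Offset(pub i64);

impl std::ops::Sub for Index {
    type Output = Offset;

    /// Subtracting two positions yields a signed distance, which may be negative.
    fn sub(self, rhs: Index) -> Offset {
        Offset(i64::from(self.0) - i64::from(rhs.0))
    }
}

impl Index {
    /// Moves the index by a signed offset.
    /// Returns `None` if the result does not fit in `u32` (including negatives).
    pub fn checked_offset(self, by: Offset) -> Option<Index> {
        let moved = i64::from(self.0).checked_add(by.0)?;
        u32::try_from(moved).ok().map(Index)
    }
}
```

Because `i64` covers the difference of any two `u32` values, `Index - Index` cannot overflow here; only applying an offset back to an index needs a range check.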

Christopher Durham (Oct 08 2019 at 18:07, on Zulip):

I guess the TL;DR of it @Brendan Zabarauskas is that I'd love to help with the move to merge annotate-snippets / codespan-reporting / language-reporting / text_unit

Christopher Durham (Oct 08 2019 at 18:08, on Zulip):

Especially if the end result manages to nicely be bring-your-own DB and pluggable formatting engine

matklad (Oct 08 2019 at 18:08, on Zulip):

(and, to clarify, my interest here is to extract a non-generic super-stupid crate with two types for better-than-usize indexing of strings)
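
For illustration, a non-generic crate with two such types might look roughly like the sketch below. `TextPos`, `TextSpan`, and all of their methods are hypothetical names, not the API of text_unit or str-index.

```rust
use std::convert::TryFrom;
use std::ops::Index;

/// Hypothetical byte position in a string, stored as u32.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct TextPos(u32);

/// Hypothetical half-open range `start..end` of byte positions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TextSpan {
    start: TextPos,
    end: TextPos,
}

impl TextPos {
    /// Position just past the end of `text`; `None` if the length exceeds u32.
    pub fn of_len(text: &str) -> Option<TextPos> {
        u32::try_from(text.len()).ok().map(TextPos)
    }

    pub fn to_usize(self) -> usize {
        self.0 as usize
    }
}

impl TextSpan {
    /// Builds a span, rejecting inverted ranges.
    pub fn new(start: TextPos, end: TextPos) -> Option<TextSpan> {
        if start <= end {
            Some(TextSpan { start, end })
        } else {
            None
        }
    }

    pub fn len(self) -> u32 {
        self.end.0 - self.start.0
    }
}

/// Lets callers write `&text[span]` instead of converting to usize by hand.
impl Index<TextSpan> for str {
    type Output = str;

    fn index(&self, span: TextSpan) -> &str {
        &self[span.start.to_usize()..span.end.to_usize()]
    }
}
```

The newtype keeps string positions from being mixed up with unrelated `usize` values, and the `Index` impl confines the `usize` conversion to one place.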

Christopher Durham (Oct 08 2019 at 18:09, on Zulip):

Which is definitely part of doing said merging

Christopher Durham (Oct 08 2019 at 18:10, on Zulip):

I did a play.rust-lang sketch of a unified API for StrIndex

matklad (Oct 08 2019 at 18:12, on Zulip):

@Christopher Durham could you push that to a repo? I'd like to add a couple of comments :D

Christopher Durham (Oct 08 2019 at 18:15, on Zulip):

@matklad <https://github.com/CAD97/str-index/commit/b8fbc103fe8b0fadda08d5af94bf36a6603ef04d>

Christopher Durham (Oct 08 2019 at 20:29, on Zulip):

OK, a decent baseline to review: https://github.com/CAD97/str-index/pull/1

Christopher Durham (Oct 09 2019 at 21:27, on Zulip):

Sorry about the billion force-with-lease pushes, I'm trying (maybe too hard) to keep the commit history clean, and having branches PRing to a branch being adjusted is a recipe for a lot of rebasing.

Brendan Zabarauskas (Oct 10 2019 at 00:02, on Zulip):

@Christopher Durham yeah, I've been curious about annotate-snippets too. I like how they completely remove the need for a file DB+indexing type

Brendan Zabarauskas (Oct 10 2019 at 00:02, on Zulip):

That was the direction I was actually hoping to move in with codespan.

Brendan Zabarauskas (Oct 10 2019 at 00:07, on Zulip):

The thing that has been preventing me from going all-in with annotate-snippets is:

- their use of ansi_term - which doesn't allow for injecting a custom coloured writer and relies on global state
- how they try to stick close to rustc's error reporting style - I think we can make better use of box drawing characters, while also gracefully degrading to ASCII for those who need it

Brendan Zabarauskas (Oct 10 2019 at 00:10, on Zulip):

We _have_ been in discussions to see if we can get agreement for simplifying the ecosystem around codespan/language-reporting/annotate-snippets. So at least that's something!

Christopher Durham (Oct 10 2019 at 00:10, on Zulip):

rustc is moving towards using annotate-snippets IIRC (reminder: merge intention issue)

Brendan Zabarauskas (Oct 10 2019 at 00:10, on Zulip):

Ohhh nice

Christopher Durham (Oct 10 2019 at 00:12, on Zulip):

rustc issue

Brendan Zabarauskas (Oct 10 2019 at 00:12, on Zulip):

Yeah Zibi, Yehuda, and I had a DM discussion on twitter about it a few months ago

Christopher Durham (Oct 10 2019 at 00:13, on Zulip):

So now's definitely the best time to consolidate efforts

Brendan Zabarauskas (Oct 10 2019 at 00:13, on Zulip):

Yeah, tbh I would rather not have to maintain codespan if I don't need to

Christopher Durham (Oct 10 2019 at 00:14, on Zulip):

From the outside looking in it seems like building a new one by taking the good ideas from all three might be the best way forward (but I have a bias towards building things I'll admit)

Brendan Zabarauskas (Oct 10 2019 at 00:15, on Zulip):

Do you like anything about codespan? :sweat_smile:

Christopher Durham (Oct 10 2019 at 00:16, on Zulip):

The name :P

Brendan Zabarauskas (Oct 10 2019 at 00:16, on Zulip):

hehehehe

Brendan Zabarauskas (Oct 10 2019 at 00:17, on Zulip):

I thought language-reporting was a nice name for the reporting side of things

Brendan Zabarauskas (Oct 10 2019 at 00:17, on Zulip):

The other thing I'd like some help on is how to integrate domain-specific diagnostics, like those in LALRPOP

Brendan Zabarauskas (Oct 10 2019 at 00:18, on Zulip):

And also allowing stuff like coloured type diffs and pretty printing, eg. https://github.com/Marwes/pretty.rs/

Brendan Zabarauskas (Oct 10 2019 at 00:19, on Zulip):

I think there's lots of cool scope to push beyond what rustc does. But perhaps if we joined forces we could improve rustc too.

Brendan Zabarauskas (Oct 10 2019 at 00:23, on Zulip):

I had some ideas showing example output from different languages/tools: https://github.com/brendanzab/codespan/issues/1

matklad (Oct 10 2019 at 06:10, on Zulip):

Note that annotate-snippets uses usize for indexing, and I think that's the right approach there. This is purely a sink layer, so optimizing storage with u32 does not make sense, and because the user only feeds these types in, newtype wrapper benefits are also not that important.

This is in contrast to rowan, which both stores text units, and is a source of text units.
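
A rough sketch of the storage trade-off described above: a library that stores many ranges benefits from a compact representation and can widen to `usize` only when handing data to a sink-style API. `CompactRange` and `bytes_saved_per_range` are hypothetical names used only for this example.

```rust
use std::mem::size_of;
use std::ops::Range;

/// Hypothetical compact range, as a syntax tree might store per node.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CompactRange {
    pub start: u32,
    pub end: u32,
}

impl CompactRange {
    /// Widens to a usize range at the boundary to a sink-style API.
    pub fn to_usize_range(self) -> Range<usize> {
        self.start as usize..self.end as usize
    }
}

/// Bytes saved per stored range by using u32 instead of usize.
/// The result depends on the target's pointer width.
pub fn bytes_saved_per_range() -> usize {
    size_of::<Range<usize>>().saturating_sub(size_of::<CompactRange>())
}
```

The saving only matters on the side that holds many ranges in memory; a sink that receives a handful of ranges per diagnostic gains little from it.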

Brendan Zabarauskas (Oct 10 2019 at 22:05, on Zulip):

@matklad Yes, I agree. In hindsight I think the sink approach is a better design.

matklad (Oct 11 2019 at 06:04, on Zulip):

@Christopher Durham what are your goals here? I feel that, if the lodestar is "unifying error reporting" crates, then StrIndex might not be on the right path

Christopher Durham (Oct 11 2019 at 15:42, on Zulip):

@matklad I think that my end goal is in fact "unifying error reporting", that said error reporting framework "seamlessly" translates from the "major" libraries for managing code span references, and that it can "render" to LSP (and no functionality of LSP is lost).

Christopher Durham (Oct 11 2019 at 15:45, on Zulip):

Still, I think the experimentation done with StrIndex here can definitely serve to improve text_unit

Christopher Durham (Oct 11 2019 at 15:48, on Zulip):

Having played with annotate-snippets' cleanup branch a little, I think the "best" way forward seems to be building from there and bridging codespan/language-reporting to annotate-snippets.

Christopher Durham (Oct 11 2019 at 15:49, on Zulip):

The exact point of concretizing spans for snippet annotation though is an interesting question

Christopher Durham (Oct 11 2019 at 15:50, on Zulip):

In a "fork" of annotate-snippets I've delayed Span resolution all the way to printing with a FnMut(Span, &mut dyn WriteColor) -> io::Result<()> such that printing could include syntax highlighting
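
The delayed-resolution idea could be sketched as below. This is not the code from the fork: `Span` and `render_annotation` are hypothetical, and `std::io::Write` stands in for termcolor's `WriteColor` so the example has no external dependencies.

```rust
use std::io::{self, Write};

/// Hypothetical opaque span; the printer never interprets it.
#[derive(Debug, Clone, Copy)]
pub struct Span {
    pub start: usize,
    pub end: usize,
}

/// Prints a one-line annotation and defers to `write_span` for the source text.
/// The callback decides how a span becomes output (plain text, highlighted, ...).
pub fn render_annotation<F>(
    out: &mut dyn Write,
    span: Span,
    label: &str,
    mut write_span: F,
) -> io::Result<()>
where
    F: FnMut(Span, &mut dyn Write) -> io::Result<()>,
{
    write!(out, "error: {}\n  | ", label)?;
    // Span resolution happens only here, at print time.
    write_span(span, &mut *out)?;
    writeln!(out)
}
```

A caller that owns the source text would pass a closure such as `|span, w| write!(w, "{}", &source[span.start..span.end])`; a caller with a syntax highlighter could emit colored output from the same hook.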

Christopher Durham (Oct 11 2019 at 16:25, on Zulip):

I'm actually drafting an issue to propose the API surface I discovered with experimentation in said fork to annotate-snippets

Last update: Nov 12 2019 at 16:15UTC