I was working on a lexer PR and realized that the parser uses the lexer API in a fairly strange way -- the lexer maintains a cursor structure that's intended to read through the entire input, but no usage ever reads more than one token out of it. (They reinitialize it every time.)
I assume it's a relic of an earlier refactoring. I suspect it could even be a mild performance issue, since we _reallocate_ the Cursor for every single token.
Is this on purpose? Should this be cleaned up?
Are you talking about this Cursor: https://github.com/rust-lang/rust/blob/4802f097c86452cd2e09d44e88dbcb8e08266552/src/librustc_lexer/src/cursor.rs#L7-L12 ?
Yeah -- that cursor, and this usage: https://github.com/rust-lang/rust/blob/master/src/librustc_parse/lexer/mod.rs#L121
It is intended to work only for a single token. The benefit of the current interface is that it's easily restartable from any point (as opposed to an interface that just gives you a stateful iterator over all tokens in the input).
:+1: makes sense
Cursor is just an impl detail of rustc_lexer; the real interface is just "give me the first token for this input".
This interface is meaningfully more restricted than "give me an iterator of tokens", because you can't, for example, count parentheses in the lexer -- and that is a good thing if, for example, you want to incrementally re-lex a substring.
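To make the shape of that interface concrete, here is a minimal sketch -- not the real rustc_lexer code; the function names and token kinds are hypothetical -- of a stateless "first token only" entry point. Because it takes a plain `&str` and returns the token's length, the caller can restart lexing at any byte offset, which is exactly what makes incremental re-lexing of a substring possible:

```rust
// Hypothetical sketch of a "first token" lexing API; not the actual
// rustc_lexer implementation.

#[derive(Debug, PartialEq)]
enum TokenKind {
    Ident,
    Whitespace,
    OpenParen,
    CloseParen,
    Unknown,
}

struct Token {
    kind: TokenKind,
    len: usize, // how many bytes of the input this token consumed
}

// Stateless entry point: lex exactly one token from the front of `input`.
// No lexer state survives between calls, so lexing can be restarted from
// any position in the source.
fn first_token(input: &str) -> Token {
    let c = input
        .chars()
        .next()
        .expect("first_token called on empty input");
    let (kind, len) = match c {
        '(' => (TokenKind::OpenParen, 1),
        ')' => (TokenKind::CloseParen, 1),
        c if c.is_whitespace() => {
            let len = input
                .chars()
                .take_while(|c| c.is_whitespace())
                .map(char::len_utf8)
                .sum();
            (TokenKind::Whitespace, len)
        }
        c if c.is_alphabetic() => {
            let len = input
                .chars()
                .take_while(|c| c.is_alphanumeric())
                .map(char::len_utf8)
                .sum();
            (TokenKind::Ident, len)
        }
        c => (TokenKind::Unknown, c.len_utf8()),
    };
    Token { kind, len }
}

// The caller drives the loop, re-calling first_token on the remaining
// input -- the "reinitialize for every token" pattern described above.
fn tokenize(mut input: &str) -> Vec<TokenKind> {
    let mut kinds = Vec::new();
    while !input.is_empty() {
        let tok = first_token(input);
        input = &input[tok.len..];
        kinds.push(tok.kind);
    }
    kinds
}

fn main() {
    println!("{:?}", tokenize("foo (bar)"));
}
```

Note that `first_token` cannot track bracket depth across calls: any such bookkeeping has to live in the caller, which is the restriction (and the benefit) described above.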