Stream: t-cargo/PubGrub

Topic: post 0.2 perf work: small_vec


view this post on Zulip Eh2406 (Nov 17 2020 at 17:22):

So I have had this idea for a while, and gave it a try when I read https://nnethercote.github.io/perf-book/heap-allocations.html#short-vecs
Most Ranges have at most 2 Intervals, so try using a small_vec for storing it. Here is the diff, and I am seeing unreasonably good perf improvements. Like 50%
If one of you has a chance, can you check if you are getting similar improvements?

view this post on Zulip Matthieu Pizenberg (Nov 17 2020 at 17:27):

Nice! I was wondering about it too but I kind of remember reading somewhere a post about small_vec not being safe (or maybe it was for an alternative of small_vec, don't remember

view this post on Zulip Matthieu Pizenberg (Nov 17 2020 at 17:29):

https://www.reddit.com/r/rust/comments/fshuhk/introducing_tinyvec_100_safe_alternative_to/

view this post on Zulip Matthieu Pizenberg (Nov 17 2020 at 17:32):

Similarly, most incompatibilities have two entries, do you know of a tiny_map? :)

view this post on Zulip Alex Tokarev (Nov 17 2020 at 17:37):

I have a similar impression about small_vec having safety issues, and I'd need to verify that since I just vaguely remember as well(

view this post on Zulip Eh2406 (Nov 17 2020 at 17:38):

That was going to be my next comment! Yes small_vec has a history of using unsafe wrong. (https://github.com/RustSec/advisory-db/tree/master/crates/smallvec). It is not know to be broken at the moment, but ....

view this post on Zulip Alex Tokarev (Nov 17 2020 at 17:39):

I also remember mentions of other crates in the space, maybe there are alternatives that would also be good.
However, if they fixed the issues why not give them a chance. Everyone makes mistakes.

view this post on Zulip Eh2406 (Nov 17 2020 at 17:40):

If one of you confirms the perf wins, then I will start looking into alternative implementations.

view this post on Zulip Eh2406 (Nov 17 2020 at 17:41):

I don't know of a tiny_map, but for the limited functionality we need it is not hard to roll our own.

view this post on Zulip Eh2406 (Nov 17 2020 at 17:43):

In fact we don't need much from the vec in Range. One alternative is to roll our own. Small_vec just has such a small diff that it seemed a good place to start measuring the perf.

view this post on Zulip Eh2406 (Nov 17 2020 at 17:45):

cc: https://rust-lang.zulipchat.com/#narrow/stream/146229-wg-secure-code/topic/tinyvec.20without.20default

view this post on Zulip Alex Tokarev (Nov 17 2020 at 17:45):

Eh2406 said:

If one of you confirms the perf wins, then I will start looking into alternative implementations.

I'll run the benchmark shortly. I'm not saying we should disregard small_vecbecause of past issues, but I very much like the idea of exploring available options.

view this post on Zulip Eh2406 (Nov 17 2020 at 17:47):

#[derive(Debug, Clone)]
enum SmallMap<K, V> {
    One([(K, V); 1]),
    Two([(K, V); 2]),
    Flexible(Map<K, V>),
}

Is one way to roll our own tiny_map, a tiny_vec can work similarly.

view this post on Zulip Matthieu Pizenberg (Nov 17 2020 at 17:48):

the code is not compiling for me in your small_vec branch

view this post on Zulip Eh2406 (Nov 17 2020 at 17:49):

ba humbug. let me see...

view this post on Zulip Matthieu Pizenberg (Nov 17 2020 at 17:50):

I was using cargo bench --features=serde and got compilations errors about range not be serializable because smallvec

view this post on Zulip Eh2406 (Nov 17 2020 at 17:50):

Right, that is the other thing to fix. cargo bench --features=serde-1

view this post on Zulip Eh2406 (Nov 17 2020 at 17:51):

where serde-1 therns on serde for pubgrub and the serde feature of smallvec.

view this post on Zulip Alex Tokarev (Nov 17 2020 at 17:52):

I just run with '--all-features' and it worked.

view this post on Zulip Alex Tokarev (Nov 17 2020 at 17:52):

change: [-30.169% -28.831% -27.451%] (p = 0.00 < 0.05)

view this post on Zulip Eh2406 (Nov 17 2020 at 17:52):

So not 50% but still big.

view this post on Zulip Alex Tokarev (Nov 17 2020 at 17:53):

There are still a few "Vec::new()" in the code in small_vec branch

view this post on Zulip Alex Tokarev (Nov 17 2020 at 17:53):

Let me see if replacing them makes a difference

view this post on Zulip Matthieu Pizenberg (Nov 17 2020 at 17:53):

got 13% improvement on the elm dataset and 22% improvement for the synthetic one

view this post on Zulip Alex Tokarev (Nov 17 2020 at 17:56):

There are still a few "Vec::new()" in the code in small_vec branch

Actually maybe that's on purpose, not every Vec in our library does not grow

view this post on Zulip Eh2406 (Nov 17 2020 at 17:57):

So worth pursuing, but not as big as I was seeing. I wonder why the difference, but I will keep playing with this idea. Thank you all.

view this post on Zulip Alex Tokarev (Nov 17 2020 at 18:10):

I see mentions of array_vec that we also might want to try

view this post on Zulip Eh2406 (Nov 17 2020 at 18:10):

that works when we know there can't be more than N elements.

view this post on Zulip Eh2406 (Nov 17 2020 at 21:30):

I am done with meetings, so I can run benchmarks on my version of small vec:
https://github.com/pubgrub-rs/pubgrub/commit/ba5a873d8d188305717127484f86710aabc7240e#diff-fbd4e90bd1c07576edba7f752cccda844f7209ef6f79c25b665bbeceb97c629c
for me it works just as well!

large_cases/elm-packages_str_SemanticVersion.ron
                        time:   [793.53 ms 817.06 ms 840.83 ms]
                        change: [-51.001% -49.380% -47.701%] (p = 0.00 < 0.05)
                        Performance has improved.
large_cases/large_case_u16_NumberVersion.ron
                        time:   [35.874 ms 36.137 ms 36.408 ms]
                        change: [-56.441% -55.656% -54.887%] (p = 0.00 < 0.05)
                        Performance has improved.

view this post on Zulip Matthieu Pizenberg (Nov 17 2020 at 21:53):

This is awesome if we get that kind of improvements! One remark on the code, it would be better to have Range be a newtype over the custom small vec (like now with Vec inside) instead of have its internals public

view this post on Zulip Eh2406 (Nov 17 2020 at 22:03):

I assume if I am getting the same 50% improvements then you will see the same 13-20% improvements. I don't know why it is so high for me, windows or target-cpu=native or don't know.
Good point I forgot it was publick.

view this post on Zulip Matthieu Pizenberg (Nov 17 2020 at 22:06):

Maybe I'll get better performances with this than with small vec :) will test probably in the weekend. After v0.2 is published I'll take a few days break at least ^^ and get back to my paid work ahah

view this post on Zulip Matthieu Pizenberg (Nov 17 2020 at 22:07):

I took a week off, it was sweet but it ends now :(

view this post on Zulip Eh2406 (Nov 17 2020 at 23:03):

good call on keeping it private it looks much better.

view this post on Zulip Eh2406 (Nov 17 2020 at 23:03):

https://github.com/pubgrub-rs/pubgrub/commit/c63cee0485ac5336f2eea2fae74ccccc8dbe5140

view this post on Zulip Matthieu Pizenberg (Nov 18 2020 at 11:24):

Just tested the small_vec branch this morning and getting 24% improvement on elm packages and 44% improvement on large case.

view this post on Zulip Eh2406 (Nov 18 2020 at 14:02):

I'll take it!

view this post on Zulip Eh2406 (Nov 18 2020 at 14:17):

I will put a PR up after we get 0.2 up!

view this post on Zulip Alex Tokarev (Nov 18 2020 at 14:42):

Might as well add it now, it's internal API why not

view this post on Zulip Alex Tokarev (Nov 18 2020 at 14:43):

Do you like our own definition more than external crate?

view this post on Zulip Matthieu Pizenberg (Nov 18 2020 at 14:54):

I haven't thought of all the implications of this yet. And there are places where this is half backed, like for example there is a fallback to hashmap for prior cause computation etc. No need to rush this into v02. It's also good to know there will be nice things coming for v03 ^^

view this post on Zulip Alex Tokarev (Nov 18 2020 at 14:56):

I thought this PR is only about Vecs and doesn't attempt to provide a Map.
But yeah, no rush merging it. What I meant to say there is no reason to not put the PR up for review when it's ready.

view this post on Zulip Eh2406 (Nov 18 2020 at 14:57):

I added a commit for a small_map.

view this post on Zulip Eh2406 (Nov 18 2020 at 16:41):

And there are places where this is half backed, like for example there is a fallback to hashmap for prior cause computation etc.

Yes that is half baked, but most incompatibilities come from dependencies so I don't think it needs to be fixed. And it is a perf win as is, so it is worth it even if we don't fix it. But that can get discussed in the PR when it is up.

view this post on Zulip Eh2406 (Nov 21 2020 at 04:05):

just went ahead and fixed it. and the PR is up.

view this post on Zulip Alex Tokarev (Nov 21 2020 at 15:25):

Btw, why didn't we just pull a dep for this in the end?

view this post on Zulip Alex Tokarev (Nov 21 2020 at 17:17):

What do you think of using a dep for a vector implementation but implementing our own map since you said it doesn't exist?

view this post on Zulip Alex Tokarev (Nov 21 2020 at 17:19):

Although I see some crates we can try out as well, e.g. https://lib.rs/crates/smallmap

view this post on Zulip Eh2406 (Nov 21 2020 at 20:18):

Lost some time to a headache today. I will follow up with this when I am feeling better. All the feedback is really good, I am just not up to dealing right now.

view this post on Zulip Eh2406 (Nov 21 2020 at 22:13):

Had a few mins and was able to look into the smallmap crate you linked, and there is a lot I don't get about it. But one think I'm pretty sure of is that it does allocate https://github.com/notflan/smallmap/blob/master/src/lib.rs#L184. I think we are getting the speed up by not allocating when making incompatibilities and not having the cache miss when reading them. So whatever use case that crate is intended for I don't think it is ours.

view this post on Zulip Alex Tokarev (Nov 21 2020 at 22:36):

Yeah I was just a random link; I haven't analyzed applicability of various options, just saw that there were a few.


Last updated: Oct 21 2021 at 21:02 UTC