Stream: t-cargo/PubGrub

Topic: Is scanning incompatibilities efficient?


view this post on Zulip Eh2406 (Mar 10 2021 at 15:21):

I was up last night thinking about whether there are inefficiencies in how we find Incompatibilities that are (Almost)Satisfied.
The logick gets called a lot! Once per change, per implication, per decision. And it currently does a scan of all incompatibilities to find ones that are related to the changed package and then looks to see the relation with the partial_solution. I had lots of thoughts of different data structures to try, but I may have been dreaming so they may not make enough sense to be worth writing down. What I did want to record ( for my reference or in case you had thoughts ) where 3 places where we may be doing more work than needed.

  1. Does it happen that we add the same incompatibility to the history multiple times? For example a depends on b, when first we decide on a we will add it to the history, then if something else depends on b then current_package will be b and a depends on b is still AlmostSatisfied so we will add it again.
  2. If we have an incompatibility with one term, how does it get into the history? Does it successfully get re-added after a backtrack? ( this must work, I just couldn't figure it out when I was half asleep. ) We should make sure that the memory knows not to ask for the range agen after our dependency provider tells us that it is not available.
  3. If an incompatibility is found to be Contradicted by the partial_solution is there any reason to check it again ( until a backtrack happens ).?

view this post on Zulip Eh2406 (Mar 11 2021 at 20:41):

  1. Just had some time to experiment with it. Removing from consideration incompatibilities that are found to be contradicted can be a perf win. Should definitely come back to it when I have more time.

view this post on Zulip Matthieu Pizenberg (Mar 13 2021 at 11:54):

It's definitely an area that can be improved, and one of the ways to explore is storing incompatibilities in a smarter way for access. (A map probably).

view this post on Zulip Matthieu Pizenberg (Mar 13 2021 at 11:55):

  1. I don't remember well enough. I'll have a look at it next week

view this post on Zulip Matthieu Pizenberg (Mar 13 2021 at 11:57):

  1. Is it possible that a currently contradicted incompat becomes almost satisfied after a backtrack and thus will be needed later. I don't remember well enough too.

view this post on Zulip Matthieu Pizenberg (Mar 13 2021 at 12:03):

Sorry o think we are misusing "term" and "incompatibilities" in our conversation. Incompatibilities are always true, so there is no risk of them being removed after z backtrack since backtracking does not touch incompatibilities, just terms in the partial solution

view this post on Zulip Matthieu Pizenberg (Mar 13 2021 at 12:07):

So 2. Is fine since the code that builds the range of potentially valid versions always takes into account all incompatibilities

view this post on Zulip Matthieu Pizenberg (Mar 13 2021 at 12:08):

Then again I can't point you exactly to the related code. I'll try to next week if you don't figure it out before

view this post on Zulip Eh2406 (Mar 13 2021 at 14:23):

No rush, I will not have time to work on this deeply for a few weeks.
I did some experiments and checking for 1. ( if it happens at all ) was found to not to be faster on the elm benchmark. 3. was found to have some ~10% improvements. So worth exploring some day.

Sorry o think we are misusing "term" and "incompatibilities" in our conversation. Incompatibilities are always true, so there is no risk of them being removed after z backtrack since backtracking does not touch incompatibilities, just terms in the partial solution

I meant that the partial solution stores the terms implied by some set of incompatibilities. When we backtrack we are shrinking that set ( by removing decisions and the terms implied by them ). An incompatibility with one term like "no versions available", will be the first to have its term removed as it was only discovered after the most recent decision. But in some sense is always relevant as it does not depend on any decision, it is almost satisfied from the beginning.

view this post on Zulip Eh2406 (Mar 15 2021 at 13:39):

I did some quick analysis of the complexity of the packages in the elm benchmark. And I think it is very heavy on small dependence trees. There are only a handful of cases with >200 incompatibilities. (for reference pubgrub-rs has about 200 dependencies in the lockfile) So it is possible that some things improve our big-O but slow down the elm benchmark.
I am thinking of creating some kind of config file that specifies supsets to use as additional benchmarks. Maybe the most populer, the most complicated, and the ones that error, or something.


Last updated: Oct 21 2021 at 21:20 UTC