@pnkfelix are you online ?
yeah but i'm in a mtg right now; it should be over i think in 60min
@pnkfelix let me know
okay mtg is over
lets update the topic to point to the issue where we've been talking
(oh, wait, I don't know if such links work for repos outside rust-lang/rust ...)
(((apart from polonius and chalk, according to @davidtwco )))
I can at least point us at the Tracking issue for custom allocators; that is better than nothing. (Even though our dialogue thus far has instead been on https://github.com/alexcrichton/jemallocator/pull/64
so I think I see your point, and I am unsure whether you are right for jemalloc or not
but I see your point in general that a general allocator might want to error from shrink_to_fit, e.g., if it shrinks to a different size-class even if it doesn't move the allocation
xallocx to the best of my knowledge never does that
(i've asked for clarification in jemalloc's gitter)
xallocx never moves an allocation to a different size class when shrinking/growing in place
in particular, when shrinking, the returned size is always >= new_size
right; it just returs the current size, right?
which means it never errors (errors are reported by returned a size smaller than the requested one)
my point is that that will subseuqently casue dealloc to fail
(where "fail" in this case means "leak")
no, it returns the usable_size, and that's the thing, dealloc must work for sizes in [requested, usable_size]
otherwise usable_size would be useless
that is, xallocx, when shrinking, always returns
>= new_size, and the size can be
>= old_size too, and all sizes in [new_size, usable_size] can be used with sdallocx
all sizes in
[new_size, usable_size] can be used with sdallocx ?
[requested_sz, usable_size] range for
Alloc::dealloc is a guarantee of the
Alloc trait too
new_size is in a different (smaller) size class?
that's the scenario that I thought was outlined in https://github.com/rust-lang/rfcs/pull/1398#issuecomment-164988117
that's a good question, it appears to work
"work" as in "does not assert-fail" ?
or "work" as in "does not leak memory" ?
wait, I'll send a PR with a test, so that you can see the code (it doesn't change anything in
shrink_to_fit, just adds a test for it
i.e. the above linked issue comment said that such a scenario would have the deallocation require be silently ignored
... okay. But I'm not 100% sure we can readily check for memory leak scenarios
(we used to have valgrind tests that would catch leaks but I don't know if they are part of what bors requires for things to past)
In any case I definitely think that the documentation for
Alloc trait needs some exposition on this detail
@pnkfelix i've set valgrind in jemallocators CI, but only for memory errors, no leak detection I think, shouldn't be hard to add
i've asked in jemalloc upstream precisely for what happens in this case, whether we can dealloc with size
1 even if
xallocx returns the original size
maybe mark that PR as WIP for now?
@pnkfelix it fails, so indepentendly of what happens upstream, the answer will be that we have to change shrink_in_place implementation
the correct way to do this is to say that an allocation shrinks only if its usable_size shrinks, and otherwise to error
okay. So I take that to mean that I was right to start asking questions here. :smile:
the correct way to do this is to say that an allocation shrinks only if its usable_size shrinks, and otherwise to error
are you suggesting that the client should be forced to query
usable_size after calling
shrink_to_fit?? ... no, you're just saying that's a way to describe the specification of
@pnkfelix no, I think that we have to do two things:
new_size == Layout::size() and return
if the usable_size of the new allocation block is not smaller than the old size, return
okay. So those are changes to the implementation of the
Alloc for jemalloc.
Do you think that the API for the
fn shrink_to_fit in the
Alloc trait needs changing, apart from clarifying its documentation?
yes, the docs of
shrink_in_place already say that the
new_size can be equal to
Layout::size() - in the docs, we have to explain better what
For example, if you allocate some requested size, you can "grow"/"shrink" it in place as much as you want inside the same bin, at least for jemalloc, and
Alloc::usable_size() lets you grow it.
however, when shrinking, while one could shrink inside the same bin, the allocation itself doesn't change, and we consider it an error
that is, grow_in_place inside the same bin returns success, but shrink_in_place inside the same bin returns error because the usable_size of the allocation did not change
And your point is that
shrink_in_place inside the same bin should return
(as in, the subsequent
dealloc call will work)
i don't know, we can't implement this efficiently with jemalloc, and
Alloc:: does not have a method to query the "minimum size of allocations inside this bin"
usable_size returns a lower bound doesn't it? That's meant to represent that minimum size of such allocations; that's why we added that lower bound.
yeah, but it does not have to be accurate
as in, it is typically the requested size, and the largest size is computed e.g. via
sure, if the underlying allocator doesn't give a way to observe the lower bound, then you can't do anything about it. okay.
yeah, there is another fundamental problem with
grow_in_place does not have
grow_in_place lets you specify a size, and you get an allocation >= than new_size, but with
shrink_in_place you also get an allocation >= new_size, but it can be much larger to the point of this not being useful
So the two ways I could imagine to address that are:
1. Change the API of
shrink_in_place to include such info about the resulting size category for the "new" state of the given block
2. Tell clients that they may want to query
usable_size after calling
(and accept that
usable_size doesn't always provide the most precise answers)
((but it should always provide conservatively correct answers...))
in some sense, it is implied that if you can dealloc the new smaller allocation resulting from
shrink_in_place, it will be in the same size class at leat - so this might not be as bad as I put it.
Note also that
xallocx returns the
usable_size, and that we could use this information in
RawVec for example, so that we would have to add
shrink/grow_in_place_excess methods as well. It might be worth it to just change the signatures of these to always return the
I implemented a new
mallocx-like API in jemalloc that always returns Excess for a C++ paper that proposes sized allocation, and this has been merged in jemalloc upstream already - so the whole "should we have _excess variants of all methods" discussion might be worth revisiting once
jemallcator supports this API (there is an open PR about bumping the jemalloc version to the dev branch again)
Cool! Can you provide a pointer to the jemalloc PR? just curious.
i've pushed a commit to fix
shrink_in_place to https://github.com/alexcrichton/jemallocator/pull/92
we'll see if tests pass everywhere now or not
FWIW, the general recommendation is that
Alloc::usable_size is a bad idea and I think we should remove it.
It basically makes it impossible to gather statistics about memory requests. If someone wants to introspect this,
jemallocator::ffi::nallocx and friends can be used, but it shouldn't be a "general purpose tool", and it should not be widely used
okay, reading over the comment you added to https://github.com/alexcrichton/jemallocator/pull/92, I think I finally understand the scenario you are describing
I understand the drawback you describe, in terms of not being able to observe memory requests. My memory of the original motivation (for
Alloc::usable_size) is that some people were concerned about being forced to round-trip requests through the allocator. Fundamentally I think that concern is inherently at odds with a desire to observe memory requests.
i've shared that with the jemalloc devs, maybe they have a better recommendation, but the underlying problem here is that
xallocx is basically
realloc_in_place, which is what we used to have and worked "as expected" for that API, however, when that got adapted to
shrink_in_place, things got messy
To be honest (and this may be obvious to you), I have not been driving the
trait Alloc design for quite some time now; I'm not a stakeholder in whether it has
grow/shrink_in_place. I assume you've already read the comment threads discussing that split.
yes, the split was motivation is good: its possible to provide different paths for both, and
realloc_in_place mixes them inside one function
now that we have some experience with
shrink_in_place, we have learned new things, and just because they don't work "perfectly" with
jemalloc does not mean that's the fault of these APIs
Your mentioning of
_excess methods above does lead me to wonder whether indeed adding the info about
Excess to the
shrink_in_place could be a way to kill two birds with one stone here
if anything, the problem here is that
usable_size < new_size as error while growing, but
usable_size = original_size as error while shrinking
and the error while shrinking is sometimes "success"
so just so I have this right: if I request a
new_size < original_sizeand it happens to be in the same size bin in as
original_size, jemalloc is allowed to return
if original_size is the usable_size of the bin, then yes
I guess we could check that with
original_bin_usable_size = nallocx(original_size)
but if we don't inline jemalloc into Rust, that could make things slightly more expensive (and in rare pathological cases way more expensive) for little win
nallocx is computed at the beginning of
xallocx to find the size class of the original allocation, so if we inline jemalloc into Rust, doing that becomes free
sorry to be pedantic here, but I just want to really make sure I understand
don't worry, what i mentioned might be slightly incorrect
if the original_size is the usable size of the bin, then this is always the case
if the original_size is not the usable_size of the bin, then I don't know what happens
if i request a
new_size < original_size and the resize attempt fails, it will again return
jemalloc's docs don't mention it, and I don't think all code paths do the same thing here
I suppose this is exactly what you said above
So what I take from this is that what jemalloc expects/wants us to do is to not expect to be able to request a size
S and, on success, subsequently always be able to use
S as the size in a subsequent
let's put it 100% clear - if
new_size < original_size is requested and shrinking fails then what
xallocx returns is not documented anywhere.
xallocx always returns an
usable_size as far as I can tell from reading the code and the docs, but it might return the original_size here sometimes as well - obviously, if the original size is the usable size of the bin, then both match, but I don't know if both have to match
but instead, you can request a size
xmallocx will return some size
S1 <= S2 <= original_size), and you are expected to subsequently use
S2 for the dealloc
yes, that's correct
but I don't know if this is intended, because this is the only api where it happens, and only when shrinking
Should we then consider changing the API of
shrink_in_place to return that
in all other APIs, you can use the [requested, usable] range of sizes to deallocate
the problem is when shall we error
well other Allocators would then have the option of erroring
while jemaloc would now always succeed, as you originally suggested
that is, if we return the size that
xallocx returns, when is that value "success" and when is it "error" ? we would just be pushing that to
the crucial detail is the idea of returning the "new" size class to be used
yes, the error condition might not be important here
Indeed, it is pushing the obligation to call dealloc with the correct value (S2) onto the client, rather than letting them assume that S1 will always work
I mean, what can one do with the error of
shrink in place ?
give up and try a different value
or bubble the failure out
try a different value "in place" ?
I'm just spit-balling here
I don't know what best practices would be
yeah, so I could imagine that if these fail, one moves to doing this out-of-place, but realloc works in place if it can
so that use case makes no sense
in the sense that if
shrink_in_place always succeeds (for jemalloc) then realloc will behave in unexpected ways?
Its not like jemalloc couldn't override the implementation of
realloc if that makes sense for jemalloc's impl of
nono, in the sense that if you want to shrink in place, but are ok with shrinking "out of place" if that fails, you can just use
realloc directly instead since it will do exactly that
oh oh oh
no need to first call shrink_in_place, then realloc
yes you are right
but I'm not a good resource to ask about the use cases for this method
so i am going to wait till we clarify with upstream whether
xallocx always returns the usable size of the original bin on failure - if that's the case, we can use
nallocx to be "perfectly" accurate
and then we can fix things, and maybe jemalloc's dev suggest adding apis for these use cases or not, or maybe there is a better way to do all of this
in the sense that you would make
shrink_in_place react to the return value of
original_size by then checking if the requested size falls into the current size class (via
nallocx), and if so, still return
Ok(()) (and if it doesn't fall into the current size class, then return
kind of, I'll update the PR right now under that assumption, because I am pretty confident that if
xallocx does not always return the
usable_size then that's a bug in jemalloc.
I ask because I originally thought you were suggesting that we call
nallocx first, before
xallocx, which would then mean you could always just treat a return value of the
orignal_size as an error, and you wouldn't have to make any queries upstream.
but I'm guessing you don't want to add that overhead to the hot path?
(where its "only" overhead if the code to call
xallocx isn't not inlined and then the two inlined calls to
nallocx are CSE optimized into a single expression...)
Sorry; I really feel like I'm belaboring the same points you already laid out, but I just want to make sure I understand the plan
((which is strange because, as I said above, I'm really not one of the stakeholders here anymore))
@pnkfelix i've pushed a commit that should make it accurate at the cost of an extra
nallocx call in the case in which the allocation is not shrunk
It sounds to me like that should be both safe (in terms of ensuring the subsequent calls to dealloc will be correct) and nearly as fast as what we had before ... we'll just be paying, at most, in code size if the code is insufficiently inlined. Right?
(i mean, the internal cost of the extra
nallocx call sounds negligible compared to the external cost of returning
Ok would have worked!)
probably, i'd rather just get this correct first, I think its not the first time these have been amended, but I also think that we regressed previous bug fixes here when the
GlobalAlloc refactors were merged (things were fixed in jemallocator but not in rust-lang/rust and then rust-lang/rust code was copy-pasted back)
I agree, get it right first is the way to go
Wow I hadn't noticed that the original code used
size >= new_layout.size() instead of
size > new_layout.size() . That's pretty egregious.
Wow I hadn't noticed that the original code used size >= new_layout.size() instead of size > new_layout.size() . That's pretty egregious.
@pnkfelix I think that the code using
realloc_in_place back then might have been correct, but since the split was made both grow/shrink in place have always been at least partially broken. I've just merged a PR to fix
jemalloc devs still have not commented about