Unified (userland) Memory Management I [long]
Posted: Wed Apr 27, 2016 3:18 pm
I'm working in an embedded/appliance market on custom hardware.
So, no constraints as to API's, porting legacy code, etc. The
environment is real-time, multithreaded and object-based. The pieces
of memory being managed are large-ish; the amount of memory available
is significant.
I'm unifying the user interface to manage memory -- no need for
separate API's for partitions/buffer pools vs. dynamic allocation.
These calls apply to a logical address space.
To that end, there are really just 4 interfaces:
exists.
The allocate request requires an indication of the HEAP against which the
request is made. Of course, it must exist, else FAIL. And, a SIZE for the
desired piece of memory. On success, a handle/pointer to the memory
fragment allocated is returned along with the actual size of the fragment.
The caller is expected to remember this handle. (well, if he ever expects
to release the fragment back to the heap!)
Nothing magical, here. And, nothing predictable, either! The caller has
no way of knowing how a request will be satisfied, if it will be satisfied,
how long the operation will take, etc.
To give the caller control over these aspects of the operation, the
allocation request is augmented with a variety of options that govern
the choices made inside the allocator.
[changing focus, for a moment, to expose some key details...]
The release operation takes the handle to a previously allocated fragment
AS RETURNED BY the allocate operation. Plus, the handle to the heap
from which the request was granted. Specifying an invalid handle or
the incorrect heap results in FAIL.
Additionally, the release operation allows the caller to specify how the
fragment should be reintegrated into the "Free List" (maintained for the
specified heap):
opts to WAIT indefinitely, the system will eventually kill it and reclaim its
resources. The TIMEOUT option gives the thread an opportunity to
regain control before that happens and, presumably, effect another
strategy.]
It should be obvious that a user can arrange to KNOW what the free list
looks like by carefully releasing fragments and constraining the actions
that the memory manager applies to them. E.g., if it has a set of
many fixed size fragments, it can force the memory manager to leave
these fragments in their current, fixed sizes so future requests
(coincidentally, for exactly these same sizes!) will be easily and
immediately handled -- by ANY fragment that is available (i.e., no
need to search through the free list if they're all the right size!)
As such, this gives you a buffer pool/partition capability using the
same framework (and code) that would otherwise have been used by
the (dynamic) memory manager.
[(sigh) This is getting long; but, if I elide details, someone will surely ask for them!]
The allocate request similarly takes options that govern how the
fragment in the free list is selected -- and processed -- to satisfy the
request.
With these complementary operations, a diligent caller can know how they will be
using a particular heap and structure their requests to minimize the work done
by the memory management subsystem and obtain predictable/deterministic behavior
from it.
The number of options can usually be hidden inside a macro -- as a caller is unlikely
to be changing them from one invocation to the next. This helps hide complexity.
The options provided by the caller can be exploited by the memory management
routines to optimize how it provides its services. E.g., if the caller is always
releasing fragments and reintroducing into the free list by SIZE, then a subsequent
allocate request for FIRST_FIT effectively gives BEST_FIT -- as any fragments
encountered in the free list AFTER the first one large enough to satisfy the
request will be larger than this first one!
So...
Anything I have missed or failed to consider? (I want to be sure the groundwork
is in plce before posing more detailed questions)
With a bit of tinkering with options, you can see how you can create a buffer
pool from a heap (by repeatedly allocating AS_REQUESTED buffers TRIMmed
to that size; then releasing AS_ALLOCATED to the TAIL of the free list --
the HEAD would yield equivalent functionality but execute in longer time
as the next request could be "prematurely" handled by a previously released
buffer -- even though a large portion of the free space is still maintained
as a single large fragment, not yet trimmed by allocations!)
So, no constraints as to API's, porting legacy code, etc. The
environment is real-time, multithreaded and object-based. The pieces
of memory being managed are large-ish; the amount of memory available
is significant.
I'm unifying the user interface to manage memory -- no need for
separate API's for partitions/buffer pools vs. dynamic allocation.
These calls apply to a logical address space.
To that end, there are really just 4 interfaces:
- create_heap
- destroy_heap
- allocate
- release
exists.
The allocate request requires an indication of the HEAP against which the
request is made. Of course, it must exist, else FAIL. And, a SIZE for the
desired piece of memory. On success, a handle/pointer to the memory
fragment allocated is returned along with the actual size of the fragment.
The caller is expected to remember this handle. (well, if he ever expects
to release the fragment back to the heap!)
Nothing magical, here. And, nothing predictable, either! The caller has
no way of knowing how a request will be satisfied, if it will be satisfied,
how long the operation will take, etc.
To give the caller control over these aspects of the operation, the
allocation request is augmented with a variety of options that govern
the choices made inside the allocator.
[changing focus, for a moment, to expose some key details...]
The release operation takes the handle to a previously allocated fragment
AS RETURNED BY the allocate operation. Plus, the handle to the heap
from which the request was granted. Specifying an invalid handle or
the incorrect heap results in FAIL.
Additionally, the release operation allows the caller to specify how the
fragment should be reintegrated into the "Free List" (maintained for the
specified heap):
- it can be placed at the HEAD of the free list, TAIL, inserted before the
first free fragment having an ADDRESS exceeding that of the newly released
fragment or before the first free fragment having a SIZE greater than the
size of the fragment being released - once inserted into the free list, it can be left AS_ALLOCATED as a free-standing
fragment (retaining the size it had "in the wild"); or, if can be merged with
"adjacent" fragments to COALLESCE and (possibly) reduce the number of
fragments in the free list as well as enlarging those.
- the degree of heap sharing between potentially competing threads (i.e.,
allow the request to proceed UNFETTERED as if there are no other competitors
vying for these resources vs. ensuring ATOMIC operation in the presence of
potential competitors) - the extent of blocking tolerated by the request (i.e., return IMMEDIATEly
if FAIL, block in the hope of completing the desired service but no longer
than some specified TIMEOUT interval, WAIT indefinitely)
opts to WAIT indefinitely, the system will eventually kill it and reclaim its
resources. The TIMEOUT option gives the thread an opportunity to
regain control before that happens and, presumably, effect another
strategy.]
It should be obvious that a user can arrange to KNOW what the free list
looks like by carefully releasing fragments and constraining the actions
that the memory manager applies to them. E.g., if it has a set of
many fixed size fragments, it can force the memory manager to leave
these fragments in their current, fixed sizes so future requests
(coincidentally, for exactly these same sizes!) will be easily and
immediately handled -- by ANY fragment that is available (i.e., no
need to search through the free list if they're all the right size!)
As such, this gives you a buffer pool/partition capability using the
same framework (and code) that would otherwise have been used by
the (dynamic) memory manager.
[(sigh) This is getting long; but, if I elide details, someone will surely ask for them!]
The allocate request similarly takes options that govern how the
fragment in the free list is selected -- and processed -- to satisfy the
request.
- fragment selection strategy: FIRST_FIT, LAST_FIT, BEST_FIT, WORST_FIT,
EXACT_FIT, NEVER_FIT (the latter being a way to stress test applications) - allocation trimming: TRIM, UNTRIMMED (i.e., if the fragment is larger than
requested, should it be trimmed to maximize the amount of memory remaining
in the free list?) - free remainder: TAKE_HEAD, TAKE_TAIL (which part of the TRIMmed fragment
should be returned to the caller vs. retained in the free list) - size homogeneity: AS_REQUESTED, POWERS_OF_TWO, LISTED_VALUES,
MULTIPLES (how should the specified SIZE be interpreted in locating a suitable
free fragment)
With these complementary operations, a diligent caller can know how they will be
using a particular heap and structure their requests to minimize the work done
by the memory management subsystem and obtain predictable/deterministic behavior
from it.
The number of options can usually be hidden inside a macro -- as a caller is unlikely
to be changing them from one invocation to the next. This helps hide complexity.
The options provided by the caller can be exploited by the memory management
routines to optimize how it provides its services. E.g., if the caller is always
releasing fragments and reintroducing into the free list by SIZE, then a subsequent
allocate request for FIRST_FIT effectively gives BEST_FIT -- as any fragments
encountered in the free list AFTER the first one large enough to satisfy the
request will be larger than this first one!
So...
Anything I have missed or failed to consider? (I want to be sure the groundwork
is in plce before posing more detailed questions)
With a bit of tinkering with options, you can see how you can create a buffer
pool from a heap (by repeatedly allocating AS_REQUESTED buffers TRIMmed
to that size; then releasing AS_ALLOCATED to the TAIL of the free list --
the HEAD would yield equivalent functionality but execute in longer time
as the next request could be "prematurely" handled by a previously released
buffer -- even though a large portion of the free space is still maintained
as a single large fragment, not yet trimmed by allocations!)