Automatic memory management

Programming, for all ages and all languages.
User avatar
Brendan
Member
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!
Contact:

Re: Automatic memory management

Post by Brendan »

Hi,
onlyonemac wrote:
Brendan wrote:The problem is when there's 10 GiB of RAM being wasted and all the disk drives and all the CPUs are idle; where the OS does absolutely nothing to improve "future performance". It should be prefetching. It should also be checking file systems for errors (to avoid the "You can't use your entire computer for 2 frigging hours because all of the file systems haven't been checked for 123 days" idiocy when it is rebooted), and should probably also be optimising disk layout (e.g. de-fragmenting, data de-duplication, compressing files that haven't been used for ages, etc).
File system checks aren't needed if you use a journalling filesystem. I can only assume that your system is mis-configured; since ext4 the startup filesystem checks have been disabled by default.
My main file systems are using ext4. Directly from "/etc/fstab":

Code: Select all

# <fs>                  <mountpoint>    <type>          <opts>          <dump/pass>
/dev/md0                /boot           ext2            noatime         0 2
/dev/md1                /               ext4            noatime         0 1
/dev/md2                /home           ext4            noatime         0 2
/dev/sdc2               /backup         ext3            noatime         0 2
/dev/sdc3               /backup2        ext4            noatime         0 2
/dev/cdrom              /mnt/cdrom      auto            noauto,user     0 0
Note that sometimes (not always) the file system check that happens during boot does find and fix problems in "/" or "/home". Because the computer isn't rebooted often (mostly only when there's a power failure that takes too long for the UPS to cope with); I worry a little about the computer running for 100+ days (between reboots) with nothing checking for problems at all.
onlyonemac wrote:Also de-fragmentation is not necessary because of techniques in the filesystem (such as delayed allocation) to reduce the effects of fragmentation.
No optimisation of any kind is ever "necessary" (it's always just desirable/beneficial).
onlyonemac wrote:The only thing I've observed a Linux system doing when idle is swapping RAM to disk if the amount of free RAM is below the threshold, or swapping it back to RAM if there's enough free RAM. Other than that my Linux systems always keep their hard drives idle when not in use, and frankly to me that's a good thing because it means that the machine's ready to spring into action as soon as I need it to, rather than slowing my use of it down because it's still trying to do a whole pile of other disk activity.
I'd rather have a system that's ready to spring into action quickly (without disk IO) than a system that's ready to lurch into action slowly (because of disk IO that could've/should've been avoided by pre-fetching). When the hard drives are not in use (e.g. everything that should be prefetched has been prefetched, etc) the hard drives will be idle.
onlyonemac wrote:On the other hand, if I walk up to a Windows system that's been sitting idle for the last half an hour, it's always doing so much disk activity that for the first five minutes after sitting down at it I pretty much can't do anything.
That's most likely because of power management (e.g. the OS saving everything to disk so it can enter "sleep mode" and shut down most of the computer; and needing to turn memory and CPUs back on and load RAM contents back from disk when it wakes up). It has nothing at all to do with prefetching or swap in the first place.
onlyonemac wrote:I'd far rather have my hard drive immediately available to load all the files for the web browser than to have it "prefetching" the email client that actually I'm not going to use because I get my email on my phone. It doesn't matter how quickly it stops doing any background activity when I sit down, it's still going to take a while to realise that I'm needing the hard drive's full bandwidth now, and when we're talking about matters of seconds, any length of time is too long for something that maybe only speeds things up once in a while.
a) If you think having to load files from disk because they OS is crap and didn't prefetch them is "better" then you're a deluded fool.

b) If you think a tiny (~5 ms) delay for the absolute worst possible case (OS busy prefetching the wrong thing at the time and has to stop the prefetch first) is noticeable when the OS has to load an application (and its DLLs, resources, etc) from disk (which will cost several orders of magnitude more than that pathetically tiny delay) then you're a deluded fool.

c) If you think that (for the average case, not some rare pathological case) the completely unnoticeable and irrelevant disadvantage (from b) is more important than the massive advantage (from a) then you're a deluded fool.
onlyonemac wrote:
Brendan wrote:Of course this means searching for things to free (which is slower that being told what to free when); so essentially it sacrifices performance for the sake of programmer laziness.
It's not just programmer laziness; memory leaks due to forgetting to free pointers are pretty much non-existent (as are segfaults due to attempting to free a pointer twice, or other undefined behaviour when a pointer is mistakenly freed when it is still needed or without being marked as freed). These are all real issues, a pain to debug and even more of a pain to the end user.
Programmers that aren't being lazy use tools (e.g. valgrind) to detect these problems, and then fix them. It's only programmers that are being lazy (that don't avoid, find or fix these problems) that benefit from garbage collection.

Note that for some software (prototypes/experiments/research, tools that are used once and discarded, etc) where development time is far more important than efficiency; you want programmers to be lazy. Being lazy (if/where appropriate) is a good thing, not a bad thing.


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
onlyonemac
Member
Member
Posts: 1146
Joined: Sat Mar 01, 2014 2:59 pm

Re: Automatic memory management

Post by onlyonemac »

The ultimate problem with prefectching is that it's never going to get everything right. There's just no way that it can. If it could work perfectly, then obviously it's going to speed things up, but most of the time it doesn't work perfectly.

Seriously, if you have such a problem with Linux's lack of prefetching then write your own prefetcher for Linux.

P.S. You don't need filesystem checks on those ext4 filesystems, so set the numbers in the last two columns to "0 0" to skip the check. But frankly you have such a low opinion of Linux that you deserve to wait for a filesystem check every time you boot your computer.
When you start writing an OS you do the minimum possible to get the x86 processor in a usable state, then you try to get as far away from it as possible.

Syntax checkup:
Wrong: OS's, IRQ's, zero'ing
Right: OSes, IRQs, zeroing
User avatar
Brendan
Member
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!
Contact:

Re: Automatic memory management

Post by Brendan »

Hi,
onlyonemac wrote:The ultimate problem with prefectching is that it's never going to get everything right. There's just no way that it can. If it could work perfectly, then obviously it's going to speed things up, but most of the time it doesn't work perfectly.
It's a mistake to assume any optimisation (including prefetching) always has to get everything right. To be beneficial, it only needs to be slightly better on average, and anything more than that is a nice bonus.
onlyonemac wrote:But frankly you have such a low opinion of Linux that you deserve to wait for a filesystem check every time you boot your computer.
The fact that (sometimes, not always) the file system check done during boot does find problems is "proof enough for me" that some sort of checking is necessary for long term stability. Of course we could both complain that some sort of checking shouldn't be necessary to begin with; but realistically (unless you can guarantee that hardware is always flawless) some sort of checking is always going to be necessary (even when the software is perfect). Given that some sort of checking is necessary, I'm only complaining that it's an after-thought (implemented as an external checker that can't work in the background and requires downtime) and not built into the design of the file system code itself (that can work in the background and doesn't increase downtime).
onlyonemac wrote:But frankly you have such a low opinion of Linux that you deserve to wait for a filesystem check every time you boot your computer.
For OSs; I deliberately try to find ways to improve on existing things. There are only 2 outcomes - either I can find one or more ways to improve something (and therefore have a low opinion of it), or I fail to find any way to improve something (and therefore have a high opinion of it). There is nothing that I have a high opinion of.

Note that I consider it an OS designer's job to try to find ways to improve on existing things. All OS designers should do this (but not all OS developers are OS designers).


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
onlyonemac
Member
Member
Posts: 1146
Joined: Sat Mar 01, 2014 2:59 pm

Re: Automatic memory management

Post by onlyonemac »

Actually the filesystem does have a "check" built in. It's not designed to perform background checking, but it's designed to not require checking. Your filesystems seriously shouldn't be giving errors; I've never had an ext4 filesystem give errors when checked. Similarly, they don't need defragmentation because the filesystem allocates blocks in a way that minimises fragmentation. While background maintenance may, depending on one's point of view and usage patterns, be better than on-demand or scheduled maintenance nothing beats a system that doesn't need any maintenance in the first place.
When you start writing an OS you do the minimum possible to get the x86 processor in a usable state, then you try to get as far away from it as possible.

Syntax checkup:
Wrong: OS's, IRQ's, zero'ing
Right: OSes, IRQs, zeroing
User avatar
iansjack
Member
Member
Posts: 4725
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Automatic memory management

Post by iansjack »

onlyonemac wrote:Actually the filesystem does have a "check" built in. It's not designed to perform background checking, but it's designed to not require checking.
Kinda makes you wonder why fsck.ext4 exists, doesn't it? ;)
onlyonemac
Member
Member
Posts: 1146
Joined: Sat Mar 01, 2014 2:59 pm

Re: Automatic memory management

Post by onlyonemac »

iansjack wrote:
onlyonemac wrote:Actually the filesystem does have a "check" built in. It's not designed to perform background checking, but it's designed to not require checking.
Kinda makes you wonder why fsck.ext4 exists, doesn't it? ;)
Because it is possible to break the filesystem, such as through buggy software, hardware failure, etc. But the filesystem doesn't break itself through use or through unexpected shutdowns.
When you start writing an OS you do the minimum possible to get the x86 processor in a usable state, then you try to get as far away from it as possible.

Syntax checkup:
Wrong: OS's, IRQ's, zero'ing
Right: OSes, IRQs, zeroing
User avatar
iansjack
Member
Member
Posts: 4725
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Automatic memory management

Post by iansjack »

So Brendan is correct - filesystem checks may be necessary from time to time. The cause of possible corruption is irrelevant; if it can happen it can happen.
User avatar
Brendan
Member
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!
Contact:

Re: Automatic memory management

Post by Brendan »

Hi,
onlyonemac wrote:Actually the filesystem does have a "check" built in. It's not designed to perform background checking, but it's designed to not require checking. Your filesystems seriously shouldn't be giving errors; I've never had an ext4 filesystem give errors when checked. Similarly, they don't need defragmentation because the filesystem allocates blocks in a way that minimises fragmentation. While background maintenance may, depending on one's point of view and usage patterns, be better than on-demand or scheduled maintenance nothing beats a system that doesn't need any maintenance in the first place.
For file systems that actually work and support changing the size of files and/or deleting files after they're created (e.g. excluding file systems like ISO9660 where you can find the optimum disk layout before burning the CD and it stays like that forever); I don't think it's possible to design a file system that:
  • Needs de-fragmentation (rather than just benefiting from de-fragmentation).
  • Never benefits from de-fragmentation of something (either free space, or allocated space, or both).
For a pathological case; consider what happens when you fill the hard drive with 1 KiB files, then delete every second file, then fill up the freed space with 10 MiB files. For this to work at all, you must either allocate the fragmented free space "as is" (causing the 10 MiB files to be fragmented), or de-fragment free space before creating the new files.

For ext4 specifically, a quick google search leads to this question asking if ext4 benefits from de-fragmentation on "http://superuser.com/", which indicates that it is beneficial.

Note that (as far as I can tell) ext4 tries to reduce "allocate space fragmentation" by leaving gaps between files when they're created (so files can use the gaps if/when their size is increased); which increases "free space fragmentation" (and becomes a problem if/when the disk doesn't have much free space left) and also increases seek times on "half empty" file systems (the heads have to be moved further due to the free space between files).

If it is impossible to design a file system that never benefits from de-fragmentation of something (either free space, or allocated space, or both); then there's only 4 alternatives (in order of "worst" to "best"):
  • Do nothing (leave things fragmented) and get "worse than possible" performance
  • Do nothing (leave things fragmented) by default; but provide an external utility that user/admin can use if they take the file system offline
  • Do nothing (leave things fragmented) by default; but provide an external utility that user/admin can use while the file system is in use
  • Have built-in support for optimising/de-fragmenting while the file system is in use (without an external utility).
Linux seems to use the second option. Windows seems to use the third option.

OS X seems to use a mixture - they have an inbuilt feature called "Hot-File-Adaptive-Clustering" that looks like it optimises/de-fragments frequently used files (the last/best option), but this may only work for frequently used files (and might not work for large files or files that are only "slightly fragmented" either) and so it doesn't always avoid the need for additional utilities to de-fragment (the second option).


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
User avatar
iansjack
Member
Member
Posts: 4725
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Automatic memory management

Post by iansjack »

Although it is obviously impossible to create a file system that is free from the effects of fragmentation in all circumstances, it would be possible for the defragmentation to be done on the fly, at the time a file is created or extended, with just enough data being rearranged to allow a reasonable amount of contiguous free space. Whether you prefer the performance hit every time a file is altered or prefer that the work is done while the system is idle, or at a time of the user's choosing, is a design choice. I know which I prefer.
onlyonemac
Member
Member
Posts: 1146
Joined: Sat Mar 01, 2014 2:59 pm

Re: Automatic memory management

Post by onlyonemac »

Of course you *can* defragment an ext4 filesystem, and I believe you can even do this while the filesystem is online (although not as effectively), but the bottom line is that the way it's designed greatly reduces the need for defragmentation. I know for one that defragmenting my ext4 filesystems doesn't improve their performance dramatically like defragmenting a Windows filesystem does, and for that matter before defragmentation most of my files were not split across more than 2 fragments - that's a filesystem that's been used for a year, is almost full, and has had many small files created and deleted. Ultimately, when the OS is only accessing the file requested at any one time, and not a whole bunch of other files, file and free space fragmentation are not going to cause that much of a performance hit.
When you start writing an OS you do the minimum possible to get the x86 processor in a usable state, then you try to get as far away from it as possible.

Syntax checkup:
Wrong: OS's, IRQ's, zero'ing
Right: OSes, IRQs, zeroing
User avatar
Schol-R-LEA
Member
Member
Posts: 1925
Joined: Fri Oct 27, 2006 9:42 am
Location: Athens, GA, USA

Re: Automatic memory management

Post by Schol-R-LEA »

Muazzam wrote:Yeah, I knew that. But what's the point of garbage collection at all!
Unfortunately, no one here has properly answered this question, probably because it is not the question Muazzam really wanted to ask - or perhaps no one here wanted to believe that someone writing an OS would ask the question he was intending, which is, "what is dynamic memory management, and why would you want to use it in an application?"

To be fair, heap allocation rarely comes up in assembly language, at least today, as most assembly code is written stick to a combination of stack allocation and static allocation. Using dynamically resized data structures is mainly a high-level language technique - it can be done in an assembly language, of course, assuming that the OS exposes an allocate/free system call (e.g., sbrk in Linux), or provides a library with the equivalent (malloc(), HeapAlloc()), but few have the patience to write such code.

Let me use an explanation I wrote in another forum, with some modifications specific to this topic. I am going to be as detailed and clear as possible, perhaps insultingly so, but I don't have a clear sense of how much you already know so there isn't really much of an alternative. I am aware that you probably know most of this already, but without knowing what you don't know, I can't afford to skip anything.

The first thing you need to know is there are four different ways memory can be allocated to an application program. The first is read-only memory, which is where things like constants and string literals are stored; on protected/long mode systems, trying to change something in read-only memory will result in a protection fault.

The next kind of memory is global memory, where global variables are set up. This is a fixed amount of space given to the program when it starts, and cannot be increased or decreased. Some global memory may be initialized when the program starts, using values that are part of the executable image, but there may be a separate area of global memory (the so-called .bss section) is simply an amount of memory that the loader clears when the program starts - the executable file just stores the size of the .bss section, to avoid having a large blob of useless empty memory in the file.

The third kind of memory is local (or automatic) memory, which is allocated when a function begins and deallocated when it returns. While there are a few different ways that this could be implemented, it is almost invariably done using a stack, a data structure so ubiquitous that x86 processors have built-in hardware support for it, and most other designs have something similar, either as a dedicated register or as a register usage convention.

So, what is the stack, anyway? It is a region of memory which is set aside for temporary values, combined with a stack pointer (a register specifically used for holding a pointer to the current item in the stack), which is used to act as a last-in, first-out data structure, conceptually resembling a stack of dishes or some other small, easily piled items. The top of the stack (usually the lowest used address in the stack, on x86 CPUs - stacks on PCs usually grow downward, for historical reasons - but I digress) is the memory address where the most recent item has been put. When you add something to the stack - something referred to as pushing the item onto the stack - the stack pointer gets incremented to point to the next address, and the item being pushed is copied to the newly-allocated stack location (actually, with the upside-down stacks usually used, it is usually decremented - subtracted by one - but again that's besides the point). When an item is removed - popped off of the stack - the stack pointer is simply decremented (or incremented), and the used memory is simply left as it is, ready to be reused.

Why is this so important that it has special hardware support? First, because it makes it possible to call functions or subroutines without having to hard-code the return address into the called function - when a function call is made, the address of the instruction following the call is pushed onto the stack (implicitly in the case of the x86, by the CALL instruction, explicitly in many RISC CPUs which use a linkage register for the return address) so that when the callee exits, the return operation (RET on the x86) can pop off the return address and jump to the address of the next instruction in the caller.

Second, it allows a the caller to pass the arguments to the callee by pushing them onto the stack, with the result that the callee can manipulate the arguments without worrying about changing them - the copy pushed onto the stack is treated as being 'local' to the callee. Whether the caller or callee has to clean up the stack afterwards is one of the questions that a calling convention has to answer, though most today leave it to the caller.

Third, when combined with a base pointer (also called a frame pointer), the stack makes for an easy way to hold a group of local variables for the callee function, something called an activation record or stack frame. Basically, the idea is that you push the current base pointer onto the stack, then copy the current stack pointer into the base pointer. This now becomes the base address of the activation record (hence the name 'base pointer'). Then the local values are implicitly 'pushed' onto the stack by decrementing the stack pointer to give enough space on the stack for them. To access the local values, you would use an offset from the base pointer, which would point to the location to set aside for it. When the function exits, you then copy the base pointer back to the stack pointer (implicitly popping off the locals), then pop off the old base pointer, restoring the state to the point of the call - if done correctly, the top of the stack should now point to the return address pushed by the calling instruction.

It is a complicated dance, and one which confuses almost everyone when they first encounter it. Stack discipline in assembly code is fairly tricky, but you probably already knew that.

This brings us to the last kind of memory, the heap or dynamic memory. This a region of memory that can be allocated to programs in pieces, at run time, without being automatically reclaimed when the function that allocated it ends (some languages have a special means of reclaiming dynamic memory after it is no longer in use called garbage collection, but that's not relevant here). Most operating systems have a system call to allow the program to request that the system allocate enough memory to hold one or more values of a given type; it returns a pointer to the newly allocated memory space. There usually is a corresponding system call which allows the program to return the allocated memory to the system. In between allocation and release, the memory becomes available for use, so long as there is at least one pointer to the memory block; you can pass the address of that memory around from one pointer to another, treating it as if it were the same as a pointer to any other non-constant memory. However, once the system releases that memory, the pointers to the memory are all invalid, and should not be used - preferably, you should null them after releasing the memory they point to. When the process exits, the operating system (should) release the memory used, even if it wasn't deallocated, but as long as the program is running, the memory will still be allocated until freed.

There are just three rules to follow with dynamic memory: don't lose the starting address of the memory block, or you won't be able to deallocate later; always remember to deallocate it after you are done with it; and don't try to access the memory after it has been deallocated. These rules are trickier to follow than they sound, however, as it is easy to lose track of where the memory is and how you are using it at any given time. Failing to release memory when it is finished is a memory leak, while a pointer that points to a freed block i(or one that is invalid for some other reason) s a wild pointer.

So, if it is so complicated, why bother with dynamic memory? Because you can make arrays of arbitrary size without knowing how big it has to be beforehand. Also, you can use it to allocate structures with pointers in them, which can then point to other dynamically allocated memory, letting you chain the allocated memory into more complex data structures such as lists and trees. It simply wouldn't be practical to make these structures without dynamic allocation.

This brings us to application memory management. Most of the time, programs do not fetch a new block of memory each and every time they need one; the overhead of system calls is too high, and it potentially can slow the whole system down as the OS manages the allocations. So the usual practice is to allocate larger blocks of memory than is actually needed, and have use that as a kind of scratch space for allocating memory from. The allocations and frees are usually wrapped up in a pair of library functions corresponding to the system calls, and the program only calls the OS again if it needs more. The local allocator has to keep track of the local heap memory, but usually this isn't too complicated.

However, this leaves the programmer to keep track of the actual pointers to the memory, which can be very complicated indeed. For this reason, various approaches for automating memory management have been devised, with garbage collection being the most common (though a related form called 'reference counting' is often used as well, and often confused with it). Basically, in garbage collection, the library functions require the program to get all pointer variables from the allocator, and a part of the program - the garbage collector - checks periodically to see if the blocks of allocated memory still have any pointers pointing at them. When a block has no pointers to it, the allocator frees it for the program, without any action on the part of the programmer. Older garbage collectors had to stop the program in order to do this, but some modern multithreaded ones can do so in parallel to the program's other threads.

While GC is most associated with interpreters, it is applicable to compiled code as well - it was originally developed for the original Lisp 1.5 compiler in 1961. It is possible to use a garbage collector as a library and requiring the programmer to follow the GC discipline, but it is far more common for it to be implemented in the language itself. It has a number of other advantages for both security and simplicity, especially in scripting languages - it allows the language translator to abstract away the allocation and deallocation of variable memory entirely - but there are distinct costs to it as well. There are endless variations on garbage collection meant to reduce the overhead of doing the collection, but a certain amount of overhead is inevitable when garbage collection is used.

Needless to say, many of these issues are relevant to memory allocation at the OS level, as well, but there it becomes even more complicated as the OS-devver has to address issues of the specific memory hardware.

I hope that this has made things clearer for you, rather than more obscure. Comments and corrections welcome.
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
User avatar
Muazzam
Member
Member
Posts: 543
Joined: Mon Jun 16, 2014 5:59 am
Location: Shahpur, Layyah, Pakistan

Re: Automatic memory management

Post by Muazzam »

Schol-R-LEA wrote:
Muazzam wrote:Yeah, I knew that. But what's the point of garbage collection at all!
Unfortunately, no one here has properly answered this question, probably because it is not the question Muazzam really wanted to ask - or perhaps no one here wanted to believe that someone writing an OS would ask the question he was intending, which is, "what is dynamic memory management, and why would you want to use it in an application?"

To be fair, heap allocation rarely comes up in assembly language, at least today, as most assembly code is written stick to a combination of stack allocation and static allocation. Using dynamically resized data structures is mainly a high-level language technique - it can be done in an assembly language, of course, assuming that the OS exposes an allocate/free system call (e.g., sbrk in Linux), or provides a library with the equivalent (malloc(), HeapAlloc()), but few have the patience to write such code.

Let me use an explanation I wrote in another forum, with some modifications specific to this topic. I am going to be as detailed and clear as possible, perhaps insultingly so, but I don't have a clear sense of how much you already know so there isn't really much of an alternative. I am aware that you probably know most of this already, but without knowing what you don't know, I can't afford to skip anything.

The first thing you need to know is there are four different ways memory can be allocated to an application program. The first is read-only memory, which is where things like constants and string literals are stored; on protected/long mode systems, trying to change something in read-only memory will result in a protection fault.

The next kind of memory is global memory, where global variables are set up. This is a fixed amount of space given to the program when it starts, and cannot be increased or decreased. Some global memory may be initialized when the program starts, using values that are part of the executable image, but there may be a separate area of global memory (the so-called .bss section) is simply an amount of memory that the loader clears when the program starts - the executable file just stores the size of the .bss section, to avoid having a large blob of useless empty memory in the file.

The third kind of memory is local (or automatic) memory, which is allocated when a function begins and deallocated when it returns. While there are a few different ways that this could be implemented, it is almost invariably done using a stack, a data structure so ubiquitous that x86 processors have built-in hardware support for it, and most other designs have something similar, either as a dedicated register or as a register usage convention.

So, what is the stack, anyway? It is a region of memory which is set aside for temporary values, combined with a stack pointer (a register specifically used for holding a pointer to the current item in the stack), which is used to act as a last-in, first-out data structure, conceptually resembling a stack of dishes or some other small, easily piled items. The top of the stack (usually the lowest used address in the stack, on x86 CPUs - stacks on PCs usually grow downward, for historical reasons - but I digress) is the memory address where the most recent item has been put. When you add something to the stack - something referred to as pushing the item onto the stack - the stack pointer gets incremented to point to the next address, and the item being pushed is copied to the newly-allocated stack location (actually, with the upside-down stacks usually used, it is usually decremented - subtracted by one - but again that's besides the point). When an item is removed - popped off of the stack - the stack pointer is simply decremented (or incremented), and the used memory is simply left as it is, ready to be reused.

Why is this so important that it has special hardware support? First, because it makes it possible to call functions or subroutines without having to hard-code the return address into the called function - when a function call is made, the address of the instruction following the call is pushed onto the stack (implicitly in the case of the x86, by the CALL instruction, explicitly in many RISC CPUs which use a linkage register for the return address) so that when the callee exits, the return operation (RET on the x86) can pop off the return address and jump to the address of the next instruction in the caller.

Second, it allows a the caller to pass the arguments to the callee by pushing them onto the stack, with the result that the callee can manipulate the arguments without worrying about changing them - the copy pushed onto the stack is treated as being 'local' to the callee. Whether the caller or callee has to clean up the stack afterwards is one of the questions that a calling convention has to answer, though most today leave it to the caller.

Third, when combined with a base pointer (also called a frame pointer), the stack makes for an easy way to hold a group of local variables for the callee function, something called an activation record or stack frame. Basically, the idea is that you push the current base pointer onto the stack, then copy the current stack pointer into the base pointer. This now becomes the base address of the activation record (hence the name 'base pointer'). Then the local values are implicitly 'pushed' onto the stack by decrementing the stack pointer to give enough space on the stack for them. To access the local values, you would use an offset from the base pointer, which would point to the location to set aside for it. When the function exits, you then copy the base pointer back to the stack pointer (implicitly popping off the locals), then pop off the old base pointer, restoring the state to the point of the call - if done correctly, the top of the stack should now point to the return address pushed by the calling instruction.

It is a complicated dance, and one which confuses almost everyone when they first encounter it. Stack discipline in assembly code is fairly tricky, but you probably already knew that.

This brings us to the last kind of memory, the heap or dynamic memory. This a region of memory that can be allocated to programs in pieces, at run time, without being automatically reclaimed when the function that allocated it ends (some languages have a special means of reclaiming dynamic memory after it is no longer in use called garbage collection, but that's not relevant here). Most operating systems have a system call to allow the program to request that the system allocate enough memory to hold one or more values of a given type; it returns a pointer to the newly allocated memory space. There usually is a corresponding system call which allows the program to return the allocated memory to the system. In between allocation and release, the memory becomes available for use, so long as there is at least one pointer to the memory block; you can pass the address of that memory around from one pointer to another, treating it as if it were the same as a pointer to any other non-constant memory. However, once the system releases that memory, the pointers to the memory are all invalid, and should not be used - preferably, you should null them after releasing the memory they point to. When the process exits, the operating system (should) release the memory used, even if it wasn't deallocated, but as long as the program is running, the memory will still be allocated until freed.

There are just three rules to follow with dynamic memory: don't lose the starting address of the memory block, or you won't be able to deallocate later; always remember to deallocate it after you are done with it; and don't try to access the memory after it has been deallocated. These rules are trickier to follow than they sound, however, as it is easy to lose track of where the memory is and how you are using it at any given time. Failing to release memory when it is finished is a memory leak, while a pointer that points to a freed block i(or one that is invalid for some other reason) s a wild pointer.

So, if it is so complicated, why bother with dynamic memory? Because you can make arrays of arbitrary size without knowing how big it has to be beforehand. Also, you can use it to allocate structures with pointers in them, which can then point to other dynamically allocated memory, letting you chain the allocated memory into more complex data structures such as lists and trees. It simply wouldn't be practical to make these structures without dynamic allocation.

This brings us to application memory management. Most of the time, programs do not fetch a new block of memory each and every time they need one; the overhead of system calls is too high, and it potentially can slow the whole system down as the OS manages the allocations. So the usual practice is to allocate larger blocks of memory than is actually needed, and have use that as a kind of scratch space for allocating memory from. The allocations and frees are usually wrapped up in a pair of library functions corresponding to the system calls, and the program only calls the OS again if it needs more. The local allocator has to keep track of the local heap memory, but usually this isn't too complicated.

However, this leaves the programmer to keep track of the actual pointers to the memory, which can be very complicated indeed. For this reason, various approaches for automating memory management have been devised, with garbage collection being the most common (though a related form called 'reference counting' is often used as well, and often confused with it). Basically, in garbage collection, the library functions require the program to get all pointer variables from the allocator, and a part of the program - the garbage collector - checks periodically to see if the blocks of allocated memory still have any pointers pointing at them. When a block has no pointers to it, the allocator frees it for the program, without any action on the part of the programmer. Older garbage collectors had to stop the program in order to do this, but some modern multithreaded ones can do so in parallel to the program's other threads.

While GC is most associated with interpreters, it is applicable to compiled code as well - it was originally developed for the original Lisp 1.5 compiler in 1961. It is possible to use a garbage collector as a library and requiring the programmer to follow the GC discipline, but it is far more common for it to be implemented in the language itself. It has a number of other advantages for both security and simplicity, especially in scripting languages - it allows the language translator to abstract away the allocation and deallocation of variable memory entirely - but there are distinct costs to it as well. There are endless variations on garbage collection meant to reduce the overhead of doing the collection, but a certain amount of overhead is inevitable when garbage collection is used.

Needless to say, many of these issues are relevant to memory allocation at the OS level, as well, but there it becomes even more complicated as the OS-devver has to address issues of the specific memory hardware.

I hope that this has made things clearer for you, rather than more obscure. Comments and corrections welcome.
Thanks for your time to write this, it was helpful. And yeah, it's not "insulting" at all.
User avatar
Schol-R-LEA
Member
Member
Posts: 1925
Joined: Fri Oct 27, 2006 9:42 am
Location: Athens, GA, USA

Re: Automatic memory management

Post by Schol-R-LEA »

At the risk of SSP on my part, you might want to look at these forum posts that explain the way memory is handled on MIPS processors (this was specifically written for the SPIM simulator, but the calling conventions are the same as on a real MIPS CPU). Having something to compare to the x86 may give you a clearer idea of how both work, and as far as I know, it is fairly close to the ARM calling convention, too.

MIPS Assembly Programming
Function calls and stack handling on MIPS
MIPS Assembly stack offset equates
Data strctures in MIPS assembly
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
User avatar
Muazzam
Member
Member
Posts: 543
Joined: Mon Jun 16, 2014 5:59 am
Location: Shahpur, Layyah, Pakistan

Re: Automatic memory management

Post by Muazzam »

Schol-R-LEA wrote:At the risk of SSP on my part, you might want to look at these forum posts that explain the way memory is handled on MIPS processors (this was specifically written for the SPIM simulator, but the calling conventions are the same as on a real MIPS CPU). Having something to compare to the x86 may give you a clearer idea of how both work, and as far as I know, it is fairly close to the ARM calling convention, too.

MIPS Assembly Programming
Function calls and stack handling on MIPS
MIPS Assembly stack offset equates
Data strctures in MIPS assembly
Well, I've a quite good low-level understanding and I know how memory works. But I rarely use high-level language with garbage collection or even C.
User avatar
Schol-R-LEA
Member
Member
Posts: 1925
Joined: Fri Oct 27, 2006 9:42 am
Location: Athens, GA, USA

Re: Automatic memory management

Post by Schol-R-LEA »

Fair enough. I'd still recommend learning at least one of the common calling conventions, though, as they make code interfaces cleaner even in assembly. It is more important in RISC systems, where much of the function call procedure is by convention rather than directly supported by hardware or microcode, so if you are using an ARM based system like Raspberry Pi, it is definitely something you'll need to understand.

EDIT: I just realized I was thinking you had mentioned Raspberry Pi in this thread already, but I guess I was wrong. You mentioned it at some point regarding the self-driving car project, but for some reason I took that to mean you already were using one.
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
Ordo OS Project
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.
Post Reply