Shutdown..

Brendan · Post by **Brendan** » Sun Nov 11, 2007 10:05 am

Hi,

Just an idea, but I've been noticing that shutting down a computer takes far too long - especially if there are applications, etc. running. What is the OS doing, and why does it take so long?

I'm thinking when a computer is being turned off, modified documents (if any) may need to be autosaved and some cached file system data may need to be flushed to disk, and nothing else. Most of the time shutdown should happen almost instantly (there's usually very little modified data).

I'm guessing most OSs terminate running processes first and waste a lot of time doing things that aren't necessary (like freeing all memory that the processes were using, terminating threads/processes and cleaning up after them, etc).

For shutdown, the OS should tell the processes that a shutdown is occurring, and then processes should prepare for the shutdown (autosave if necessary and close file handles, but not much else). Each process would tell the OS that it's ready for shutdown, and the OS would make the process' threads block (to avoid terminating the threads/processes while making sure the threads don't consume CPU time). Eventually all processes would be ready for shutdown and the OS would turn the power off (without ever freeing any memory used by processes, etc).

To put this in a different way (using POSIX terminology), to avoid wasting time the OS would need to send something like "SIGSHUTDOWN" rather than sending SIGTERM, because SIGTERM causes a lot of unnecessary work.
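As a rough illustration of the difference, here is a toy Python model (not real signal handling). The `Process` class, its fields and the two handler names are hypothetical; the only point is that the shutdown path saves data and then blocks, while the terminate path also pays for teardown.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """Toy model of a process; all names here are hypothetical."""
    name: str
    dirty_documents: int = 0
    open_files: int = 0
    allocations: int = 0
    log: list = field(default_factory=list)

    def _save_and_close(self):
        # Work that is needed in both cases: nothing may be lost.
        if self.dirty_documents:
            self.log.append(f"autosave {self.dirty_documents} document(s)")
            self.dirty_documents = 0
        if self.open_files:
            self.log.append(f"close {self.open_files} file handle(s)")
            self.open_files = 0

    def on_terminate(self):
        # SIGTERM-style exit: save, then tear everything down.
        self._save_and_close()
        self.log.append(f"free {self.allocations} allocation(s)")
        self.allocations = 0
        self.log.append("destroy threads and exit")

    def on_shutdown(self):
        # "SIGSHUTDOWN"-style: save, report ready, then just block.
        # Memory is never freed because the power is about to go off.
        self._save_and_close()
        self.log.append("report ready, block")


a = Process("editor", dirty_documents=1, open_files=2, allocations=5000)
b = Process("editor", dirty_documents=1, open_files=2, allocations=5000)
a.on_terminate()
b.on_shutdown()
print("terminate:", a.log)
print("shutdown: ", b.log)
```

In this model the shutdown path never touches the 5000 allocations, which is where the saving is supposed to come from.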

Any comments?

Cheers,

Brendan

AJ · Post by **AJ** » Sun Nov 11, 2007 10:28 am

Hi,

I am in absolute agreement with you - all OSes I have used take far too long to shut down, given that you are doing just that - shutting down. Most of them take too long to boot too - but that's a different kettle of fish.

One problem I can see with a signal for shutdown is that app programmers would have to be well behaved and use the new signal. If you are suggesting that it could be incorporated into an existing OS, you would either have to send both sets of signals in case app programmers haven't written new versions of their software, or make handling SIGSHUTDOWN mandatory, in which case a lot of software becomes unusable on the new version of the OS.

If you are just suggesting this for your own OS, I guess the only disadvantage is that you slightly reduce portability between that and POSIX-type systems.

One thing I have noticed, being a Windows user is that the "Closing Network Connections..." stage of shutdown can also take ages. Perhaps clearing resources isn't the only issue here.

Cheers,
Adam

frank · Post by **frank** » Sun Nov 11, 2007 10:57 am

Yeah I have the problem that even if I close all programs it can still take Vista 5 minutes to shut down. Some of the things that I think Vista does when it shuts down are updating the ReadyBoot cache and the ReadyBoost cache. Of course I always hibernate so it usually only takes about 1 - 2 minutes to "shutdown".

On second thought I wouldn't mind my computer taking 10 minutes to shutdown if it booted to a usable desktop in 5 seconds or so, including BIOS time.

Brynet-Inc · Post by **Brynet-Inc** » Sun Nov 11, 2007 11:43 am

On OpenBSD, privileged commands like "reboot" and "shutdown" send a signal to the init(8) process..

According to the OpenBSD manual page for init(8):

init will terminate multi-user operations, kill all getty(8), run /etc/rc.shutdown, and halt the machine if user-defined signal 1 (USR1) or user-defined signal 2 is received. /etc/rc.shutdown can specify that a powerdown is requested. Alternatively, USR2 specifically requests a powerdown.

So all processes are stopped gracefully, file systems are demounted.. and the system stops accepting login requests.

It may seem a bit slow.. but it rarely takes more than a few minutes maximum.

Does the process differ that much on other systems?

Brendan · Post by **Brendan** » Sun Nov 11, 2007 10:00 pm

Hi,

AJ wrote:One problem I can see with a signal for shutdown is that app programmers would have to be well behaved and use the new signal. If you are suggesting that it could be incorporated into an existing OS, you would either have to send both sets of signals in case app programmers haven't written new versions of their software, or make handling SIGSHUTDOWN mandatory, in which case a lot of software becomes unusable on the new version of the OS.

If you are just suggesting this for your own OS, I guess the only disadvantage is that you slightly reduce portability between that and POSIX-type systems.

I'm thinking of something like this for my own OS, where portability doesn't matter (I already do several things that make porting POSIX applications very difficult). I don't use signals - things like SIGSHUTDOWN and SIGTERM would actually be done with normal messages (I mentioned signals as most people are familiar with them).

AJ wrote:One thing I have noticed, being a Windows user is that the "Closing Network Connections..." stage of shutdown can also take ages. Perhaps clearing resources isn't the only issue here.

Is it necessary to close networking connections? For all networking, other computers need to tolerate the sudden disappearance of a computer anyway (network faults, power failures, etc).

Of course some processes may delay shutdown for specific reasons. For example, a text editor might display a "Do you want to save this document" dialog box and wait for the user to reply, and if there's a reason to close some networking connections then that could be done too.

The exact sequence I'm thinking of would go something like this:

1) User selects "shutdown" and the GUI displays a "shutdown, reboot or cancel" dialog box and tells the OS about possible shutdown.

2) Kernel broadcasts a "pre-shutdown warning" message to threads.

3) Threads start sending modified data to disk (but don't close file handles or do anything that can't be reversed).

4) If the user selects "cancel" then nothing else happens. If the user selects "shutdown" or "reboot" the GUI tells the OS to start shutdown. Note: the steps above could be skipped (for e.g. if the user types "shutdown" at the command line, then the command line utility could just tell the OS to start shutdown without any "pre-shutdown warning").

5) Kernel broadcasts a "shutdown requested" message to threads.

6) Threads continue sending modified data to disk (but don't close file handles or do anything that can't be reversed) and do any dialog boxes necessary ("Do you want to save this? [YES | NO | CANCEL]"). Then each thread tells the kernel if it's ready for shutdown, or if shutdown should be aborted (e.g. if the user selected "Cancel" in one of the dialog boxes).

7) If any thread tells the kernel to cancel shutdown then the kernel broadcasts a "shutdown aborted" message to threads, and everything continues running normally. Otherwise, as soon as all threads are ready for shutdown the kernel broadcasts a "shutdown level 1 started" message to all threads. This is the point of no return - there's no way to cancel shutdown past this point.

8) All threads have a "shutdown level". Normal threads have shutdown level 1, file systems might have shutdown level 2, device drivers might be shutdown level 3, etc. When a thread receives a "shutdown level N started" message it completes sending any modified data to disk, closes file handles and does anything else that needs to be done. Then it calls a kernel API function to tell the kernel it's ready for shutdown. The kernel API function decrements the thread's shutdown level, and if the thread's shutdown level reaches zero the thread is blocked (put into a "waiting for shutdown" state).

9) After all threads have called the kernel API function the kernel broadcasts the next "shutdown level N started" message to any threads that are still running, and these remaining threads repeat step 8. This continues until all threads are in the "blocked waiting for shutdown" state. The idea here is to ensure that (for e.g.) the disk drivers don't stop working until after file system code has stopped, and the file system code doesn't stop working until after applications, etc have stopped.

10) Once all threads are in the "blocked waiting for shutdown" state, the kernel turns the computer off (or resets it).
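Steps 8 to 10 can be sketched as a small simulation. This is only a model under assumptions: the `Thread` class and `kernel_shutdown` function are made-up names, the per-thread work (flushing, closing handles) is not modelled, and time-outs are ignored.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    name: str
    shutdown_level: int      # 1 = application, 2 = file system, 3 = driver
    blocked: bool = False


def kernel_shutdown(threads):
    """Run steps 8 to 10; return (level, name) pairs in blocking order."""
    blocked_order = []
    level = 1
    while any(not t.blocked for t in threads):
        # Kernel broadcasts "shutdown level N started" to running threads.
        for t in [t for t in threads if not t.blocked]:
            # The thread flushes data, closes handles, etc. (not modelled),
            # then calls the kernel API, which decrements its level.
            t.shutdown_level -= 1
            if t.shutdown_level <= 0:
                t.blocked = True          # "waiting for shutdown" state
                blocked_order.append((level, t.name))
        level += 1
    # Step 10: every thread is blocked, so the power can go off.
    return blocked_order


threads = [
    Thread("disk driver", 3),
    Thread("file system", 2),
    Thread("text editor", 1),
    Thread("web browser", 1),
]
for level, name in kernel_shutdown(threads):
    print(f"level {level}: {name} blocked")
```

With these example levels the applications block in round 1, the file system in round 2 and the disk driver in round 3, so the disk driver is still running while everything above it flushes.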

Note: Throughout all of this there'd need to be time-outs, etc. If a thread doesn't respond quickly enough (e.g. it's in an infinite loop) the kernel forces it to complete the step. In addition a thread would be able to ask for more time. For example, at step 6, if a thread is waiting for the user it'd need to ask for more time once per second, and if a thread doesn't respond within 2 seconds the kernel will tell itself that the thread is ready for shutdown.
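The time-out rule could look something like the following sketch. It uses a list of timestamped events instead of real timers so it can be run anywhere; `resolve`, the event names and the `"forced_ready"` outcome are all hypothetical.

```python
TIMEOUT = 2.0  # seconds of silence before the kernel gives up on a thread


def resolve(events, timeout=TIMEOUT):
    """Decide how one thread's shutdown step ends.

    `events` is a list of (time_in_seconds, kind) pairs sorted by time, where
    kind is "more_time", "ready" or "abort". Returns (outcome, time).
    """
    deadline = timeout
    for when, kind in events:
        if when > deadline:
            break                      # the thread went quiet for too long
        if kind == "more_time":
            deadline = when + timeout  # extend the deadline, keep waiting
        else:
            return kind, when          # "ready" or "abort" from the thread
    return "forced_ready", deadline    # kernel marks the thread ready itself


# A thread stuck in an infinite loop never answers.
print(resolve([]))
# An editor waiting on a dialog box asks for more time once per second.
print(resolve([(1.0, "more_time"), (2.0, "more_time"), (2.5, "ready")]))
```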

Cheers,

Brendan

Avarok · Post by **Avarok** » Sun Nov 11, 2007 11:33 pm

That sounds terribly complicated - and letting any thread abort a system shutdown without even knowing it's the user? What if malware is the reason you're trying to shut down?

Pre-shutdown also seems unnecessary. It would add a lot of programming work to applications and the OS to shave roughly 1/2 second off the time.

Having shutdown levels sounds complicated. UNIX's have init levels, so maybe merge the two concepts?

IMHO, better yet, if the filesystem can be trusted, have it flush everything that's not synchronized back to disk and just hard shutdown without telling the processes anything and let programmers know that that's what you'll do?

phioust · Post by **phioust** » Sun Nov 11, 2007 11:42 pm

Brendan wrote:Is it necessary to close networking connections? For all networking, other computers need to tolerate the sudden disappearance of a computer anyway (network faults, power failures, etc).

It is a nice "feature" for the os to properly teardown all connections ( force RST ) because for many clients ( ssh, ftp, http ) they can immediately close and in the case of ssh clients return you to your prompt instead of freezing the terminal because the connection timed out and then making you open another connection to your original shell ( which can be a pain if you are sshed through multiple machines )

similarly i would think for servers/applications that are clustered/running in farms, the time spent waiting for connections to timeout fully instead of dying right away would be noticeable

also in situations like point to point vpns with redundancy/multiple endpoints - if one end point goes down and the other has to wait for it to timeout to find out, instead of being able to immediately find another endpoint, that could cause inner client timeouts etc

maybe I beat this point to death but I wanted to give some good examples and I definitely think the time spent resetting connections is worth the downtime
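For reference, one common way to force an RST on close from user space is `SO_LINGER` with a zero timeout. The sketch below is an assumption-laden demo, not what any particular OS does at shutdown: the `"ii"` struct layout matches Linux and the BSDs (other platforms may differ), and `close_with_reset` and `demo` are made-up names.

```python
import socket
import struct


def close_with_reset(sock):
    """Close a TCP socket so that the peer receives RST instead of FIN."""
    # l_onoff=1, l_linger=0: discard unsent data and reset on close.
    # The "ii" layout matches struct linger on Linux and the BSDs.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    sock.close()


def demo():
    """Loopback demo: returns what the client sees after the server resets."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname(), timeout=5)
    server_side, _ = listener.accept()
    close_with_reset(server_side)
    try:
        data = client.recv(1)
        return "closed normally" if data == b"" else "got data"
    except ConnectionResetError:
        return "connection reset"
    finally:
        client.close()
        listener.close()


if __name__ == "__main__":
    print(demo())
```

The client learns about the reset as soon as the RST segment arrives, rather than after its own time-out expires.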

Brendan · Post by **Brendan** » Mon Nov 12, 2007 2:11 am

Hi,

Brynet-Inc wrote:It may seem a bit slow.. but it rarely takes more than a few minutes maximum.

Does the process differ that much on other systems?

I think all OSs are essentially similar.

frank wrote:Yeah I have the problem that even if I close all programs it can still take Vista 5 minutes to shut down. Some of the things that I think Vista does when it shuts down are updating the ReadyBoot cache and the ReadyBoost cache.

Five minutes! I started this forum topic because I was annoyed at my Windows XP machine after it spent about 30 seconds doing nothing important while shutting down...

frank wrote:Of course I always hibernate so it usually only takes about 1 - 2 minutes to "shutdown".

IMHO the reason for "hibernate" is that OSs are crap and take far too long to boot and shutdown (if an OS booted in 3 seconds and shutdown in one second, then why would you bother supporting hibernate?).

Hibernate, ReadyBoot and ReadyBoost are all technologies that increase bloat (and that are meant to reduce some of the negative effects of increased bloat). Try an experiment - install Windows 95 on your computer and see if boot and shutdown are faster without all these "advanced" technologies...

frank wrote:On second thought I wouldn't mind my computer taking 10 minutes to shutdown if it booted to a usable desktop in 5 seconds or so, including BIOS time.

I still remember the Commodore 64 - it booted in less than 1 second and shutdown was instantaneous. In 20 years time are we going to be standing around our kitchen appliances saying "if this microwave oven only took 5 seconds to start up I wouldn't mind if it took 10 minutes to turn off"?

I tested my machine (Pentium 4 with Windows XP) - during boot it takes 10 seconds for the BIOS, another 20 seconds to get to the login screen, another second to get to a desktop (ie. you can see the desktop) and then another 4 seconds to get to a usable desktop (ie. the hourglass disappears). For shutdown (immediately after a fresh boot, where there should be no modified data to save) it took 5 seconds to turn power off.

I tried shutdown again, but this time I started 3 web browser windows (Internet Explorer with a variety of web sites), one Adobe Reader window (with a PDF file from Intel's manuals), a small Word document, a notepad window and a few folders. This was meant to represent typical working conditions (but there should still be no modified data to save during shutdown - I didn't edit anything, copy any files, change any options/settings, etc). This time it took 20 seconds to shut down the system - just having those applications, etc running added 15 seconds to the shutdown time.

I'm guessing Windows may have needed to load the shutdown code from disk and would've needed to write to a few sectors of the disk as part of unmounting file systems (so it remembers that the OS was shutdown and doesn't do file system checks the next time it boots), but otherwise I could've unplugged the computer from the power socket in the wall ("instant off") and it wouldn't have mattered. Basically, for both these cases shutdown should've taken a fraction of a second.

Even with cached data (and things like ReadyBoost?), you'd expect the OS to write data to disk during idle time so that there's still very little to do during shutdown (and so there's less chance of data loss in case of power failures, etc). For this computer, 20 seconds is probably enough time to transfer the entire contents of RAM to the hard drive (512 MB of RAM at about 50 MB/second).
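The arithmetic behind that last estimate, using only the figures quoted above (512 MB of RAM, about 50 MB/second, 20 seconds of shutdown):

```python
ram_mb = 512            # RAM size quoted above
disk_mb_per_s = 50      # approximate disk write speed quoted above
observed_s = 20         # shutdown time measured with applications open

full_dump_s = ram_mb / disk_mb_per_s
writable_mb = observed_s * disk_mb_per_s

print(f"writing all of RAM: {full_dump_s:.1f} s")            # 10.2 s
print(f"observed shutdown:  {observed_s} s")
print(f"data writable in that time: {writable_mb} MB")       # 1000 MB
```

So at that transfer rate the whole of RAM would fit in roughly half of the observed shutdown time.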

Cheers,

Brendan

Brendan · Post by **Brendan** » Mon Nov 12, 2007 2:15 am

Hi,

Avarok wrote: That sounds terribly complicated - and letting any thread abort a system shutdown without even knowing it's the user? What if malware is the reason you're trying to shut down?

If it was malware you'd kill the offending process rather than shutting down the system. It would also be possible to add a "shutdown or else" option where threads can't abort the shutdown.

Avarok wrote:Pre-shutdown also seems unnecessary. It would add a lot of programming work to applications and the OS to shave roughly 1/2 second off the time.

Pre-shutdown is optional - if a thread ignores it (doesn't support pre-shutdown at all) then it won't matter.

Half a second may seem pointless for systems that take ages to shutdown (for e.g. there's not much difference between 30 seconds and 30.5 seconds). I'm hoping shutdown for my OS will take less than one second so it would make a noticeable difference (e.g. half a second compared to a full second).

Avarok wrote:Having shutdown levels sounds complicated.

Correctly shutting down is complicated. For example, imagine that during boot the OS does "mount /dev/hda /foo" then "mount /foo/myDiskImage /bar", and you shut down while a process is writing to "/bar/someFile.txt". In this case you need to wait for "/bar/someFile.txt" to be written, then wait for caches to be flushed for "/bar", then wait for caches to be flushed for "/foo", then flush the disk driver's caches. You can't do this in the wrong order without losing data. For example, if you unmount "/foo" before you unmount "/bar", then "/bar" won't be able to flush its data.

This implies that the OS must have some method of determining the order in which processes should be shut down, but this is easy to automate. Device drivers start with the highest shutdown level, and normal applications start with the lowest shutdown level. When ProcessA mounts a file or device handled by ProcessB, then the OS makes sure that ProcessA's shutdown level is lower than ProcessB's shutdown level, and if it isn't you'd do "ProcessA.shutdownLevel = ProcessB.shutdownLevel - 1".

That way the shutdown levels are always in a suitable order, you don't need messy scripts to control things, and you can have many many levels (e.g. A mounts B which mounts C which mounts D which mounts E and so on, for several billion levels).
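A minimal sketch of that rule, assuming the mount relationships form a tree or at least contain no cycles. `Proc`, `on_mount`, `_fix_levels` and the starting level of 100 are invented for the example; the one addition to the rule as stated is that lowering a process' level is propagated to whatever has already mounted it.

```python
class Proc:
    def __init__(self, name, shutdown_level):
        self.name = name
        self.shutdown_level = shutdown_level
        self.clients = []      # processes that have mounted something of ours


def on_mount(client, provider):
    """Record that `client` mounted a file or device handled by `provider`."""
    provider.clients.append(client)
    _fix_levels(client, provider)


def _fix_levels(client, provider):
    # The client must stop before the provider, so it needs a lower level.
    if client.shutdown_level >= provider.shutdown_level:
        client.shutdown_level = provider.shutdown_level - 1
        # Lowering the client may break the rule for whatever mounted *it*.
        for c in client.clients:
            _fix_levels(c, client)


disk = Proc("disk driver", 100)
fs_foo = Proc("file system for /foo", 100)
fs_bar = Proc("file system for /bar", 100)
editor = Proc("text editor", 0)

on_mount(fs_bar, fs_foo)    # mount /foo/myDiskImage /bar
on_mount(fs_foo, disk)      # mount /dev/hda /foo (registered late on purpose)
on_mount(editor, fs_bar)    # editor opens /bar/someFile.txt

for p in (disk, fs_foo, fs_bar, editor):
    print(p.shutdown_level, p.name)
```

Even though the two mounts are registered in the "wrong" order here, the levels end up as 100, 99, 98 and 0, so the editor stops first and the disk driver last.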

Avarok wrote:UNIX's have init levels, so maybe merge the two concepts?

UNIX init levels wouldn't change much. If you're switching to runlevel 0 (halt) or runlevel 6 (reboot) from any other runlevel, then it'd still be good to do it efficiently...

Avarok wrote:IMHO, better yet, if the filesystem can be trusted, have it flush everything that's not synchronized back to disk and just hard shutdown without telling the processes anything and let programmers know that that's what you'll do?

I'm mostly looking at "polite shutdown without data loss". For e.g. if someone was using a text editor and forgot to save their data, then during shutdown the text editor can ask them what to do with the modified data instead of losing the data.

Cheers,

Brendan

Avarok · Post by **Avarok** » Mon Nov 12, 2007 3:03 am

Heh... flushing cached files to disk would save any text documents you have open. In your example above; what really would happen is that /foo/myDiskImage (is an image, not a disk?) is actually a file on /dev/hda. It would therefore only have a single file marked "dirty", which is that image which we're clearly just modifying. This would be in memory already.

When the system goes down, it just tells the filesystem to flush() everything to disk, and it makes that change.

In fact, you'd have to let programmers know that's what you're doing there so they'd write back to the same file in memory ONLY if they could accept it being flushed that way.
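For what the flush() idea looks like from the application side on a POSIX-like system, here is a small sketch using real Python calls (`file.flush` and `os.fsync`); the helper name `write_durably` and the file name are made up. It only covers data that a program has actually written to a file.

```python
import os
import tempfile


def write_durably(path, data):
    """Write bytes and do not return until the OS reports them on disk."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # user-space buffer -> kernel page cache
        os.fsync(f.fileno())   # kernel page cache -> storage device


path = os.path.join(tempfile.mkdtemp(), "notes.txt")
write_durably(path, b"saved before power-off\n")
print(os.path.getsize(path), "bytes on disk")
```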

Brendan · Post by **Brendan** » Mon Nov 12, 2007 4:00 am

Hi,

Avarok wrote:Heh... flushing cached files to disk would save any text documents you have open. In your example above; what really would happen is that /foo/myDiskImage (is an image, not a disk?) is actually a file on /dev/hda. It would therefore only have a single file marked "dirty", which is that image which we're clearly just modifying. This would be in memory already.

When the system goes down, it just tells the filesystem to flush() everything to disk, and it makes that change.

In fact, you'd have to let programmers know that's what you're doing there so they'd write back to the same file in memory ONLY if they could accept it being flushed that way.

Um...

The text editor opens the file as "read only", reads it into memory and closes the file. Then the user edits the file in RAM. At this point you've got modified data in RAM and no open files.

Now the user wants to shutdown the OS. The disk driver has no modified data and the file system code has no modified data.

The text editor asks the user what to do with the modified text file in RAM. The user might want to save the file as a different file name, or they might not want the changes saved at all.
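The scenario can be reproduced with a toy editor. The `Editor` class and its method names are hypothetical; `os.sync` is a real call on Unix-like systems and is skipped where it does not exist.

```python
import os
import tempfile


class Editor:
    """Toy editor; the class and method names are hypothetical."""

    def __init__(self, path):
        with open(path, "r", encoding="ascii") as f:   # opened read-only
            self.text = f.read()
        # The file is closed again here; only the copy in RAM remains.
        self.path = path
        self.modified = False

    def type_text(self, extra):
        self.text += extra
        self.modified = True

    def on_shutdown_requested(self):
        # Only the editor knows there is unsaved work, so it has to ask.
        if self.modified:
            return "ask user: save, save as, or discard?"
        return "ready"


path = os.path.join(tempfile.mkdtemp(), "letter.txt")
with open(path, "w", encoding="ascii") as f:
    f.write("Dear Sir,\n")

ed = Editor(path)
ed.type_text("I resign.\n")

if hasattr(os, "sync"):
    os.sync()                  # flush every file system cache

with open(path, encoding="ascii") as f:
    on_disk = f.read()
print(repr(on_disk))           # still the original text
print(ed.on_shutdown_requested())
```

Flushing the file system changes nothing on disk, because the modified text never reached the file system in the first place.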

Cheers,

Brendan

Candy · Post by **Candy** » Mon Nov 12, 2007 6:09 am

Brendan wrote:Correctly shutting down is complicated. For example, imagine that during boot the OS does "mount /dev/hda /foo" then "mount /foo/myDiskImage /bar", and you shut down while a process is writing to "/bar/someFile.txt". In this case you need to wait for "/bar/someFile.txt" to be written, then wait for caches to be flushed for "/bar", then wait for caches to be flushed for "/foo", then flush the disk driver's caches. You can't do this in the wrong order without losing data. For example, if you unmount "/foo" before you unmount "/bar", then "/bar" won't be able to flush its data.

This implies that the OS must have some method of determining the order in which processes should be shut down, but this is easy to automate. Device drivers start with the highest shutdown level, and normal applications start with the lowest shutdown level. When ProcessA mounts a file or device handled by ProcessB, then the OS makes sure that ProcessA's shutdown level is lower than ProcessB's shutdown level, and if it isn't you'd do "ProcessA.shutdownLevel = ProcessB.shutdownLevel - 1".

That way the shutdown levels are always in a suitable order, you don't need messy scripts to control things, and you can have many many levels (e.g. A mounts B which mounts C which mounts D which mounts E and so on, for several billion levels).

Tree, breadth first search, post-visit destruct. Algorithm done.
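One possible reading of that one-liner, as a sketch: walk the mount tree breadth first from the root (the disk driver), then destroy in the reverse of the visit order, so every node is handled after everything mounted on top of it. The `shutdown_order` function and the example tree are assumptions, and the input is assumed to really be a tree.

```python
from collections import deque


def shutdown_order(tree, root):
    """Return nodes so that each node comes after everything mounted on it."""
    visited = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        visited.append(node)
        queue.extend(tree.get(node, []))
    # BFS lists a parent before its children; reversing puts children first.
    return list(reversed(visited))


mounts = {
    "disk driver": ["fs /foo"],
    "fs /foo": ["fs /bar"],
    "fs /bar": ["text editor"],
}
print(shutdown_order(mounts, "disk driver"))
# ['text editor', 'fs /bar', 'fs /foo', 'disk driver']
```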

Avarok · Post by **Avarok** » Mon Nov 12, 2007 6:40 am

Sure, in Windows or Linux maybe. Reasonable intelligence [such as the stuff I know you possess] however would suggest that if we're modifying a file in an editor that we want to save later as well, and that we need the data in RAM and hey! we have it in RAM already. Why not just map the file as it stands directly into the process and run with it?

Perhaps I got the wrong idea when you suggested you were trying to make startup and shutdown faster when I thought that meant you were considering the whole grand inefficiency of the beast.

Brendan · Post by **Brendan** » Mon Nov 12, 2007 9:17 am

Hi,

Avarok wrote: Sure, in Windows or Linux maybe. Reasonable intelligence however would suggest that if we're modifying a file in an editor that we want to save later as well, and that we need the data in RAM and hey! we have it in RAM already. Why not just map the file as it stands directly into the process and run with it?

How the data gets into RAM makes no difference - it's how modified data gets saved back to disk that matters.

For memory mapped files, you couldn't map the file as "read/write" (and destroy the original file regardless of whether the user wants to or not). You could map the file as "read only" and use something like "copy on write", but the file system only knows the original file name and doesn't know what file name the modified data should be saved as.
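Python's `mmap` module happens to expose this kind of copy-on-write mapping as `mmap.ACCESS_COPY`, which makes the behaviour easy to see. The file name and contents below are made up.

```python
import mmap
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "original.txt")
with open(path, "wb") as f:
    f.write(b"hello world\n")

with open(path, "rb") as f:
    # ACCESS_COPY: writes change the private in-memory copy only.
    view = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY)
    view[0:5] = b"HELLO"
    in_ram = view[:]
    view.close()

with open(path, "rb") as f:
    on_disk = f.read()

print(in_ram)    # b'HELLO world\n'
print(on_disk)   # b'hello world\n'
```

The edit exists only in the process' private pages; nothing in the mapping says which file name, if any, it should eventually be written to.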

Also, the data in RAM may not be in the same format as the original file on disk. For example, the text editor might convert the data from ASCII to UTF-32.

Lastly, when the file is saved it might be saved in a completely different format to the format used by the original file or the format used in RAM. For example, Windows Notepad's "Save As" dialog box offers ANSI, Unicode, Unicode big endian and UTF-8 as possible options. Kwrite's "Save As" dialog box offers almost 50 different output formats.
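A quick illustration of how the three representations can differ, using an invented five-character file. The byte-order marks that an editor might prepend when saving are left out.

```python
on_disk = b"Hello"                     # original file: ASCII, 5 bytes
text = on_disk.decode("ascii")
in_ram = text.encode("utf-32-le")      # editor's internal buffer, 20 bytes

saved = {
    "UTF-8": text.encode("utf-8"),
    "UTF-16 little endian": text.encode("utf-16-le"),
    "UTF-16 big endian": text.encode("utf-16-be"),
}

print(len(on_disk), "bytes on disk,", len(in_ram), "bytes in RAM")
for name, data in saved.items():
    print(f"{name}: {len(data)} bytes, same as original: {data == on_disk}")
```

Only the UTF-8 output matches the original bytes here, and none of the save formats match the in-RAM buffer, so the file system could not write the editor's memory back to disk on its own.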

Cheers,

Brendan

exkor · Post by **exkor** » Mon Nov 12, 2007 11:22 am

Brendan wrote: I'm mostly looking at "polite shutdown without data loss". For e.g. if someone was using a text editor and forgot to save their data, then during shutdown the text editor can ask them what to do with the modified data instead of losing the data.

The most polite shutdown imho is the one that Firefox has. It closes windows and tabs, and when you start it next time it asks if you want to restore the previous windows. This kind of shutdown works any time you don't click the close button.
I have some programs (network-related ones that keep a constant connection with a server). They show a close-window dialog before the computer reboots. Well, I never manage to push those buttons in time... but if I did, it would mean a slow shutdown.
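The general idea behind that kind of session restore can be sketched as below. This is not how Firefox actually implements it; `save_session`, `restore_session`, the JSON layout and the URLs are all invented for illustration.

```python
import json
import os
import tempfile


def save_session(path, tabs):
    """Snapshot the open tabs; a half-written file never replaces a good one."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump({"tabs": tabs}, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)   # rename over the previous snapshot


def restore_session(path):
    """Return the tabs from the last snapshot, or an empty list."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)["tabs"]
    except FileNotFoundError:
        return []


session_file = os.path.join(tempfile.mkdtemp(), "session.json")
save_session(session_file, ["https://example.org/a", "https://example.org/b"])
# ... the machine is switched off without the program being told ...
print(restore_session(session_file))
```

If the snapshot is taken whenever the set of tabs changes, the program has nothing left to do at shutdown and never needs to show a dialog.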