How to handle read & write failures in a file system
Sectors that cannot be read (or written) seem like the most complicated issue in a file system, on par with handling removal of the disc device. I don't like the idea of cluttering all of the filesystem code with tests for correctly reading & writing sector data. There is also no straightforward method to propagate a disc read failure to user-mode operations like open, read & write, and besides, I don't want this mess in the file API either.
So, I don't want the disc buffering, the metadata decoding, or even the file chunk reading & writing to need to test for bad sectors. If the disc is attached, and the operation is correct, then it should always return success, regardless of problems in the raw data from the disc drive. My idea is that if a sector is bad (cannot be read), then the disc driver will fire an event with information about operation type, partition, and relative sector. Userspace can record these events and raise alarms if there are too many, so the disc drive can be replaced. The file system server can also handle these events and fix issues in the filesystem to minimize impact. Bad sectors will by default be filled with zeros in the disc cache, but this can be modified by the server.
How do Linux and Windows handle this?
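A minimal sketch of the proposal above, in C, may make the idea concrete. Every name here (`disc_event`, `post_disc_event`, `raw_read_sector`, `cache_fill_sector`) is hypothetical and not taken from any real driver API; the fake driver simply pretends that sector 7 is unreadable.

```c
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512
#define MAX_EVENTS  16

/* Hypothetical event record: operation type, partition, relative sector. */
enum disc_op { DISC_OP_READ, DISC_OP_WRITE };

struct disc_event {
    enum disc_op op;      /* operation that failed */
    int partition;        /* partition index */
    uint64_t rel_sector;  /* sector relative to the partition start */
};

/* Small fixed-size queue standing in for the real event interface. */
static struct disc_event event_queue[MAX_EVENTS];
static int event_count;

static void post_disc_event(enum disc_op op, int partition, uint64_t rel_sector)
{
    if (event_count < MAX_EVENTS) {
        event_queue[event_count].op = op;
        event_queue[event_count].partition = partition;
        event_queue[event_count].rel_sector = rel_sector;
        event_count++;
    }
}

/* Stand-in for the hardware: sector 7 is assumed to be bad. */
static int raw_read_sector(int partition, uint64_t rel_sector, uint8_t *buf)
{
    (void)partition;
    if (rel_sector == 7)
        return -1;
    memset(buf, 0xAB, SECTOR_SIZE);
    return 0;
}

/* Cache fill as proposed: it never fails. A bad sector is replaced
   with zeros and reported through the event queue instead. */
void cache_fill_sector(int partition, uint64_t rel_sector, uint8_t *buf)
{
    if (raw_read_sector(partition, rel_sector, buf) != 0) {
        memset(buf, 0, SECTOR_SIZE);
        post_disc_event(DISC_OP_READ, partition, rel_sector);
    }
}
```

Note that the caller of `cache_fill_sector` has no way to tell a zero-filled bad sector from a sector that really contains zeros; only a consumer of the event queue knows.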
Re: How to handle read & write failures in a file system
rdos wrote: If the disc is attached, and the operation is correct, then it should always return success, regardless of problems in the raw data from the disc drive. [...] Bad sectors will by default be filled with zeros in the disc cache, but this can be modified by the server.
How does an application tell the difference between a successful read of a sector full of zeroes and a failed read?
Re: How to handle read & write failures in a file system
rdos wrote: If the disc is attached, and the operation is correct, then it should always return success, regardless of problems in the raw data from the disc drive. [...] Bad sectors will by default be filled with zeros in the disc cache, but this can be modified by the server.
Octocontrabass wrote: How does an application tell the difference between a successful read of a sector full of zeroes and a failed read?
I think the main question is what an application should do if it gets an unexpected read error on a file. A lot of code would fail the entire operation, possibly aborting some critical application. The other alternative is that it will ignore the error and reuse the previous buffer content. There really aren't many other alternatives, since it cannot guess the contents. By filling the content with zeros you avoid reusing the previous buffer.
Re: How to handle read & write failures in a file system
Linux and Windows just pass the error through the file system and pass it to the user. (It gets more complicated when the error happens after the file is closed; then the error should generally be returned from sync() or similar instead.)
Well-designed applications can handle I/O errors (e.g., common databases), others just die when they encounter errors.
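As a rough illustration of what handling these errors looks like on a POSIX system such as Linux: `read`, `write`, `fsync`, `errno`, `EIO` and `EINTR` are standard, while `read_fully` and `write_durably` are hypothetical helper names made up for this sketch. A device-level failure would surface as a return value of -1 with `errno` set (typically to `EIO`), and errors from delayed writeback are reported by `fsync`.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Returns 0 when len bytes were read, 1 on early end-of-file,
   and -1 on error with errno set (EIO for a device-level failure). */
int read_fully(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;             /* real error: let the caller decide */
        }
        if (n == 0)
            return 1;              /* file ended before len bytes */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Writes the whole buffer, then flushes it. Returns 0 on success,
   -1 on error with errno set. Write errors that only show up during
   delayed writeback are reported by fsync(). */
int write_durably(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return fsync(fd);
}
```

An application that cares (a database, say) can inspect `errno` after a -1 return and retry, fall back to a replica, or shut down cleanly; one that does not care will usually just exit.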
managarm: Microkernel-based OS capable of running a Wayland desktop (Discord: https://discord.gg/7WB6Ur3). My OS-dev projects: [mlibc: Portable C library for managarm, qword, Linux, Sigma, ...] [LAI: AML interpreter] [xbstrap: Build system for OS distributions].
Re: How to handle read & write failures in a file system
Korona wrote: Linux and Windows just pass the error through the file system and pass it to the user. (It gets more complicated when the error happens after the file is closed; then the error should generally be returned from sync() or similar instead.)
That implies the error is in the file data sectors. What if some part of the directory entry that the file resides in is bad? Or some part of the FAT table? The latter errors cannot reasonably be reported to the application as part of open, read or write.
Korona wrote: Well-designed applications can handle I/O errors (e.g., common databases),
Databases are typically not implemented inside file systems.
Take another example. One sector of an executable file is bad. If somebody tries to load the executable, will the load operation itself hinder it from being run? Or will accessing that particular part cause random garbage to execute?
Re: How to handle read & write failures in a file system
rdos wrote: What if some part of the directory entry that the file resides in is bad? Or some part of the FAT table? The latter errors cannot reasonably be reported to the application as part of open, read or write.
Why not? The application doesn't care why the operation failed, just that it did.
rdos wrote: Take another example. One sector of an executable file is bad. If somebody tries to load the executable, will the load operation itself hinder it from being run? Or will accessing that particular part cause random garbage to execute?
If you always return success, garbage will be executed.
If you propagate failures up to the requestor, then whatever requested to put that part of the executable into memory will see the failure. If the loader attempts to put the entire executable into memory before running it, the loader will see the error and fail to start the program. If the executable is loaded as needed using demand paging, the page fault handler will see the error and terminate the program.
Re: How to handle read & write failures in a file system
Why can't metadata errors be reported? Linux and Windows report them just like data I/O errors. Linux returns EIO, Windows returns ERROR_IO_DEVICE (although Windows also has some more specific error codes).
Re: How to handle read & write failures in a file system
rdos wrote: What if some part of the directory entry that the file resides in is bad? Or some part of the FAT table? The latter errors cannot reasonably be reported to the application as part of open, read or write.
Octocontrabass wrote: Why not? The application doesn't care why the operation failed, just that it did.
If the directory entry that contains the file name is bad, then you will get back "file not found". Unless you interpret this as an error and report an IO error, but then this will be reported regardless of which file you try to open. If the FAT is corrupt, then the alternative FAT might be used, and then the application shouldn't get an error, but you still have a serious error condition on the disc that essentially will go unreported. Basically, both of these are conditions that are better reported in a disc event interface rather than through normal file IO.
rdos wrote: Take another example. One sector of an executable file is bad. If somebody tries to load the executable, will the load operation itself hinder it from being run? Or will accessing that particular part cause random garbage to execute?
Octocontrabass wrote: If you always return success, garbage will be executed.
Zeros are not random garbage, and when executed on x86 they will typically result in faults.
Octocontrabass wrote: If you propagate failures up to the requestor, then whatever requested to put that part of the executable into memory will see the failure. If the loader attempts to put the entire executable into memory before running it, the loader will see the error and fail to start the program. If the executable is loaded as needed using demand paging, the page fault handler will see the error and terminate the program.
This is very problematic behavior for embedded systems. In this case you don't know why the program terminated, and so you simply try to restart it in a loop. If the program faults instead, or disc events are reported to another interface, then the supervisor can decide there is some fatal problem somewhere and report this in a more adequate way rather than getting into a reboot loop. Putting up error dialogs (which Windows typically does) is even worse and gives potential customers the idea they are dealing with poor software.
Re: How to handle read & write failures in a file system
Korona wrote: Why can't metadata errors be reported? Linux and Windows report them just like data I/O errors. Linux returns EIO, Windows returns ERROR_IO_DEVICE (although Windows also has some more specific error codes).
I don't think they should be reported to the normal file IO interface, simply because this will overload app code with error checks that will always lead to termination anyway. Not only that, the kernel interface becomes over-complicated by these error checks & propagation too, which leads to slower code. If you instead report these directly from the disc device and distribute them to a specific event interface, then the filesystem code doesn't need to bother and gets less complicated & faster too.
Re: How to handle read & write failures in a file system
rdos wrote: If the disc is attached, and the operation is correct, then it should always return success, regardless of problems in the raw data from the disc drive. [...] Bad sectors will by default be filled with zeros in the disc cache, but this can be modified by the server.
Octocontrabass wrote: How does an application tell the difference between a successful read of a sector full of zeroes and a failed read?
rdos wrote: I think the main question is what an application should do if it gets an unexpected read error on a file. A lot of code would fail the entire operation, possibly aborting some critical application. The other alternative is that it will ignore the error and reuse the previous buffer content. There really aren't many other alternatives, since it cannot guess the contents. By filling the content with zeros you avoid reusing the previous buffer.
The OS should never knowingly pass off known incorrect data as correct.
Critical application? Say your nuclear reactor control program needs to decide what to do next when it gets some dodgy sensor data? Your page full of zeros with no indication of error might just have sent it down the path to a meltdown. At least if your OS returns an error, the application can raise an alert for the operators to manually scram the reactor.
Re: How to handle read & write failures in a file system
thewrongchristian wrote: The OS should never knowingly pass off known incorrect data as correct.
It doesn't do it knowingly. The OS decides to report these errors in another interface, and then makes its best effort to continue to run despite operating with faulty hardware.
thewrongchristian wrote: Critical application? Say your nuclear reactor control program needs to decide what to do next when it gets some dodgy sensor data? Your page full of zeros with no indication of error might just have sent it down the path to a meltdown. At least if your OS returns an error, the application can raise an alert for the operators to manually scram the reactor.
I think a reboot loop is not going to help much in operating the nuclear plant. What actually will happen is that either the program will refuse to start, or it will terminate because of a faulty disc, and then end up in a reboot loop. I don't think reboot loops help a lot in keeping the nuclear plant safe.
Re: How to handle read & write failures in a file system
In a nuclear plant, you'll have multiple redundant systems anyway, so it's actually preferable if one of them enters a reboot loop instead of behaving in an unexpected way.
Re: How to handle read & write failures in a file system
rdos wrote: If the directory entry that contains the file name is bad, then you will get back "file not found". Unless you interpret this as an error and report an IO error, but then this will be reported regardless of which file you try to open. If the FAT is corrupt, then the alternative FAT might be used, and then the application shouldn't get an error, but you still have a serious error condition on the disc that essentially will go unreported. Basically, both of these are conditions that are better reported in a disc event interface rather than through normal file IO.
You know you can have both, right? If there's a read or write error, report it through your disk event interface, and if the error causes file I/O to fail, report it that way too.
This is how Windows and Linux both handle disk errors.
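A small sketch of the "have both" approach, using the mirrored FAT case as the example. This is only an illustration under assumed names: `log_disk_event`, `driver_read`, `read_mirrored`, `FS_OK` and `FS_EIO` are all hypothetical, and the fake driver treats sectors 10, 20 and 120 as unreadable.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FS_OK    0
#define FS_EIO  (-1)   /* hypothetical "I/O error" status */
#define MAX_LOG  8

/* Stand-in for the disk event interface: remembers failed sectors. */
static uint64_t logged[MAX_LOG];
static size_t logged_count;

static void log_disk_event(uint64_t sector)
{
    if (logged_count < MAX_LOG)
        logged[logged_count++] = sector;
}

/* Stand-in driver: sectors 10, 20 and 120 are assumed to be bad.
   Good sectors are filled with the low byte of their own number. */
static int driver_read(uint64_t sector, uint8_t *buf, size_t len)
{
    if (sector == 10 || sector == 20 || sector == 120)
        return -1;
    memset(buf, (int)(sector & 0xFF), len);
    return 0;
}

/* Read a sector that has a mirror copy (like the second FAT).
   Every failed read is logged as an event. The caller only sees
   an error when neither copy could be read. */
int read_mirrored(uint64_t primary, uint64_t backup, uint8_t *buf, size_t len)
{
    if (driver_read(primary, buf, len) == 0)
        return FS_OK;
    log_disk_event(primary);       /* logged even if we recover below */
    if (driver_read(backup, buf, len) == 0)
        return FS_OK;              /* recovered: file I/O succeeds */
    log_disk_event(backup);
    return FS_EIO;                 /* unrecoverable: file I/O fails too */
}
```

With this shape, a corrupt primary copy never goes unreported, and the application still only gets an error when the operation really could not be completed.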
rdos wrote: Zeros are not random garbage, and when executed on x86 they will typically result in faults.
But not always. You need it to always result in a fault. (And what if the zeroes are in the program's data instead of executable code?)
Re: How to handle read & write failures in a file system
An alternative way to handle it is that the event code could examine where the problem is with the help of the filesystem driver, and then it can signal that a certain part of a file's data couldn't be read, and mark this in the user-level file cache. Of course, it could also report this in the event data, so code logging this will know which parts of the disc have problems, including specific files. This would still avoid complicating and slowing down mainstream filesystem code with error checking & error propagation. Actually, the error logger could set a flag that there were recent errors, and buffers would then be handed over to the application doing file IO only after the event code has marked up the problems.
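One way the marking in the user-level file cache could look, as a sketch only. The structure and function names (`file_cache`, `bad_range`, `cache_mark_bad`, `cache_range_is_bad`) are hypothetical, and the step where the event code maps a bad sector to a file offset with the help of the filesystem driver is assumed to have happened already.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BAD_RANGES 8

/* A byte range of the file whose data could not be read. */
struct bad_range {
    uint64_t offset;
    uint64_t length;
};

/* Per-file cache state; only the bad-range bookkeeping is shown. */
struct file_cache {
    struct bad_range bad[MAX_BAD_RANGES];
    size_t bad_count;
};

/* Called by the event code once a bad sector has been mapped to a
   file offset. Returns false if the table is full. */
bool cache_mark_bad(struct file_cache *fc, uint64_t offset, uint64_t length)
{
    if (fc->bad_count == MAX_BAD_RANGES)
        return false;
    fc->bad[fc->bad_count].offset = offset;
    fc->bad[fc->bad_count].length = length;
    fc->bad_count++;
    return true;
}

/* True if the request [offset, offset + length) overlaps any range
   that was marked bad, so the read path can flag it to the caller. */
bool cache_range_is_bad(const struct file_cache *fc,
                        uint64_t offset, uint64_t length)
{
    for (size_t i = 0; i < fc->bad_count; i++) {
        const struct bad_range *r = &fc->bad[i];
        if (offset < r->offset + r->length && r->offset < offset + length)
            return true;
    }
    return false;
}
```

The check runs only at the point where buffers are handed to the application, which is what keeps it out of the lower filesystem layers.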
Re: How to handle read & write failures in a file system
I don't get the "it slows down fs code" argument. It's just checking for a return value and propagating it, right? That's negligible compared to the cost of the syscall anyway. Plus, you probably need to check for other errors in the fs code anyway (e.g., out of disk space).