Single source file

Programming, for all ages and all languages.
Kevin
Member
Posts: 1071
Joined: Sun Feb 01, 2009 6:11 am
Location: Germany
Contact:

Re: Single source file

Post by Kevin »

Antti wrote:Are you sure that having source dependencies like this
[...]
will make that particular unit modular?
No, but that's like hitting the table with the hammer instead of the nail in the wall, and then claiming that a hammer obviously isn't the right tool to drive the nail into the wall.

The reverse is true: if you take away the possibility of using multiple source files, even a well-designed C program will lose its modularity. That's how the language works.
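To illustrate (a minimal sketch with made-up file names, not the code from Antti's example): in C the module boundary *is* the translation unit plus its header, so merging everything into one file erases the boundary.

Code: Select all
/* Three separate files shown in one listing -- illustrative only.          */

/* counter.h -- the module's public interface */
#ifndef COUNTER_H
#define COUNTER_H
void counter_increment(void);
int  counter_value(void);
#endif

/* counter.c -- the implementation; 'count' is invisible outside this file */
#include "counter.h"
static int count;                        /* internal linkage */
void counter_increment(void) { count++; }
int  counter_value(void)     { return count; }

/* main.c -- a client that only ever sees the header */
#include <stdio.h>
#include "counter.h"
int main(void)
{
    counter_increment();
    printf("%d\n", counter_value());     /* prints 1 */
    return 0;
}

Merge all the .c files into one and every 'static' in the project shares a single namespace - which is exactly the modularity that gets lost.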
I already said that the Linux kernel is too big to be considered a reasonable unit. KDE and LibreOffice are not "one-executable compatible". If the program is too big, then it should be split into smaller programs (most of them are split already). Each of these units could be a single source file. They would not have source-code-level dependencies on each other (like the example code above).
I'm not familiar enough with these projects, but I assume that apart from the obvious split (let's talk just about oowriter instead of LibreOffice), splitting the functionality into smaller separate executables would become tricky if not impossible (and involve lots of IPC).
Antti wrote:And yes, Kevin. Your previous post is quite tough and I could not invalidate all the points you made. It is sad that current conventions are based on a foundation that requires so many hacks to create, e.g., portable programs.
There's a simple fact behind it that explains why it's necessary: Humans mess up. They do, and you'll have a hard time changing that.
Developer of tyndur - community OS of Lowlevel (German)
SDS
Member
Posts: 64
Joined: Fri Oct 23, 2009 8:45 am
Location: Cambridge, UK

Re: Single source file

Post by SDS »

This is a discussion about the granularity of representing your project. The answer is obviously going to depend on the scale of your project.

For example: is 'modularity' a logical description of the way you structure your source code, or is it a real description of how your code runs? (In an OS context, if you want to load drivers at runtime they are distinct from the core of the kernel - but they may still be part of the same project.)

Do you have sections of code which are logically the same in multiple output contexts (e.g. drivers, or modules of an application), which need to be shared?

Permitting (although not enforcing) multiple files within a project enables a great deal of flexibility about how you map the logical granularity of your source code to the practical outputs. If the only thing you are interested in is one fairly simple executable that doesn't link up to anything else then go ahead and use a single file.

I have several repositories with 'independent' code that are all part of the same 'project' and get pulled to subdirectories of my working directory. The way that things connect up at build time is complex, but that is a fairly efficient description of the complexity associated with the problem while avoiding duplication.
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!
Contact:

Re: Single source file

Post by Brendan »

Hi,
Kevin wrote:
Antti wrote:Are you wasting your time waiting for the actual compiling, or for the highly frequent starting and stopping of the compiler?
Spawning processes certainly does take some time. However, what you're really waiting for is still the actual compiling.
There's the overhead of checking which pieces need to be updated (make), plus the overhead of spawning processes (and doing dynamic linking, etc.), plus the overhead of finding all the files ("open()" and seek times), plus the overhead of destroying processes. It adds up. Compiling is fast (e.g. for some languages it's limited by disk IO speed). What takes time isn't compiling, but optimising.

However; when you want to optimise, the "multiple files" method makes it impossible for the compiler to do it properly (whole program optimisation) and you end up with poor quality code. In an attempt to solve this all modern tool chains support some form of "link time optimisation"; but if you ask anyone that's used this they'll tell you it's extremely slow (especially for large projects) and that (because the compiler discards a lot of useful information before LTO occurs) it doesn't really help as much as it should.
Kevin wrote:
Like handling dependencies and rebuilding only the "one hundred" source files that depend on the header file I am modifying, instead of all "two hundred". It solves a problem that we should not need to have at all.
You mean because we could just always compile the one whole source file that was created by merging the 200 files - so we don't even have to think about compiling only the half of it because it isn't possible any more?
You're still failing to see anything beyond the lame and broken existing tools that were designed to solve ancient resource constraints (not enough RAM to have compiler and all source code in memory at once) instead of being designed to do the job properly. For existing tool-chains; object files are used as a kind of half-baked cache to store the "pre whole program optimisation" results of compiling individual files (note: "half-baked" due to the failure to make it transparent and the inability to cache more than one version of each object file). With a tool-chain designed for "single source file"; there's no reason why you can't cache "pre whole program optimisation" results with much finer granularity (e.g. individual functions rather than whole "object files"). This can be done properly; so that it's transparent to the user (e.g. avoids the need for external "cache management tools" like 'make'), and it can also handle multiple versions simultaneously. In this way if someone only changes a few functions you'd compile less with single-source than you would've with multiple source files; and if you regularly build one version with debugging and one without (or several versions with different options) you'd prevent a massive amount of recompiling each time you change from one configuration to another.
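To sketch what I mean (purely illustrative - a hypothetical cache, not an existing tool): the cache key would be something like (hash of the function's source text, hash of the build options), so a debug build and an -O2 build of the same function simply live side by side in the cache:

Code: Select all
/* Hypothetical per-function compile cache -- an illustration of the idea,
 * not a real tool's API.  Keying on (function text, options) lets several
 * configurations of the same function coexist.                             */
#include <stdint.h>
#include <stdio.h>

static uint64_t fnv1a(const char *s)               /* tiny content hash */
{
    uint64_t h = 1469598103934665603ULL;
    while (*s) { h ^= (unsigned char)*s++; h *= 1099511628211ULL; }
    return h;
}

struct cache_entry {
    uint64_t func_hash;       /* hash of the function's source text    */
    uint64_t opts_hash;       /* hash of the build options, e.g. "-O2" */
    const char *result;       /* cached "pre whole program optimisation" output */
};

static struct cache_entry cache[256];
static int cache_len;

static const char *lookup(uint64_t f, uint64_t o)
{
    for (int i = 0; i < cache_len; i++)
        if (cache[i].func_hash == f && cache[i].opts_hash == o)
            return cache[i].result;
    return NULL;                                    /* miss: recompile this one function */
}

int main(void)
{
    const char *fn = "int add(int a, int b) { return a + b; }";
    cache[cache_len++] = (struct cache_entry){ fnv1a(fn), fnv1a("-O2"), "add@O2" };

    puts(lookup(fnv1a(fn), fnv1a("-O2")) ? "hit" : "miss");   /* hit  */
    puts(lookup(fnv1a(fn), fnv1a("-Os")) ? "hit" : "miss");   /* miss - but the -O2 entry survives */
    return 0;
}

Change one comment in one function and only that function's hash changes; everything else is a cache hit, with no external "make" needed.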
Kevin wrote:But you do know that these 6000 lines in the configure scripts aren't all nops? A typical configure script does much more, checking which libraries and functions are available, testing for platform-dependent behaviour etc. and are an important part of making programs portable?
Agreed - existing tools were "evolved" by morons that can't decide on "standard standards", resulting in 6000 line configure scripts (and hideous hack-fests like "autoconf") to work around the insanity. Any excuse to throw this puke out and establish a "standard standard" that avoids the need for "6000 lines of evidence of poor tool design" is a good excuse. Note that this only applies for some tool-chains - more modern tool-chains (e.g. Java, .NET) do have effective standards and don't have this problem.
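And to be clear about what those 6000 lines actually produce: usually just a generated config.h full of HAVE_* macros that the C code consumes roughly like this (a generic illustration - the macro name and the sigaction check are only an example, not taken from any particular project):

Code: Select all
/* What a configure run boils down to: a generated config.h containing lines
 * like "#define HAVE_SIGACTION 1", consumed by the code roughly like this.  */
#include <signal.h>
#include <stdio.h>

/* Normally: #include "config.h"  (generated by configure, not hand-written) */
#define HAVE_SIGACTION 1             /* pretend configure detected sigaction() */

static void on_interrupt(int sig) { (void)sig; }

static void install_handler(void)
{
#ifdef HAVE_SIGACTION
    struct sigaction sa = {0};       /* the POSIX interface, if available */
    sa.sa_handler = on_interrupt;
    sigaction(SIGINT, &sa, NULL);
#else
    signal(SIGINT, on_interrupt);    /* fallback for platforms without it */
#endif
}

int main(void)
{
    install_handler();
    puts("handler installed");
    return 0;
}

A "standard standard" would make the whole detection step unnecessary, because there'd be exactly one answer to each of those questions.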
Kevin wrote:
Antti wrote:Then we have this IDE with "a tree-like view of the program" and it is easy to see the big picture.
You've never worked on a big project, have you? Can you imagine what the tree-like view of the Linux kernel would look like? Or because you like to say that kernels are different, let's take qemu, with which I'm more familiar anyway. (And that's still the kind of projects that compile in a few minutes, not hours or days like X, KDE or LibreOffice.)
Let's forget about software; and think about building trucks. There are many kinds of trucks; varying by size, purpose and payload (e.g. flat-bed vs. tanker vs. refrigerated vs...). If you were a company building trucks; you'd identify common "truck pieces" (steering wheel, CD player, seats, engine, fuel tanks, chassis, etc) and establish a set of standards that describe how these pieces fit together; and then design several alternatives for each piece. That way, a customer can come to you asking for a truck, and choose which of your pieces to combine to create their truck - maybe a diesel engine with an automatic gearbox, red seats with electronic adjustment and a flat-bed on the back; or maybe a natural gas engine with a 6-speed manual gearbox, cheaper seats and a "refrigerated box" on the back; or any combination of many possible pieces.

Now let's think about software. In fact, let's think of building an emulator like Qemu; but (just for a few minutes) assume we aren't typical code monkeys and are capable of *design*. We can start by identifying the types of pieces - we'd need a "CPU emulator" piece, a "chipset" piece, several different types of "PCI device" pieces, etc. Then we can design a set of standards that describe how these pieces fit together (I like my modules to communicate via asynchronous message passing, but you can use shared library "plug ins" if you like). Now that we've got usable standards for how pieces fit together; I can write a "CPU emulator" piece that interprets instructions (like Bochs), you can write a "CPU emulator" piece that does JIT (like Qemu) and someone else can write a "CPU emulator" piece that uses Intel's "VT-x" hardware virtualisation. Someone else might write a "generic ISA" chipset piece, someone might write an "Intel 945 chipset" piece and someone else might write an "AMD opteron chipset" piece. More people might create a "SATA controller" piece, a "realtek ethernet card" piece, an "OHCI USB controller" piece, etc.
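As a (purely hypothetical) sketch of what one of those standards could look like in the "shared library plug in" style - none of these names exist anywhere, it's just to show how thin the interface needs to be - a "CPU emulator" piece might only have to export a table of entry points:

Code: Select all
/* Hypothetical interface for a "CPU emulator" piece -- illustrative only.
 * Each implementation (interpreter, JIT, VT-x) ships as a plug-in that
 * exports one instance of this table; the VM manager finds it with dlsym(). */
#include <stdint.h>

struct guest_memory;                         /* provided by the "chipset" piece */

struct cpu_emulator_piece {
    const char *name;                        /* "interpreter", "jit", "vtx", ... */
    int  (*init)(struct guest_memory *mem);
    int  (*run)(uint64_t max_instructions);  /* returns an exit reason */
    void (*raise_irq)(int vector);
    void (*shutdown)(void);
};

/* Every plug-in exports exactly this symbol. */
extern const struct cpu_emulator_piece cpu_piece;

Swap the plug-in, keep the standard, and the rest of the emulator never notices.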

The other thing we'd need is a "virtual machine management" piece for starting/stopping virtual machines and configuring them. To configure a virtual machine, you could have a set of icons representing all the different pieces of virtual hardware that have been installed, and the user could drag these icons into a work area to add the virtual hardware to their virtual machine.

Of course no single entity (company, person, group) would be responsible for implementing an entire emulator - it would be many different people that have each implemented one or more individual pieces that fit together. Some of the pieces might be open source, some might be closed source, they can be written in different languages, etc.

It doesn't have to be an emulator though. Something like "LibreOffice" could be a "spreadsheet front-end" piece, a "word processor front-end" piece, a "spell checker" piece, an "expression evaluator" piece, etc; where several competing versions of all of these pieces are written by different people and are inter-changeable.

The point is programmers shouldn't be writing applications to begin with. Designers should be designing standards for "pieces that work together" and programmers should be implementing these "pieces that work together".

If big projects are a problem for "single source file" (which I doubt); then that would be a good thing anyway because big projects shouldn't exist to begin with.


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Kevin
Member
Posts: 1071
Joined: Sun Feb 01, 2009 6:11 am
Location: Germany
Contact:

Re: Single source file

Post by Kevin »

Brendan wrote:Compiling is fast (e.g. for some languages it's limited by disk IO speed). What takes time isn't compiling, but optimising.
Okay, I agree. I oversimplified and called the whole thing "compiling", from reading the source file to linking the final binary. This was a bit sloppy, but I think it's still what Antti was asking for.
In an attempt to solve this all modern tool chains support some form of "link time optimisation"; but if you ask anyone that's used this they'll tell you it's extremely slow (especially for large projects)
[...]
You're still failing to see anything beyond the lame and broken existing tools that were designed to solve ancient resource constraints (not enough RAM to have compiler and all source code in memory at once) instead of being designed to do the job properly.
Right, so first of all you are failing to see that I'm replying to Antti who was talking about C, and not about your hypothetical OS that will probably never become reality.

The other thing is that you seem to have missed the connection here: LTO is slow because you don't only need enough RAM to hold the compiler and all the source code (which is already a lot for large projects), but also all of the temporary data used for optimisation, and the time complexity of most optimisations doesn't grow linearly either. Which in effect means that your resource constraints are very real and not ancient at all.

In fact, compiling the 8 KLOC source file target-i386/translate.c in qemu has more than once caused trouble because of the resource requirements when compiling with -O2. Without LTO. I don't even want to think about what would happen if you tried to compile all of qemu at once.
For existing tool-chains; object files are used as a kind of half-baked cache to store the "pre whole program optimisation" results of compiling individual files (note: "half-baked" due to the failure to make it transparent and the inability to cache more than one version of each object file). With a tool-chain designed for "single source file"; there's no reason why you can't cache "pre whole program optimisation" results with much finer granularity (e.g. individual functions rather than whole "object files"). This can be done properly; so that it's transparent to the user (e.g. avoids the need for external "cache management tools" like 'make'), and it can also handle multiple versions simultaneously. In this way if someone only changes a few functions you'd compile less with single-source than you would've with multiple source files;
Single-source means that you at least have to read in the complete source of a project and check which functions have changed and which haven't. Or, of course, you already manage this in the file format that your IDE is using and trust that the file hasn't been touched externally - in which case you've really created an archive format for a multi-source environment and duplicated the functionality that the file system traditionally gives you. Moving functionality from the file system into each single user application is probably not a good idea.
and if you regularly build one version with debugging and one without (or several versions with different options) you'd prevent a massive amount of recompiling each time you change from one configuration to another.
Either the output has to change (like compiling with different options, or enabling debug printfs), and then you still have to recompile. Or it doesn't change, and then you already don't have to recompile today. For example, compile with debug symbols and then use strip, instead of a recompile, to create the version without debug information.
Agreed - existing tools were "evolved" by morons that can't decide on "standard standards", resulting in 6000 line configure scripts (and hideous hack-fests like "autoconf") to work around the insanity. Any excuse to throw this puke out and establish a "standard standard" that avoids the need for "6000 lines of evidence of poor tool design" is a good excuse.
http://xkcd.com/927/
Now let's think about software. In fact, let's think of building an emulator like Qemu; but (just for a few minutes) assume we aren't typical code monkeys and are capable of *design*. We can start by identifying the types of pieces - we'd need a "CPU emulator" piece, a "chipset" piece, several different types of "PCI device" pieces, etc. Then we can design a set of standards that describe how these pieces fit together (I like my modules to communicate via. asynchronous message passing, but you can use shared library "plug ins" if you like). Now that we've got usable standards for how pieces fit together; I can write a "CPU emulator" piece that interprets instructions (like Bochs), you can write a "CPU emulator" piece that does JIT (like Qemu) and someone else can write a "CPU emulator" piece that uses Intel's "VT-x" hardware virtualisation. Someone else might write a "generic ISA" chipset piece, some might write a "Intel 945 chipset" piece and someone else might write an "AMD opteron chipset" piece. More people might create a "SATA controller" piece, a "realtek ethernet card" piece, a "OHCI USB controller" piece, etc.
Right, so you're not really moving towards single-source, but more or less keeping the current structure. You're really just turning what is function calls today into RPCs, making the interfaces between the modules more expensive, but certainly not achieving any of Antti's goals like having one source file for the whole thing that the user sees.
It doesn't have to be an emulator though. Something like "LibreOffice" could be a "spreadsheet front-end" piece, a "word processor front-end" piece, a "spell checker" piece, an "expression evaluator" piece, etc; where several competing versions of all of these pieces are written by different people and are inter-changeable.
This requires stable ABIs, which take a massive effort to maintain. Do you think it's really worth this effort, even for interfaces that are really internal to one application?

Many of your thoughts seem to be based on the assumption that if people only wanted to, they would create a perfect design that not only works for the initial version of the program (which is unrealistic enough) but also stays the same through all extensions and requirement changes of the project (well, I don't have to say anything about that, right?). This makes many of your points fundamentally flawed.
Developer of tyndur - community OS of Lowlevel (German)
BMW
Member
Posts: 286
Joined: Mon Nov 05, 2012 8:31 pm
Location: New Zealand

Re: Single source file

Post by BMW »

I quite like the idea.

However, I think there should be a "pre-compiler" which just compiles all your stuff into one C file. If there are conflicting symbols it can rename some. Then you can distribute the one C file.
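For example (made-up names, just to illustrate the renaming), if two input files both define a static helper(), the merged output could look like this:

Code: Select all
/* Hypothetical output of a "merge into one C file" tool -- illustrative only. */

/* --- from parser.c ---------------------------------------------------- */
static int parser_helper(int x)    /* was: static int helper(int x) */
{
    return x * 2;
}
int parse(int x) { return parser_helper(x); }

/* --- from lexer.c ----------------------------------------------------- */
static int lexer_helper(int c)     /* was: static int helper(int c) */
{
    return c + 1;
}
int lex(int c) { return lexer_helper(c); }

SQLite already distributes itself this way - one big amalgamated C file generated from many sources - so the idea isn't far-fetched.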

Your idea of splitting the program up (like Microsoft Office -> Excel, Word, Powerpoint etc.) wouldn't always work though. For example, GCC. GCC takes about an hour to build on my machine. But I don't think it would be a good idea to split GCC up...
Currently developing Lithium OS (LiOS).

Recursive paging saves lives.
"I want to change the world, but they won't give me the source code."
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!
Contact:

Re: Single source file

Post by Brendan »

Hi,
Kevin wrote:
Brendan wrote:Compiling is fast (e.g. for some languages it's limited by disk IO speed). What takes time isn't compiling, but optimising.
Okay, I agree. I oversimplified and called the whole thing "compiling", from reading the source file to linking the final binary. This was a bit sloppy, but I think it's still what Antti was asking for.
In an attempt to solve this all modern tool chains support some form of "link time optimisation"; but if you ask anyone that's used this they'll tell you it's extremely slow (especially for large projects)
[...]
You're still failing to see anything beyond the lame and broken existing tools that were designed to solve ancient resource constraints (not enough RAM to have compiler and all source code in memory at once) instead of being designed to do the job properly.
Right, so first of all you are failing to see that I'm replying to Antti who was talking about C, and not about your hypothetical OS that will probably never become reality.
I'm not sure it matters - Antti (like me) is at least thinking about doing things differently. My main point is that you can't do "slightly different" and expect good results; and you do need to be prepared to design tools to suit.
Kevin wrote:The other thing is that you seem to have missed the connection here: LTO is slow because you don't only need enough RAM to hold the compiler and all the source code (which is already a lot for large projects), but also all of the temporary data used for optimisation, and the time complexity of most optimisations doesn't grow linearly either. Which in effect means that your resource constraints are very real and not ancient at all.

In fact, compiling the 8 KLOC source file target-i386/translate.c in qemu has more than once caused trouble because of the resource requirements when compiling with -O2. Without LTO. I don't even want to think about what would happen if you tried to compile all of qemu at once.
If you've got 8 GiB of RAM or more and you're struggling to compile an 8 KLOC source file; then do you honestly think it's fair to say "8 KLOC is too much!" instead of wondering why existing tools are so crappy that they choke? More realistic is that there's plenty of resources, so people use "make -j" to compile many files in parallel anyway.
Kevin wrote:
For existing tool-chains; object files are used as a kind of half-baked cache to store the "pre whole program optimisation" results of compiling individual files (note: "half-baked" due to the failure to make it transparent and the inability to cache more than one version of each object file). With a tool-chain designed for "single source file"; there's no reason why you can't cache "pre whole program optimisation" results with much finer granularity (e.g. individual functions rather than whole "object files"). This can be done properly; so that it's transparent to the user (e.g. avoids the need for external "cache management tools" like 'make'), and it can also handle multiple versions simultaneously. In this way if someone only changes a few functions you'd compile less with single-source than you would've with multiple source files;
Single-source means that you at least have to read in the complete source of a project and check which functions have changed and which haven't. Or, of course, you already manage this in the file format that your IDE is using and trust that the file hasn't been touched externally - in which case you've really created an archive format for a multi-source environment and duplicated the functionality that the file system traditionally gives you. Moving functionality from the file system into each single user application is probably not a good idea.
Who cares - "each single user application" is one IDE and one compiler, which both need to agree on an established file format (even if it is "plain text with a strict grammar") anyway.
Kevin wrote:
and if you regularly build one version with debugging and one without (or several versions with different options) you'd prevent a massive amount of recompiling each time you change from one configuration to another.
Either the output has to change (like compiling with different options, or enabling debug printfs), and then you still have to recompile. Or it doesn't change, and then you already don't have to recompile today. For example, compile with debug symbols and then use strip, instead of a recompile, to create the version without debug information.
You've missed the point. To find the point, try compiling something with "-O2" and then compile again with "-Os"; then modify one comment in a random source file and compile with "-O2" again. 99.99% should still be "cached" from the first time you compiled with "-O2" because you only changed one irrelevant comment since it was compiled with "-O2" last.
Kevin wrote:
Agreed - existing tools were "evolved" by morons that can't decide on "standard standards", resulting in 6000 line configure scripts (and hideous hack-fests like "autoconf") to work around the insanity. Any excuse to throw this puke out and establish a "standard standard" that avoids the need for "6000 lines of evidence of poor tool design" is a good excuse.
http://xkcd.com/927/
It pleases me that other OSs will continue to have these "lack of standard standards" problems while my OS won't.
Kevin wrote:
Now let's think about software. In fact, let's think of building an emulator like Qemu; but (just for a few minutes) assume we aren't typical code monkeys and are capable of *design*. We can start by identifying the types of pieces - we'd need a "CPU emulator" piece, a "chipset" piece, several different types of "PCI device" pieces, etc. Then we can design a set of standards that describe how these pieces fit together (I like my modules to communicate via asynchronous message passing, but you can use shared library "plug ins" if you like). Now that we've got usable standards for how pieces fit together; I can write a "CPU emulator" piece that interprets instructions (like Bochs), you can write a "CPU emulator" piece that does JIT (like Qemu) and someone else can write a "CPU emulator" piece that uses Intel's "VT-x" hardware virtualisation. Someone else might write a "generic ISA" chipset piece, someone might write an "Intel 945 chipset" piece and someone else might write an "AMD opteron chipset" piece. More people might create a "SATA controller" piece, a "realtek ethernet card" piece, an "OHCI USB controller" piece, etc.
Right, so you're not really moving towards single-source, but more or less keeping the current structure. You're really just turning what is function calls today into RPCs, making the interfaces between the modules more expensive, but certainly not achieving any of Antti's goals like having one source file for the whole thing that the user sees.
I'd be splitting applications into modules and having one source file per module. For example, Qemu might become about 50 modules (with 50 source files) rather than one application (with almost 6000 files). Obviously code can call functions in the same module using normal function calls - it would be stupid to think otherwise (and I can't see where you got the idea that all function calls would use RPC).
Kevin wrote:
It doesn't have to be an emulator though. Something like "LibreOffice" could be a "spreadsheet front-end" piece, a "word processor front-end" piece, a "spell checker" piece, an "expression evaluator" piece, etc; where several competing versions of all of these pieces are written by different people and are inter-changeable.
This requires stable ABIs, which take a massive effort to maintain. Do you think it's really worth this effort, even for interfaces that are really internal to one application?

Many of your thoughts seem to be based on the assumption that if people only wanted to, they would create a perfect design that not only works for the initial version of the program (which is unrealistic enough) but also stays the same through all extensions and requirement changes of the project (well, I don't have to say anything about that, right?). This makes many of your points fundamentally flawed.
Many of your thoughts seem to indicate under-developed problem solving skills. You've identified a potential problem ("the design of interfaces between modules must be perfect"). As an exercise (intended to improve your weak problem solving skills) try to think of solutions for this problem. Note: I can think of at least 2 solutions.


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
Kevin
Member
Posts: 1071
Joined: Sun Feb 01, 2009 6:11 am
Location: Germany
Contact:

Re: Single source file

Post by Kevin »

Brendan wrote:If you've got 8 GiB of RAM or more and you're struggling to compile an 8 KLOC source file; then do you honestly think it's fair to say "8 KLOC is too much!" instead of wondering why existing tools are so crappy that they choke? More realistic is that there's plenty of resources, so people use "make -j" to compile many files in parallel anyway.
Did it never occur to you that it might depend on the content of an 8 KLOC file how many resources are necessary to compile and optimise it? You just can't cheat around algorithmic complexity. Either you do the optimisation or you don't. You can decide that the optimisation is crappy if it needs time and do away with it, but then you'll get unoptimised (crappy) code.
Who cares - "each single user application" is one IDE and one compiler, which both need to agree on an established file format (even if it is "plain text with a strict grammar") anyway.
If you think this is a good design pattern, you won't only do it for the IDE and compiler, but for most other applications as well.

Anyway, if you don't care that you effectively have (archived) files again, we're not discussing single-source projects any more. Because they aren't really single-source.
You've missed the point. To find the point, try compiling something with "-O2" and then compile again with "-Os"; then modify one comment in a random source file and compile with "-O2" again. 99.99% should still be "cached" from the first time you compiled with "-O2" because you only changed one irrelevant comment since it was compiled with "-O2" last.
Right, so you changed the granularity in which you detect code changes. So how is this connected to single-source?
It pleases me that other OSs will continue to have these "lack of standard standards" problems while my OS won't.
It's portable applications that have the problem, not OSes. Your OS probably won't contribute much to the problem because simply nobody cares if their program runs on your OS, but as soon as they do care, you're part of the "lack of standard standards" problem.
I'd be splitting applications into modules and having one source file per module. For example, Qemu might become about 50 modules (with 50 source files) rather than one application (with almost 6000 files). Obviously code can call functions in the same module using normal function calls - it would be stupid to think otherwise (and I can't see where you got the idea that all function calls would use RPC).
It would be a little more than 50, but okay. Still that's not the single-source thing Antti was talking about, where one user-visible application is one source file. (And I never said you'd lose all local function calls, just the inter-module ones - which are frequent enough. It would probably kill performance.)
Many of your thoughts seem to indicate under-developed problem solving skills. You've identified a potential problem ("the design of interfaces between modules must be perfect"). As an exercise (intended to improve your weak problem solving skills) try to think of solutions for this problem. Note: I can think of at least 2 solutions.
Without stable requirements you can't have a stable design, no matter how you do it. Your two solutions are probably not solving this problem (they may be solving a somewhat related problem, though).

Oh, and my problem solving seems to be good enough at least to get things done. What can your OS be used for today?
Developer of tyndur - community OS of Lowlevel (German)
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!
Contact:

Re: Single source file

Post by Brendan »

Hi,
Kevin wrote:
Brendan wrote:If you've got 8 GiB of RAM or more and you're struggling to compile an 8 KLOC source file; then do you honestly think it's fair to say "8 KLOC is too much!" instead of wondering why existing tools are so crappy that they choke? More realistic is that there's plenty of resources, so people use "make -j" to compile many files in parallel anyway.
Did it never occur to you that it might depend on the content of an 8 KLOC file how many resources are necessary to compile and optimise it? You just can't cheat around algorithmic complexity. Either you do the optimisation or you don't. You can decide that the optimisation is crappy if it needs time and do away with it, but then you'll get unoptimised (crappy) code.
Before I made that comment I looked at "target-i386/translate.c" expecting to see an ugly mess. I was wrong - it's mostly just plain static functions followed by a (large) nested switch thing. Nothing special, nothing complex, and nothing a decent compiler shouldn't be able to handle easily on any "desktop 80x86" computer that's less than 20 years old. You've stated it's a problem for GCC, but you've failed to investigate the cause of the problem, and simply assumed that because it's a problem for GCC it's a problem for every compiler that can possibly exist. That's not a rational assumption.
Kevin wrote:
Who cares - "each single user application" is one IDE and one compiler, which both need to agree on an established file format (even if it is "plain text with a strict grammar") anyway.
If you think this is a good design pattern, you won't only do it for the IDE and compiler, but for most other applications as well.
You're right - instead of having 20 files for one picture I'll just have one file; and instead of having a separate file for each frame of a video I'll just have one video file; and instead of having a separate file for each page of a word processing document I'll just have one file for the entire document. Of course this is what most OSs do for most types of data anyway, so I didn't think it was worth mentioning.
Kevin wrote:Anyway, if you don't care that you effectively have (archived) files again, we're not discussing single-source projects any more. Because they aren't really single-source.
Soon you'll be trying to pretend that a text file is really just thousands of little files containing one character each. I think what you're trying to say is it can be very similar to multiple files and that therefore you were wrong when you thought it had to be worse.
Kevin wrote:
You've missed the point. To find the point, try compiling something with "-O2" and then compile again with "-Os"; then modify one comment in a random source file and compile with "-O2" again. 99.99% should still be "cached" from the first time you compiled with "-O2" because you only changed one irrelevant comment since it was compiled with "-O2" last.
Right, so you changed the granularity in which you detect code changes. So how is this connected to single-source?
You were attempting to suggest that single-source was worse for compile times because you couldn't just recompile the pieces that changed and had to compile the entire thing. I'm only showing that it can be far superior to the "multiple source file mess".
Kevin wrote:
It pleases me that other OSs will continue to have these "lack of standard standards" problems while my OS won't.
It's portable applications that have the problem, not OSes. Your OS probably won't contribute much to the problem because simply nobody cares if their program runs on your OS, but as soon as they do care, you're part of the "lack of standard standards" problem.
I only really care about my OS; and my OS will force people to use a single set of open standards for all things. If people port my standards to other OSs (and make the "lack of standard standards" problem on those other OSs worse) then that would be other OS developer's problem and not my problem.
Kevin wrote:
I'd be splitting applications into modules and having one source file per module. For example, Qemu might become about 50 modules (with 50 source files) rather than one application (with almost 6000 files). Obviously code can call functions in the same module using normal function calls - it would be stupid to think otherwise (and I can't see where you got the idea that all function calls would use RPC).
It would be a little more than 50, but okay. Still that's not the single-source thing Antti was talking about, where one user-visible application is one source file. (And I never said you'd lose all local function calls, just the inter-module ones - which are frequent enough. It would probably kill performance.)
Antti seemed to be saying similar things, pointing out that an entire application (from the user's point of view) is rarely one executable. Even simple applications on boring old *nix clones are typically implemented using several shared libraries. You were talking as if (e.g.) KDE or LibreOffice are implemented as a single executable.

For performance; look at my signature and really think how asynchronous message passing fits in with this.
Kevin wrote:
Many of your thoughts seem to indicate under-developed problem solving skills. You've identified a potential problem ("the design of interfaces between modules must be perfect"). As an exercise (intended to improve your weak problem solving skills) try to think of solutions for this problem. Note: I can think of at least 2 solutions.
Without stable requirements you can't have a stable design, no matter how you do it. Your two solutions are probably not solving this problem (they may be solving a somewhat related problem, though).
Well, that's a total of zero solutions. Should I give you another chance to improve your weak problem solving skills (or another chance to show that they aren't weak); or should I give up and spoon feed you solutions that you should've been able to think of when you first thought of the potential "the design of interfaces between modules must be perfect" problem?
Kevin wrote:Oh, and my problem solving seems to be good enough at least to get things done.
I don't doubt that you're able to get things done; but re-implementing designs that you've read about in old university text books and seen in other people's software doesn't qualify as problem solving because all the problems have already been solved by other people.
Kevin wrote:What can your OS be used for today?
In 10 years time (after you've put a large amount of work into it), what will your OS be able to do that existing OSs don't already do today?

For these forums there are (roughly) 3 kinds of people. There are "learners", "followers" and "inventors".

I am an "inventor". I'm currently designing languages and file formats, and implementing a tool-chain that I'll use to create the next version of my OS (which obviously doesn't exist yet, and therefore does nothing). Being an inventor involves a lot more research, has a lot higher risk and takes a lot of trial and error; but it's the only way to truly innovate.

You're not a learner. I suspect you're a "follower". A follower is someone like Linus Torvalds, who follows the work done by previous "inventors". They don't innovate. They succeed (if they're extremely lucky) by having a "more favourable" implementation than all the other followers. Furthermore, I suspect that most of what you're following is 30 years old now, and that the chance of you ending up with a "more favourable" implementation than (e.g.) Windows, OS X or Linux (which all have a significant head start) is zero. Of course sooner or later you will come to understand that the path you're following is a dead end. When this happens you might begin making the transition from "follower" to "inventor" and start looking for ways to do things that OSs don't already do.

Basically; my OS currently does nothing, and in 15 years time you might catch up to where I am now. 8)


Cheers,

Brendan
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.
bluemoon
Member
Posts: 1761
Joined: Wed Dec 01, 2010 3:41 am
Location: Hong Kong

Re: Single source file

Post by bluemoon »

Brendan wrote:You're right - instead of having 20 files for one picture I'll just have one file; and instead of having a separate file for each frame of a video I'll just have one video file; and instead of having a separate file for each page of a word processing document I'll just have one file for the entire document. Of course this is what most OSs do for most types of data anyway, so I didn't think it was worth mentioning.
IMO source code is a bit different from word documents - there are usually multiple editors working on the source; it's more like a wiki.
As with a wiki, it does not matter to most people how the data is stored: in one file, multiple files, or a remote database.

The thing that matters to most people is how the toolchain works. Almost all toolchains and IDEs assume file-based organisation and store meta-info in project preferences; the other choice so obviously lacks tooling that there is nothing to make a sane comparison against.
iansjack
Member
Posts: 4706
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Single source file

Post by iansjack »

instead of having a separate file for each page of a word processing document I'll just have one file for the entire document
I think a better analogy might be to say "instead of having a separate file for each chapter of a book I'm writing I'll just have one file for the entire book".

It's certainly one way of working, and has some obvious advantages, but I think that most authors find the even more obvious disadvantages outweigh them. Truth is, when I work on a large program I tend to concentrate on one aspect of it at a time (having broken the problem down into a suitable set of modules). I find it easier to work on those modules as separate entities rather than as portions of a larger item.

This is increasingly the way that programming languages work, many of them insisting on a separate file for each class. Of course, it could be that all the professionals who design these languages are the ones who are out of step. But, given the choice between those with a proven body of work to their credit and people here who have nothing to show for their theory, I tend to go along with the consensus of experience.
Antti
Member
Posts: 923
Joined: Thu Jul 05, 2012 5:12 am
Location: Finland

Re: Single source file

Post by Antti »

It is starting to feel prehistoric to refer to the original post. I am definitely a "learner" right now, but I may skip the "follower" part in the future. That does not mean having ambitious plans or anything like that. On a smaller scale, it would be very interesting to "create" something that would be

that is the way to go. Your way of doing this is elegant. If you put serious effort into this, it could be successful. You have a good start.

One of the reasons for the first post (and the idea) is to simplify things. The more you read about and spend time with programming, the less you see how normal users understand computers and programs. I think one greatly simplifying thing when it comes to source code would be to bring it closer to users. If we had a source file that is "the program", it would be very nice. "Just click and run this program".

If I take a look at the programs currently installed on my computer, I would say that a big part of them are so small that they could very well be in "one source file". I want to say this again: I do not want to edit or even see a big chunk of code at one time. I want to see a maintainable unit of code when I do programming. This is the reason I said, at the very beginning, that we need to have "an IDE" that will make this possible.

I open my video files with a video player. The word documents, videos, etc. analogy was good.
Antti
Member
Posts: 923
Joined: Thu Jul 05, 2012 5:12 am
Location: Finland

Re: Single source file

Post by Antti »

iansjack wrote:Truth is, when I work on a large program I tend to concentrate on one aspect of it at a time (having broken the problem down into a suitable set of modules).
So do I, that is for sure. I do not understand why files are a superior way to separate them from each other. The file, as an abstraction, is very limited. What else do you have than a file name? Of course, you have directories. What if we had a database system in which each "file" has a rich set of attributes? Those could be used for several different purposes.
iansjack
Member
Posts: 4706
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Single source file

Post by iansjack »

Well, the obvious advantage of files over databases is that almost every OS supports files natively, but not databases. (I know that some OSs are database friendly, OS/400 for example, but it is very much the exception rather than the norm.) I like to work with the KISS principle; use the facilities that the OS provides, if they do the job, rather than introducing additional complexity in the form of a database. If I can't edit it with vi, I don't want to know.

And, of course, any half-decent OS has the ability to store meta-data along with a file; you don't need the extra structure of a database for that.
Antti
Member
Posts: 923
Joined: Thu Jul 05, 2012 5:12 am
Location: Finland

Re: Single source file

Post by Antti »

iansjack wrote:Well, the obvious advantage of files over databases is that almost every OS supports files natively, but not databases.
What if that "database" is implemented inside that single source file? Then almost every OS will support it natively. I agree that it can be difficult to edit with vi. You need an appropriate tool to edit it. However, if this idea were implemented with C (as prehistoric as that is, considering the points made in this thread) and strict meta-info headers, you could make some small modifications with vi.
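Just as a sketch of what I mean by "strict meta-info headers" (not an existing format, only an illustration): the file stays valid C and vi-editable, and the IDE parses the comment blocks to build its "database":

Code: Select all
/* Sketch only -- a made-up meta-info convention, not an existing format.
 * The @unit blocks are what the IDE would index; the code between them is
 * ordinary C that vi can still edit.                                        */

/* @unit: memory/allocator
 * @author: antti
 * @depends: memory/heap
 * @reviewed: yes
 */
void *allocate_block(unsigned long size);
void  free_block(void *block);

/* @unit: memory/heap
 * @author: antti
 * @tags: internal, no-export
 */
extern unsigned long heap_bytes_free;

The attributes give you the "rich set of attributes" from my earlier post without needing anything from the OS beyond an ordinary file.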
iansjack wrote:any half-decent OS has the ability to store meta-data along with a file
It is not portable enough.
iansjack
Member
Posts: 4706
Joined: Sat Mar 31, 2012 3:07 am
Location: Chichester, UK

Re: Single source file

Post by iansjack »

It is not portable enough.
It's a damn sight more portable than a file which needs a special program to edit it!