bzt wrote:No, much more like sad. Very sad. If you really meant it, OSDev is definitely in a decline. It is not funny at all when something so basic like the advantages of dynamic linking has to be explained on an OSDev forum.
I don't need it explained, I just don't think the advantages outweigh the disadvantages.
bzt wrote:No serious programmer would ever think for a moment that deduplication of functions would require more storage space or more memory.
Because miracles come for free. I'm not so naive. In order to support that, you need overhead (PLT, GOT, dynamic linker, PIC), and often that overhead is not worth it. Plus, in this day and age, optimizing for storage space or memory seems like a lost cause, when 16 GB of RAM costs about 100€. I'd rather optimize for time.
bzt wrote:I'm afraid you lack some required and pretty essential knowledge my friend.
Yes, please regale me with your stories, oh wise one. Could you try to be more arrogant next time?
bzt wrote:libjpeg is 603K.
So that is the hill you wish to do battle on? Alright.
cjpeg is a little program using libjpeg to compress an image. You have to trick its build system pretty hard into generating a statically linked version (what is the point of libtool if it just gets in the way of actually linking the application?), but I managed to generate a statically linked executable. With a PLT and a GOT inside -- I don't know how that works. No, it is not a PIE.
Anyway, the file contains four LOAD segments. Rounding the memory size of each up to the next 4 kB page boundary (because of page granularity) and adding those up yields 1116 kB.
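For anyone who wants to repeat the measurement, here is a minimal sketch of the method described above: read the program headers, keep the LOAD segments, round each memory size up to a page, and sum. It assumes a little-endian ELF64 file and a 4 kB page, and like the calculation above it ignores where a segment starts within its first page. The function names are mine, not from any tool.

```python
import struct

PT_LOAD = 1   # program header type for loadable segments
PAGE = 4096   # assumed page size


def page_round_up(n, page=PAGE):
    """Round n up to the next multiple of the page size."""
    return (n + page - 1) // page * page


def load_segment_sizes(data):
    """Return p_memsz of every PT_LOAD header in a little-endian ELF64 image."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("not a little-endian ELF64 file")
    (e_phoff,) = struct.unpack_from("<Q", data, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 0x36)
    sizes = []
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        (p_type,) = struct.unpack_from("<I", data, off)
        (p_memsz,) = struct.unpack_from("<Q", data, off + 0x28)
        if p_type == PT_LOAD:
            sizes.append(p_memsz)
    return sizes


def mapped_kb(data):
    """Sum of the page-rounded LOAD segment sizes, in kB."""
    return sum(page_round_up(s) for s in load_segment_sizes(data)) // 1024


# Usage (the path is just an example):
#   with open(".libs/cjpeg", "rb") as f:
#       print(mapped_kb(f.read()), "kB")
```

Running it over the executable and every library it pulls in gives the per-file figures to add up.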
Now for the dynamic executable. I could just score a quick victory here, because some genius set the alignment for all LOAD segments of libjpeg and all executables to 2 MB (meaning a single LOAD segment like that has outgrown the entire static application already), but I like to think I'm more professional than that. So, a quick recompilation later: cjpeg once again has four LOAD segments, weighing in at 72 kB. libjpeg has another four LOAD segments, at 492 kB together. The dynamic interpreter has another four LOAD segments (is that the new fad? I thought two of those ought to be enough for everyone), clocking in at 164 kB. And, finally, libc has another four LOAD segments, weighing 1792 kB in toto.
All in all, you have to map 2520 kB of virtual memory just to finish loading this file. The only saving grace is that 1792+164 kB of this are likely already in memory. Of which I have to immediately deduct 44 kB for being non-sharable data.
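The arithmetic behind those totals, spelled out as a small sketch. The inputs are the figures quoted above. The last two values are my own reading: I assume the 44 kB of non-sharable data comes out of the libc and ld.so portion, and I treat everything else as new to the process.

```python
# Page-rounded LOAD segment totals in kB, as quoted in the post.
static_kb = 1116
dynamic_kb = {"cjpeg": 72, "libjpeg": 492, "ld.so": 164, "libc": 1792}

total_dynamic_kb = sum(dynamic_kb.values())                      # 2520
likely_resident_kb = dynamic_kb["libc"] + dynamic_kb["ld.so"]    # 1956

# Assumption: the 44 kB of non-sharable data is part of libc + ld.so.
non_sharable_kb = 44
shared_at_best_kb = likely_resident_kb - non_sharable_kb         # 1912

# What is left to bring in for this process under that assumption.
remaining_kb = total_dynamic_kb - shared_at_best_kb              # 608

print(total_dynamic_kb, likely_resident_kb, shared_at_best_kb, remaining_kb)
```

This only covers mapped size. It says nothing about the time spent on lookups and relocations.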
bzt wrote:but since with static linking nothing can be shared,
And you claim I lack essential knowledge. With static linking, text is shared between all processes with the same executable. Which happens a lot. For instance, right now my laptop is running 28 instances of udevd (what are they DOING?), 5 gettys, 4 rdnssd, 4 dbus-daemon, 4 "Web Content" (which is a Firefox thing), etc. pp.
And all this sharing would be possible without any relocation processing (if I had linked everything statically, that is).
BTW: I am always claiming there is PIC overhead. Well, I have the static and shared libjpeg right here in front of me. And while the static library has a larger file size, this is mostly due to ELF header overhead (which the linker will strip away). Using "size", I can tell that the shared lib is 3.5% larger than the static one. That's for AMD64, by the way. I suppose for x86 it would be worse. (x86 PIC is horrible)
bzt wrote:Since every program is smaller by 602K,
And how many programs with jpeg support are you running simultaneously?
bzt wrote:you'll need less memory for the processes. Furthermore libjpeg is only loaded once, and just mapped into their address space (no additional RAM allocated). If GOT gets its own page, even then that's -600K per process.
Yes, I have noticed you chose a library with little changeable data (< 4kB). Well done.
bzt wrote:Yes it has to run, but you don't have to load an extra 602K from a slow disk for every process, just add an offset to 23 words
That is fuzzy enough to be true. However, it is not a constant offset.
cjpeg has 82 relocations of which 35 require a symbol lookup. My copy of libjpeg has 347 relocations of which 105 require a symbol lookup. ld.so has 49 relocations of which 9 require a symbol lookup, and finally libc.so has 1369 relocations, of which 77 require a symbol lookup. If LD_BIND_NOW is in effect (which has been suggested as a security hardening option for a long time now), that is 1847 relocations with 226 symbol lookups. Every time you start that application.
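A quick tally of those counts, in case anyone wants to check the sums. The numbers are the ones quoted above and the dictionary layout is just for illustration.

```python
# (total relocations, relocations needing a symbol lookup) per object,
# as quoted in the post.
relocs = {
    "cjpeg": (82, 35),
    "libjpeg": (347, 105),
    "ld.so": (49, 9),
    "libc.so": (1369, 77),
}

total_relocs = sum(total for total, _ in relocs.values())      # 1847
total_lookups = sum(lookup for _, lookup in relocs.values())   # 226

print(total_relocs, "relocations,", total_lookups, "with symbol lookup")
```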
bzt wrote:What? You don't package libraries with the program! Quite the contrary, with dynamic linking you separate the libs into their own packages, and the OS takes care of them (more precisely the package manager like apt, port, pacman, brew etc. and the dynamic linker).
On inspection, it turns out that most of the bundled libraries I had seen were actually specific to the application in question. I withdraw that complaint.
Also, the new fad is containers, where people not only bundle all libraries, but also the rest of the OS. It's all so wasteful.
bzt wrote:But hey, what if you take into consideration all aspects of shared text? Like shared functions require less storage space, shared mapping requires less RAM, loading less means faster start up?
As given above, I don't care much about optimizing for storage or memory, so that last one is the only interesting one for me. It only works, however, if you assume the cost of looking up all the required libraries is negligible. If disk transfers are truly that slow, then surely, having to look up a lot of directories, almost none of which exist, only to find that the desired library is in another castle, cannot be so simply ignored. I am not convinced that this:
execve(".libs/cjpeg", [".libs/cjpeg"], 0x7ffcc6868d80 /* 28 vars */) = 0
brk(NULL) = 0x559f64c00000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/opt/libjpeg-turbo/lib64/tls/haswell/x86_64/libjpeg.so.62", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/opt/libjpeg-turbo/lib64/tls/haswell/x86_64", 0x7ffca849ccc0) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/opt/libjpeg-turbo/lib64/tls/haswell/libjpeg.so.62", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/opt/libjpeg-turbo/lib64/tls/haswell", 0x7ffca849ccc0) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/opt/libjpeg-turbo/lib64/tls/x86_64/libjpeg.so.62", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/opt/libjpeg-turbo/lib64/tls/x86_64", 0x7ffca849ccc0) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/opt/libjpeg-turbo/lib64/tls/libjpeg.so.62", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/opt/libjpeg-turbo/lib64/tls", 0x7ffca849ccc0) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/opt/libjpeg-turbo/lib64/haswell/x86_64/libjpeg.so.62", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/opt/libjpeg-turbo/lib64/haswell/x86_64", 0x7ffca849ccc0) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/opt/libjpeg-turbo/lib64/haswell/libjpeg.so.62", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/opt/libjpeg-turbo/lib64/haswell", 0x7ffca849ccc0) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/opt/libjpeg-turbo/lib64/x86_64/libjpeg.so.62", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/opt/libjpeg-turbo/lib64/x86_64", 0x7ffca849ccc0) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/opt/libjpeg-turbo/lib64/libjpeg.so.62", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat("/opt/libjpeg-turbo/lib64", 0x7ffca849ccc0) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=160735, ...}) = 0
mmap(NULL, 160735, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7efd09239000
close(3) = 0
openat(AT_FDCWD, "/usr/lib/x86_64-linux-gnu/libjpeg.so.62", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\220<\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0644, st_size=428032, ...}) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7efd09237000
mmap(NULL, 2523160, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7efd08fce000
mprotect(0x7efd09035000, 2097152, PROT_NONE) = 0
mmap(0x7efd09235000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x67000) = 0x7efd09235000
close(3) = 0
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260A\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1824496, ...}) = 0
mmap(NULL, 1837056, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7efd08e0d000
mprotect(0x7efd08e2f000, 1658880, PROT_NONE) = 0
mmap(0x7efd08e2f000, 1343488, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x22000) = 0x7efd08e2f000
mmap(0x7efd08f77000, 311296, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x16a000) = 0x7efd08f77000
mmap(0x7efd08fc4000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1b6000) = 0x7efd08fc4000
mmap(0x7efd08fca000, 14336, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7efd08fca000
close(3) = 0
mmap(NULL, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7efd08e0a000
arch_prctl(ARCH_SET_FS, 0x7efd08e0a740) = 0
mprotect(0x7efd08fc4000, 16384, PROT_READ) = 0
mprotect(0x7efd09235000, 4096, PROT_READ) = 0
mprotect(0x559f630b8000, 4096, PROT_READ) = 0
mprotect(0x7efd09288000, 4096, PROT_READ) = 0
munmap(0x7efd09239000, 160735) = 0
is less work than just faulting in the pages from a statically linked executable.
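To put a number on the wasted probes, here is a rough sketch that counts syscalls and ENOENT results in strace output like the trace above. The regular expression is my own and only handles the simple one-line form shown here, not unfinished or resumed calls. The sample is a handful of lines from the trace with the locale-dependent error text left off.

```python
import re
from collections import Counter

# name(args...) = return [ERRNO]; only the simple one-line strace form.
LINE = re.compile(r"^(\w+)\(.*\)\s*=\s*(-?\w+)(?:\s+(E[A-Z]+))?")


def summarize(trace_lines):
    """Count calls and ENOENT failures per syscall name."""
    calls, misses = Counter(), Counter()
    for line in trace_lines:
        m = LINE.match(line.strip())
        if not m:
            continue  # skip anything that is not a plain syscall line
        name, _ret, errno_name = m.groups()
        calls[name] += 1
        if errno_name == "ENOENT":
            misses[name] += 1
    return calls, misses


SAMPLE = """\
access("/etc/ld.so.preload", R_OK) = -1 ENOENT
openat(AT_FDCWD, "/opt/libjpeg-turbo/lib64/tls/haswell/x86_64/libjpeg.so.62", O_RDONLY|O_CLOEXEC) = -1 ENOENT
stat("/opt/libjpeg-turbo/lib64/tls/haswell/x86_64", 0x7ffca849ccc0) = -1 ENOENT
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/usr/lib/x86_64-linux-gnu/libjpeg.so.62", O_RDONLY|O_CLOEXEC) = 3
"""

calls, misses = summarize(SAMPLE.splitlines())
print("calls:", dict(calls))
print("ENOENT:", dict(misses))
```

By my count, the full trace above has 48 syscalls, 17 of which come back with ENOENT before the library is finally found through ld.so.cache.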