My ELFs are so big

AndrewAPrice
Member
Posts: 2300
Joined: Mon Jun 05, 2006 11:00 pm
Location: USA (and Australia)

My ELFs are so big

Post by AndrewAPrice »

My userland C++ programs are so huge.

Simple GUI programs such as my calculator are 9.2MB! Non-GUI programs are ~700KB.

I compile with:

Code:

x86_64-elf-gcc -fverbose-asm -m64 -ffreestanding -nostdlib -nostdinc++ -mno-red-zone -c -std=c++20 -MD -MF <dep file> -DPERCEPTION -Doptimized_BUILD_ -fdata-sections -ffunction-sections -g -O3 -fomit-frame-pointer -isystem ../third_party/Libraries/libcxx/public -isystem <libraries> -D_GNU_SOURCE -D_LIBCPP_HAS_THREAD_API_C11 -DLIBCXXRT -D_LIBCPP_HAS_THREAD_API_PTHREAD -o <output> <source>

When compiling libraries (e.g. musl) I link together the object files with:

Code:

x86_64-elf-gcc-ar rvs -o <output> <object files>

I link with:

Code:

x86_64-elf-gcc -Wl,--gc-sections -O3 -g -s -nostdlib -nodefaultlibs -nolibc -nostartfiles -z max-page-size=1 -T userland.ld -o <output> -Wl,--start-group <object files and library archives> -Wl,--end-group -Wl,-lgcc

This is my linker script.

Does anyone have suggestions on how I can optimize my binary sizes further?

I think it's because I link with libcxx and musl, and GUI programs link with Skia which adds a dependency on libxml2, harfbuzz, freetype, libjpeg. I would have thought whole program optimization would be able to eliminate most of the dead code.
My OS is Perception.
eekee
Member
Posts: 891
Joined: Mon May 22, 2017 5:56 am
Location: Kerbin
Discord: eekee

Re: My ELFs are so big

Post by eekee »

Do the binaries include debug symbols? Stripping those (with `strip`) makes a huge difference to the on-disk size, though none at all to the loaded size. Basic strip usage is just `strip objfile...`, and the quickest way to see whether your binaries are unstripped is just to try stripping one.
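
A minimal sketch of that check, assuming a POSIX shell and a `strip` that understands the target format (with a cross toolchain this would presumably be the prefixed `x86_64-elf-strip`). The function name `strip_savings` and the `STRIP` variable are made up for illustration. It strips a temporary copy, so the original binary is left untouched:

```shell
# Hypothetical helper: report how many bytes stripping would save.
# STRIP defaults to the host "strip"; override it for a cross toolchain.
strip_savings() {
    bin="$1"
    if [ ! -f "$bin" ]; then
        echo "strip_savings: no such file: $bin" >&2
        return 1
    fi
    tmp="$(mktemp)" || return 1
    cp "$bin" "$tmp"
    # Strip the copy only; bail out if the tool rejects the file.
    if ! "${STRIP:-strip}" "$tmp"; then
        rm -f "$tmp"
        return 1
    fi
    before=$(( $(wc -c < "$bin") ))
    after=$(( $(wc -c < "$tmp") ))
    rm -f "$tmp"
    echo "before=$before after=$after saved=$((before - after))"
}
```

Usage would be something like `STRIP=x86_64-elf-strip strip_savings calculator`, where `calculator` stands in for one of your binaries. If `saved` comes out near zero, the file was already stripped.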
Kaph — a modular OS intended to be easy and fun to administer and code for.
"May wisdom, fun, and the greater good shine forth in all your work." — Leo Brodie
Octocontrabass
Member
Posts: 5568
Joined: Mon Mar 25, 2013 7:01 pm

Re: My ELFs are so big

Post by Octocontrabass »

AndrewAPrice wrote:-mno-red-zone
Is there any particular reason why you're not using the red zone in userspace?
AndrewAPrice wrote:-O3
You're enabling optimizations that can increase the size of your binaries. I don't think this explains the several extra megabytes, but it might be worth comparing against "-O2" or "-Os" to see how much of a difference it makes.
AndrewAPrice wrote:I would have thought whole program optimization would be able to eliminate most of the dead code.
But you're not using whole-program optimization...
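
For reference, a rough sketch of what whole-program (link-time) optimization would involve with GCC: `-flto` generally has to be given both when compiling and when linking, and static libraries holding LTO objects need a plugin-aware archiver such as the `gcc-ar` wrapper. The file names `main.cc`, `main.o` and `app` are placeholders, the choice of `-Os` is only one option, and the snippet just prints the commands rather than running them:

```shell
# Dry run: print the commands instead of executing them.
# CC can be overridden; x86_64-elf-gcc is the compiler used in this thread.
CC="${CC:-x86_64-elf-gcc}"

# -flto must appear in the compile step...
compile_cmd="$CC -flto -Os -ffunction-sections -fdata-sections -c main.cc -o main.o"
# ...and again in the link step, together with the optimization level.
link_cmd="$CC -flto -Os -Wl,--gc-sections -nostdlib -T userland.ld -o app main.o"

echo "$compile_cmd"
echo "$link_cmd"
```

Without `-flto`, `-Wl,--gc-sections` can only discard whole unreferenced sections; it cannot optimize across translation units.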
nullplan
Member
Posts: 1790
Joined: Wed Aug 30, 2017 8:24 am

Re: My ELFs are so big

Post by nullplan »

AndrewAPrice wrote:My userland C++ programs are so huge.
Insert obligatory C programmer's "told-you-so" here. ;-)
AndrewAPrice wrote:When compiling libraries (e.g. musl) I link together the object files with:
That's not linking, that's archiving. And it has little impact on the size of the final executable, except for the fact that linkers act differently on archives than on object files on command line.
AndrewAPrice wrote:I think it's because I link with libcxx and musl
At least musl is optimized for static linking.
AndrewAPrice wrote:and GUI programs link with Skia which adds a dependency on libxml2, harfbuzz, freetype, libjpeg.
Yikes! Well, with those dependencies, it is no wonder you are getting huge binaries.

libxml2 for example contains code to parse XML files. The linker cannot remove the various sub-cases that are not normally needed (e.g. CDATA is normally not used), nor can it remove stuff like the HTTP client (which is used to retrieve schema files if they start with "http://"). That can only be done at configure time. harfbuzz is apparently yet another parser, so it also has to contain the code for everything even if only a subset is needed. As does freetype.

I'm guessing skia is optimized for dynamic linking.
Carpe diem!
thewrongchristian
Member
Posts: 426
Joined: Tue Apr 03, 2018 2:44 am

Re: My ELFs are so big

Post by thewrongchristian »

Octocontrabass wrote:
AndrewAPrice wrote:-mno-red-zone
Is there any particular reason why you're not using the red zone in userspace?
I'd also question the use of "-ffreestanding -nostdlib -nostdinc++" for a user level programme.

In fact, the flags look like they are flags for compiling kernel files.
AndrewAPrice
Member
Posts: 2300
Joined: Mon Jun 05, 2006 11:00 pm
Location: USA (and Australia)

Re: My ELFs are so big

Post by AndrewAPrice »

nullplan wrote: libxml2 for example contains code to parse XML files. The linker cannot remove the various sub-cases that are not normally needed (e.g. CDATA is normally not used), nor can it remove stuff like the HTTP client (which is used to retrieve schema files if they start with "http://"). That can only be done at configure time. harfbuzz is apparently yet another parser, so it also has to contain the code for everything even if only a subset is needed. As does freetype.

I'm guessing skia is optimized for dynamic linking.
I think you're right. As soon as you have the ability to load an image from a path, even if all you want to show is a humble .bmp, the library doesn't know until runtime whether it's a bitmap, JPEG, PNG, or SVG, so it has to pull in the compression library dependencies for each one. For vector formats like SVG it also needs to know how to load fonts, rasterize text, and perform every visual effect under the sun, and you end up with just about the entire library linked in.
My OS is Perception.
AndrewAPrice
Member
Posts: 2300
Joined: Mon Jun 05, 2006 11:00 pm
Location: USA (and Australia)

Re: My ELFs are so big

Post by AndrewAPrice »

thewrongchristian wrote:
Octocontrabass wrote:
AndrewAPrice wrote:-mno-red-zone
Is there any particular reason why you're not using the red zone in userspace?
I'd also question the use of "-ffreestanding -nostdlib -nostdinc++" for a user level programme.

In fact, the flags look like they are flags for compiling kernel files.
I still need those flags (other than the no red zone flag) as I'm providing custom C and C++ standard libraries?
My OS is Perception.
thewrongchristian
Member
Posts: 426
Joined: Tue Apr 03, 2018 2:44 am

Re: My ELFs are so big

Post by thewrongchristian »

AndrewAPrice wrote:
thewrongchristian wrote:
Octocontrabass wrote: Is there any particular reason why you're not using the red zone in userspace?
I'd also question the use of "-ffreestanding -nostdlib -nostdinc++" for a user level programme.

In fact, the flags look like they are flags for compiling kernel files.
I still need those flags (other than the no red zone flag) as I'm providing custom C and C++ standard libraries?
My understanding is you need -ffreestanding for kernels because kernels are loaded by a boot loader, and need startup code to initialise the CPU state correctly. The kernel is not started like a normal user process.

But your server process is just a normal user process (presumably), and should probably be started in a standard manner even if you're not using std C++ library.

But looking at your flags again, are you even excluding the C++ std library? You have -nostdlib (which would exclude standard libraries like startup libs and libgcc,) but you have no -nostdlib++, so you're actually including the C++ std library if I'm not mistaken. You might not end up pulling them in though, if you don't reference them.

If you don't use the C++ std library, then it won't be pulled in. If you're providing your own C++ library, fair play, but it seems like extra work excluding the std library unnecessarily.

I will admit, though, that this is all critique from a point of ignorance. My day job is MSVC C++, and it's a bastard mix of MFC and C++ std library, Urgh!

I would like my user space to have a relatively standard C++ library, but I've only got minimal C user space so far, so I'm curious what your motivation for avoiding the std library is/was? Was it a porting issue you didn't want to deal with, or perhaps you just fancied having a go implementing your own C++ library (understandable, we're all here for similar reasons?)
AndrewAPrice
Member
Posts: 2300
Joined: Mon Jun 05, 2006 11:00 pm
Location: USA (and Australia)

Re: My ELFs are so big

Post by AndrewAPrice »

thewrongchristian wrote: I would like my user space to have a relatively standard C++ library, but I've only got minimal C user space so far, so I'm curious what your motivation for avoiding the std library is/was? Was it a porting issue you didn't want to deal with, or perhaps you just fancied having a go implementing your own C++ library (understandable, we're all here for similar reasons?)
I'm cross compiling for my OS from Mac and Windows. I want the compiler to use the musl and libcxx I ported to my OS (using my system calls and RPCs) and not the standard libraries for my Mac and Windows hosts.

Am I doing something wrong?
My OS is Perception.
thewrongchristian
Member
Posts: 426
Joined: Tue Apr 03, 2018 2:44 am

Re: My ELFs are so big

Post by thewrongchristian »

AndrewAPrice wrote:
thewrongchristian wrote: I would like my user space to have a relatively standard C++ library, but I've only got minimal C user space so far, so I'm curious what your motivation for avoiding the std library is/was? Was it a porting issue you didn't want to deal with, or perhaps you just fancied having a go implementing your own C++ library (understandable, we're all here for similar reasons?)
I'm cross compiling for my OS from Mac and Windows. I want the compiler to use the musl and libcxx I ported to my OS (using my system calls and RPCs) and not the standard libraries for my Mac and Windows hosts.

Am I doing something wrong?
Your cross compiler should already be independent of your host standard libraries. If it is not, then you've done your cross compiling wrong.

I use a bare i686-elf target for my cross compiling, using --sysroot=<dir> as the base directory from which libraries and include files are located.

I haven't yet done an OS specific cross-compiler, as described in:
- OS_Specific_Toolchain
- Hosted_GCC_Cross-Compiler

With an OS-specific cross compiler, I could (I think) drop the --sysroot argument, and more easily port existing software.

I suggest you find out why you need those arguments, and fix that.

Perhaps you just need --sysroot.
AndrewAPrice
Member
Posts: 2300
Joined: Mon Jun 05, 2006 11:00 pm
Location: USA (and Australia)

Re: My ELFs are so big

Post by AndrewAPrice »

thewrongchristian wrote: I haven't yet done an OS specific cross-compiler, as described in:
- OS_Specific_Toolchain
- Hosted_GCC_Cross-Compiler
I didn't follow those guides or create an OS-specific configuration of GCC. I just built GCC to include "x86_64-elf" and then my build system includes my ported libcxx and musl, and it works, but I don't believe that's the cause of my large files.
My OS is Perception.
thewrongchristian
Member
Posts: 426
Joined: Tue Apr 03, 2018 2:44 am

Re: My ELFs are so big

Post by thewrongchristian »

AndrewAPrice wrote:
thewrongchristian wrote: I haven't yet done an OS specific cross-compiler, as described in:
- OS_Specific_Toolchain
- Hosted_GCC_Cross-Compiler
I didn't follow those guides or create an OS-specific configuration of GCC. I just built GCC to include "x86_64-elf" and then my build system includes my ported libcxx and musl, and it works, but I don't believe that's the cause of my large files.
I thought we'd basically established that, the issue being the dependencies being brought in to handle all the details of XML?

Also, there are debug symbols included in the executable. You can strip those. Did that make much difference?

Also, you compile with -O3, which can increase code size by inlining code. You could, for example, turn off inlining to prevent that, or perhaps optimise for size (-Os).

But in the end, are you actually storage space constrained? Debug symbols shouldn't affect runtime memory footprint if you've not got a debugger attached to the binary (the code basically doesn't reference the debug symbols, they won't be loaded in normal running.)
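
One way to check this, sketched under the assumption that binutils' `size` is available (for the cross toolchain it would likely be installed as `x86_64-elf-size`): `size` reports the text, data and bss totals, which is roughly what occupies memory at runtime, and debug sections are not counted. The helper name `loaded_bytes` is made up for illustration; it sums the first three columns of `size`'s default (Berkeley) output read from standard input:

```shell
# Hypothetical helper: sum text + data + bss from Berkeley-format
# `size` output on stdin (the first line is the column header).
loaded_bytes() {
    awk 'NR > 1 { total += $1 + $2 + $3 } END { print total + 0 }'
}
```

Something like `x86_64-elf-size calculator | loaded_bytes` can then be compared against `wc -c < calculator` (with `calculator` as a placeholder binary name) to see how much of the file is actually loadable.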
AndrewAPrice
Member
Posts: 2300
Joined: Mon Jun 05, 2006 11:00 pm
Location: USA (and Australia)

Re: My ELFs are so big

Post by AndrewAPrice »

I went down the path of whole program optimization. My attempts to use -flto with GCC and binutils ld failed, as I was getting issues with ld saying it's missing extensions. I tried rebuilding binutils multiple times to no success. I could have eventually solved it but in the moment I picked my battles and decided to try out Clang/LLVM.

Well, that was a battle in itself, but an easier one. For example, I learnt that there's no dedicated assembler (I tried llvm-mc but then saw it wasn't producing valid object files) and that clang is the frontend to LLVM's compiler and assembler (you can even pass --language=assembler-with-cpp -masm=intel to reproduce nasm.) I also learnt clang mangles "int main()" but not "int main(int, char**)". So, small battles. I eventually got it to work.

In the end, a simple GUI program (still linked with Skia, the XML libraries, etc) went from 8-9MB to about 550KB.

As to why I actually care: my OS is built around micro services. 20 running services utilizing half a MB is 10MB. 20 running services utilizing 8MB is 160MB. I guess you can say "buy more RAM" but I want to be a good steward of resources.
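
Spelling out that arithmetic as a quick shell calculation (sizes in KB, taking "half a MB" as 512KB and assuming each service's whole binary is resident, which is a simplification):

```shell
services=20
small_kb=512     # about half a MB per service
large_kb=8192    # 8MB per service

# Total footprint in MB for each case.
small_total_mb=$(( services * small_kb / 1024 ))
large_total_mb=$(( services * large_kb / 1024 ))

echo "$services small services: ${small_total_mb}MB"
echo "$services large services: ${large_total_mb}MB"
```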
My OS is Perception.
Octocontrabass
Member
Posts: 5568
Joined: Mon Mar 25, 2013 7:01 pm

Re: My ELFs are so big

Post by Octocontrabass »

AndrewAPrice wrote:My attempts to use -flto with GCC and binutils ld failed, as I was getting issues with ld saying it's missing extensions.
Last time I set up GCC for LTO, I recall having issues getting GCC's build system to install the LTO plugin. That was a while ago, though, so I don't remember exactly what fixed it.
thewrongchristian
Member
Posts: 426
Joined: Tue Apr 03, 2018 2:44 am

Re: My ELFs are so big

Post by thewrongchristian »

AndrewAPrice wrote:I went down the path of whole program optimization. My attempts to use -flto with GCC and binutils ld failed, as I was getting issues with ld saying it's missing extensions. I tried rebuilding binutils multiple times to no success. I could have eventually solved it but in the moment I picked my battles and decided to try out Clang/LLVM.

Well, that was a battle in itself, but an easier one. For example, I learnt that there's no dedicated assembler (I tried llvm-mc but then saw it wasn't producing valid object files) and that clang is the frontend to LLVM's compiler and assembler (you can even pass --language=assembler-with-cpp -masm=intel to reproduce nasm.) I also learnt clang mangles "int main()" but not "int main(int, char**)". So, small battles. I eventually got it to work.

In the end, a simple GUI program (still linked with Skia, the XML libraries, etc) went from 8-9MB to about 550KB.
Interesting. What made the difference in the end? Was it just recompiling with clang? Or did you end up stripping the resulting binaries?
AndrewAPrice wrote: As to why I actually care: my OS is built around micro services. 20 running services utilizing half a MB is 10MB. 20 running services utilizing 8MB is 160MB. I guess you can say "buy more RAM" but I want to be a good steward of resources.
Fair enough, but it's unlikely all of that 8MB is actually mapped in. You only need to map in the code that is executed (assuming demand paging), so if the bulk of the files was debug symbols that got removed, then you'll have saved little in the way of memory usage.