Getting the intermediate representation in gcc

Post by Srowen »

Is it possible to get an intermediate representation of my code when compiling with one of the compilers in the GCC suite?

For example, if I have a program written in C and I compile it with gcc, I would like to get an intermediate representation created by the compiler, instead of the ELF or object file, as output.

I've read the documentation, but I couldn't find a simple option that gives me what I want.

Thanks for the replies!

Post by JamesM »

Hi,

AFAIK, there is no easy way to get the GIMPLE IR - this was due to a policy decision by the GCC developers.

Can anyone remember more about that than me?

James

Post by skyking »

There are various debug options that let you produce debugging dumps after various passes, but I'd guess that you have no use for them (otherwise you would already know about them). The intermediate representation is internal to the compiler and has little practical use outside it (besides debugging the compiler), AFAIK.
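
For reference, the debugging dumps mentioned here can be requested with GCC's -fdump-tree-* and -fdump-rtl-* families of options. What follows is only a minimal sketch: the file name sum.c and the function sum_to are made up for illustration, and the names and numbering of the resulting dump files differ between GCC versions.

```c
/* sum.c - hypothetical example file, used only to have something to dump.
 *
 * Dump the program after every tree-level (GENERIC/GIMPLE) pass:
 *     gcc -c -O2 -fdump-tree-all sum.c
 * Dump the program after every RTL-level pass:
 *     gcc -c -O2 -fdump-rtl-all sum.c
 *
 * Each option writes one text file per pass; the exact file names
 * depend on the GCC version in use.
 */

/* Sum the integers 1..n with a plain loop, so that the optimizer has
 * something to transform from one pass to the next. */
int sum_to(int n)
{
    int total = 0;
    for (int i = 1; i <= n; ++i)
        total += i;
    return total;
}
```

Comparing the dumps of two neighbouring passes is usually the quickest way to see what a given pass changed.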

Maybe you need something else, but without knowledge about what you are trying to do it's hard to tell...

Post by xenos »

By "intermediate" representation, do you mean the assembly code? You can obtain that by using the -S compiler switch when using GCC or by using objdump -d on an ELF / object file.
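
A small sketch of both routes, assuming a file called square.c (the file and function names are invented for this example):

```c
/* square.c - hypothetical file name, chosen for this example.
 *
 * Emit assembly instead of an object file (writes square.s):
 *     gcc -S square.c
 *
 * Or compile to an object file and disassemble it afterwards:
 *     gcc -c square.c -o square.o
 *     objdump -d square.o
 */

/* Trivial function, so that the generated assembly stays short. */
int square(int x)
{
    return x * x;
}
```

Both commands show target-specific assembly, so the output differs from one architecture and optimization level to another.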

Post by Solar »

I remember having delved into this sometime around 2001/2002, when I was considering splitting up the GCC frontend and backend to come up with some kind of bytecode / virtual processor architecture. I queried mailing lists about this, probably even gcc-devel or somesuch.

As I was told back then, there is no command-line option or tool to get at the internal representation after the frontend is done. You would have to patch into the GCC sources themselves, and this was discouraged as this representation was considered internal, not too well documented, and subject to change without further notice.

The assembler source generated by the '-S' switch is a backtranslation of that internal representation. Usually, the backend (assembler) gets passed the internal representation directly, which has already been processed by the compiler frontend beyond the point represented by the '-S' ASM source. I.e., even '-S' does not show you a "true intermediate".

That is what I remember from back then. I might remember wrongly, or things might have changed, so take it with a grain of salt.

Edit / PS: Checking up on the Gimple IR, I realized that my talk back then was about the RTL representation. I don't know if the Gimple stage wasn't implemented, or the people I talked to didn't know about it, or if it's worthless as an intermediate, or whatever. I guess you can scrap this whole post. 8)

Post by JamesM »

Hi,

GCC has multiple internal representations - RTL, GIMPLE and another, annotated version of GIMPLE. These are internal to GCC and there is no advertised method of exporting them. The reason for this is politics, as mentioned earlier (stopping others from replacing parts of GCC with non-free software, in a nutshell).

GCC (cc1)'s backend outputs textual assembler. 'as' then takes this and assembles it. The -S switch you see with gcc just avoids the 'as' stage. It is not even slightly an intermediate form - it is very much assembler code only.
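
The stages described above can be run one at a time. This is only a sketch, assuming a native GNU toolchain and a file named stages.c (both the file name and the function are hypothetical):

```c
/* stages.c - hypothetical file name.
 *
 * Stop after the compiler proper, then run the assembler by hand:
 *     gcc -S stages.c -o stages.s    (compiler output: textual assembly)
 *     as stages.s -o stages.o        (assembler output: object file)
 *
 * Or keep the intermediate files of an ordinary build:
 *     gcc -save-temps -c stages.c    (keeps stages.i and stages.s)
 *
 * Adding -v makes gcc print the commands it runs for each stage.
 */

/* Return the larger of two integers. */
int max_of(int a, int b)
{
    return a > b ? a : b;
}
```

The .i file is the preprocessed source and the .s file is the assembly; neither of them is GIMPLE or RTL.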

OP: have you looked into LLVM?

James

Post by Solar »

JamesM wrote: GCC (cc1)'s backend outputs textual assembler. 'as' then takes this and assembles it. The -S switch you see with gcc just avoids the 'as' stage. It is not even slightly an intermediate form - it is very much assembler code only.
That is about the only part of what I wrote above that I am actually sure of remembering correctly: That the output of "gcc -S" does not represent any state that is used in a "normal" compilation.

Searching...

Ah, I got it. Here it states that the interface between language frontend and backend is the "tree" structure, and that documentation on it is incomplete. It also states that the RTL representation does not have all information about the program.

On the GIMPLE page it states that "The C and C++ front ends currently convert directly from front end trees to GIMPLE, and hand that off to the back end rather than first converting to GENERIC".

Generally speaking, 9 - Passes and Files of the Compiler is probably the best starting point. It's full of "TODO" remarks... no, GCC does not cater to those who want to get at its intermediates.

Post by Srowen »

JamesM wrote: OP: have you looked into LLVM?
LLVM seems interesting... I read on their site that there is a front end for Java, but it is incomplete and there is no documentation. Have you tried it?

Post by fronty »

Solar wrote: Ah, I got it. Here it states that the interface between language frontend and backend is the "tree" structure, and that documentation on it is incomplete. It also states that the RTL representation does not have all information about the program.
It is normal for a front end to generate an intermediate representation, which can be in tree form, and for the back end to generate the target language, which can be assembly language. IMO your quote doesn't prove that assembly isn't used in the normal compilation process.

Post by eddyb »

Solar wrote:
JamesM wrote: GCC (cc1)'s backend outputs textual assembler. 'as' then takes this and assembles it. The -S switch you see with gcc just avoids the 'as' stage. It is not even slightly an intermediate form - it is very much assembler code only.
That is about the only part of what I wrote above that I am actually sure of remembering correctly: That the output of "gcc -S" does not represent any state that is used in a "normal" compilation.

Searching...

Ah, I got it. Here it states that the interface between language frontend and backend is the "tree" structure, and that documentation on it is incomplete. It also states that the RTL representation does not have all information about the program.

On the GIMPLE page it states that "The C and C++ front ends currently convert directly from front end trees to GIMPLE, and hand that off to the back end rather than first converting to GENERIC".

Generally speaking, 9 - Passes and Files of the Compiler is probably the best starting point. It's full of "TODO" remarks... no, GCC does not cater to those who want to get at its intermediates.
This may be a dumb reply, but I saw "You can request to dump a C-like representation of the GIMPLE form with the flag -fdump-tree-gimple." on the GIMPLE page.
This is the output for your everyday fork bomb (couldn't come up with a better example):

Code:

// Original code
#include <unistd.h>

int main() {
    while(1)fork();
    return 0;
}

// GIMPLE output
main ()
{
  int D.3461;

  <D.2257>:
  fork ();
  goto <D.2257>;
  D.3461 = 0;
  return D.3461;
}
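
To give a rough feel for what such a dump looks like on straight-line code: GIMPLE is a three-address form, so compound expressions get split into simple statements that go through temporaries. The sketch below imitates that by hand in plain C. The function lowered is written by hand and is not GCC output, and the file name, function names and temporary names are all invented for illustration.

```c
/* lowering.c - hypothetical file name. To see what GCC itself produces:
 *     gcc -c -fdump-tree-gimple lowering.c
 */

/* Original form: one compound expression. */
int combined(int a, int b, int c)
{
    return a * b + c * 2;
}

/* Hand-written approximation of the three-address style: at most one
 * operation per statement, with each intermediate value stored in its
 * own temporary. */
int lowered(int a, int b, int c)
{
    int t1;
    int t2;
    int result;

    t1 = a * b;
    t2 = c * 2;
    result = t1 + t2;
    return result;
}
```

Both functions compute the same value; only the shape of the code differs, which is the point of the comparison.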